12.5: Sorting Arrays
A common operation in data science is to sort an array, either numerically (if the array contains ints or floats) or alphabetically (if it contains strings). There are two ways to do this, and they differ in the same way as the operations in the previous section.
One way is to call the .sort() method directly on an array. This sorts the array in place, which means that the actual data in memory is rearranged right then and there. As an important side effect, any other variable that points to the same array will also be sorted.
Here’s an example:
Code \(\PageIndex{1}\) (Python):
import numpy as np
gpas = np.array([2.86, 3.99, 3.12, 1.17])
gpas2 = gpas.copy()
gpas3 = gpas
gpas.sort()
print("gpas has: {}".format(gpas))
print("gpas2 has: {}".format(gpas2))
print("gpas3 has: {}".format(gpas3))
| gpas has: [1.17 2.86 3.12 3.99]
| gpas2 has: [2.86 3.99 3.12 1.17]
| gpas3 has: [1.17 2.86 3.12 3.99]
Do you see why that output was produced? It’s because the memory picture after the “gpas.sort()” line looks like Figure \(\PageIndex{1}\). The gpas variable and the gpas3 variable are two names for the very same array, so when one is sorted, the other automatically is. They’re both distinct from gpas2, though, which has its own separate copy of the data.
Figure \(\PageIndex{1}\): The state of affairs after .sort()ing the gpas array in place.
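If you'd like to confirm that memory picture for yourself, here is a minimal sketch of one way to do it. It reuses the gpas arrays from the example above; the is operator and np.shares_memory() are not part of the original example, and the three result variable names are just mine for illustration.

```python
import numpy as np

gpas = np.array([2.86, 3.99, 3.12, 1.17])
gpas2 = gpas.copy()   # an independent copy with its own data
gpas3 = gpas          # a second name for the very same array

# "is" asks whether two names refer to the same object.
same_object = gpas3 is gpas
copy_is_same = gpas2 is gpas
# np.shares_memory() asks whether two arrays overlap in memory.
copy_shares = np.shares_memory(gpas, gpas2)

print(same_object)    # True
print(copy_is_same)   # False
print(copy_shares)    # False

gpas.sort()           # rearranges the data that gpas and gpas3 both name
```

Since gpas3 is gpas comes back True, sorting through either name rearranges the one underlying array, while gpas2 is left alone.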
The second option is to call the np.sort() function and pass an array as an argument. Like many Python functions, including the ones in the next section, np.sort() returns a modified copy of its argument rather than changing it in place. To illustrate:
Code \(\PageIndex{2}\) (Python):
nfl_teams = np.array(["Ravens", "Patriots", "Broncos", "Chargers", "Steelers"])
sorted_teams = np.sort(nfl_teams)
print(nfl_teams)
print(sorted_teams)
| ['Ravens' 'Patriots' 'Broncos' 'Chargers' 'Steelers']
| ['Broncos' 'Chargers' 'Patriots' 'Ravens' 'Steelers']
Observe that the nfl_teams variable, even though we passed it to np.sort(), was not itself sorted. The sorted_teams variable, on the other hand, is alphabetically sorted, because we assigned the return value from np.sort() to it. Again, the memory picture is shown in Figure \(\PageIndex{2}\).
Figure \(\PageIndex{2}\): Calling the np.sort() function (as opposed to calling the .sort() method on the array) returns a sorted copy.
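One consequence of this difference is worth a quick sketch: because the .sort() method does its work in place, it hands back None rather than an array, so assigning its result to a variable is a common slip. The scratch and method_result names below are hypothetical, chosen only for this illustration.

```python
import numpy as np

nfl_teams = np.array(["Ravens", "Patriots", "Broncos", "Chargers", "Steelers"])

# Function form: a brand-new sorted array comes back.
sorted_teams = np.sort(nfl_teams)

# Method form: the array is sorted in place and nothing is returned.
scratch = nfl_teams.copy()
method_result = scratch.sort()

print(method_result)                               # None
print(scratch.tolist() == sorted_teams.tolist())   # True
print(np.shares_memory(nfl_teams, sorted_teams))   # False
```

So if you want the sorted values under a new name, use np.sort(); if you write something like teams = teams.sort(), you'll end up with None instead of your array.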
To be clear, either one of these techniques can be used on any ndarray: whole numbers, real numbers, or text. I just chose to do real numbers in the first example and text in the second. The difference between the two is merely in what is affected: in one, the actual array in memory is modified, and in the other, a modified copy is returned.
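To round things out, here is a small sketch that applies both techniques to an array of whole numbers. The jersey_numbers data is made up for this illustration and doesn't come from the earlier examples.

```python
import numpy as np

# Hypothetical data, made up for this illustration.
jersey_numbers = np.array([52, 8, 19, 7, 81])

# Function form: the original keeps its order; a sorted copy comes back.
in_order = np.sort(jersey_numbers)
unchanged = jersey_numbers.tolist() == [52, 8, 19, 7, 81]
print(unchanged)                 # True
print(in_order.tolist())         # [7, 8, 19, 52, 81]

# Method form: the array itself is rearranged in place.
jersey_numbers.sort()
print(jersey_numbers.tolist())   # [7, 8, 19, 52, 81]
```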