How to Sort Arrays in NumPy (Basic & Advanced Techniques)

Updated: January 22, 2024 By: Guest Contributor

Introduction

Sorting is a common operation in data analysis and programming. It involves arranging the items in a collection in a specified order. NumPy, a core library for scientific computing in Python, provides several functions to sort arrays efficiently. This guide covers multiple approaches to sorting arrays in NumPy, including basic and advanced techniques.

Simple Sort Using np.sort

The simplest way to sort an array in NumPy is using the np.sort function. This method returns a sorted copy of the input array along the specified axis, without modifying the original array. By default, it sorts in ascending order.

  • Step 1: Import the NumPy library.
  • Step 2: Create an unsorted NumPy array.
  • Step 3: Call the np.sort function on the array.
  • Step 4: Print the sorted array to verify the results.

Example:

import numpy as np

unsorted_array = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
sorted_array = np.sort(unsorted_array)
print(sorted_array)

Output:

[1 1 2 3 3 4 5 5 5 6 9]

Notes: np.sort defaults to kind='quicksort' (implemented as an introsort in current NumPy releases). You can request 'mergesort', 'heapsort', or 'stable' through the kind parameter; choose a stable kind when the relative order of equal elements matters. np.sort always returns a new array and leaves the original untouched.
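The axis and kind parameters mentioned above can be sketched as follows. The sample array is hypothetical, and reversing the result with [::-1] is just one common way to get descending order, since np.sort itself has no descending option.

```python
import numpy as np

# Hypothetical sample data, not taken from the example above.
data = np.array([[3, 1, 8],
                 [9, 7, 2]])

# axis=1 sorts each row; kind="stable" requests a stable algorithm.
rows_sorted = np.sort(data, axis=1, kind="stable")

# axis=0 sorts each column independently.
cols_sorted = np.sort(data, axis=0)

# axis=None flattens the array before sorting.
flat_sorted = np.sort(data, axis=None)

# np.sort only sorts ascending; reversing the result gives descending order.
descending = flat_sorted[::-1]

print(rows_sorted)
print(cols_sorted)
print(flat_sorted)
print(descending)
print(data)  # the input array is unchanged
```

Note that descending here is a reversed view of flat_sorted rather than a separate copy; call .copy() on it if you plan to modify it independently.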

In-place Sort with np.ndarray.sort

In contrast to np.sort, the np.ndarray.sort method sorts the NumPy array in-place. The original array is modified, no sorted copy is allocated, and the method itself returns None.

  • Step 1: Import the NumPy library.
  • Step 2: Create an unsorted NumPy array.
  • Step 3: Call the sort method on the array object.
  • Step 4: Print the array to verify the changes.

Example:

import numpy as np

unsorted_array = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
unsorted_array.sort()
print(unsorted_array)

Output:

[1 1 2 3 3 4 5 5 5 6 9]

Notes: Using np.ndarray.sort is efficient when you want to save memory and you do not need to preserve the original order of your array. This method also uses quicksort by default, with mergesort and heapsort as alternatives.
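A frequent stumbling block with the in-place method is assigning its return value. The sketch below, using hypothetical sample arrays, shows that the call returns None, how to keep the original by copying first, and that the method accepts an axis argument just like np.sort.

```python
import numpy as np

values = np.array([3, 1, 4, 1, 5])

# The method sorts `values` itself and returns None,
# so writing `values = values.sort()` would lose the array.
result = values.sort()
print(result)   # None
print(values)

# If the original order is still needed, sort a copy instead.
original = np.array([3, 1, 4, 1, 5])
working = original.copy()
working.sort()
print(original)
print(working)

# In-place sorting also works along an axis of a 2D array.
grid = np.array([[3, 1, 8],
                 [9, 7, 2]])
grid.sort(axis=0)  # sort each column in place
print(grid)
```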

Sorting with Order and Structure: np.argsort and Structured Arrays

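Sometimes you need the indices that would sort an array rather than the sorted values themselves, for example to trace each sorted element back to its original position. This is where np.argsort comes in handy. Furthermore, if your array has a compound structure (i.e., fields with different datatypes), you can sort by one or more fields using the order parameter.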

  • Step 1: Import the NumPy library.
  • Step 2: Create an unsorted NumPy array.
  • Step 3: Use np.argsort to get the indices that would sort the array.
  • Step 4: Sort the array using the indices from the previous step.
  • Step 5: If using structured arrays, specify the order parameter with the field names you want to sort by.

Example:

import numpy as np

unsorted_array = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
indices = np.argsort(unsorted_array)
sorted_array = unsorted_array[indices]
print(sorted_array)

# Structured array sorting example
structured_array = np.array([(2, 'Z'), (1, 'X'), (3, 'Y')], 
                           dtype=[('number', int), ('letter', 'S1')])
sorted_structured_array = np.sort(structured_array, order='number')
print(sorted_structured_array)

Output:

[1 1 2 3 3 4 5 5 5 6 9]
[(1, b'X') (2, b'Z') (3, b'Y')]

Notes: np.argsort is useful when you want to apply the same reordering to another array based on the sorting of the first array. Structured array sorting is beneficial when records combine several fields, and the order parameter must name fields that exist in the dtype. Both approaches perform a full sort, so they run in O(n log n) time like np.sort. The letters print with a b prefix because the 'S1' dtype stores them as bytes.
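As a minimal sketch of both ideas, the snippet below reorders a second array using the indices from np.argsort and sorts a structured array by two fields. The scores, names, and records arrays are hypothetical data made up for illustration; the 'U1' dtype is used so the letters print as plain strings.

```python
import numpy as np

# Hypothetical parallel arrays: scores[i] belongs to names[i].
scores = np.array([88, 95, 70, 95])
names = np.array(["Ann", "Bob", "Cid", "Dee"])

# kind="stable" keeps tied scores in their original relative order.
order = np.argsort(scores, kind="stable")
print(scores[order])  # [70 88 95 95]
print(names[order])   # ['Cid' 'Ann' 'Bob' 'Dee']

# Structured array: sort by 'number' first, then break ties by 'letter'.
records = np.array([(2, "Z"), (1, "X"), (2, "A")],
                   dtype=[("number", int), ("letter", "U1")])
by_both = np.sort(records, order=["number", "letter"])
print(by_both)        # [(1, 'X') (2, 'A') (2, 'Z')]
```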

Partial Sorting: np.partition

When you only need the k smallest values of an array and don’t care about the complete order, np.partition is a good fit. The function rearranges a copy of the array so that the element at index k is the one that would be there in a fully sorted array, all elements before it are less than or equal to it, and all elements after it are greater than or equal to it.

  • Step 1: Import the NumPy library.
  • Step 2: Create an unsorted NumPy array.
  • Step 3: Decide the ‘kth’ position for the partition.
  • Step 4: Use the np.partition function.
  • Step 5: Print the partially sorted array to verify the placement of the kth element.

Example:

import numpy as np

unsorted_array = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
k = 5
kth_element_array = np.partition(unsorted_array, k)
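# Read the kth element from the partitioned array, not the original one.
print(f'The kth element: {kth_element_array[k]}')
# The order on either side of index k is not guaranteed.
print(kth_element_array)

Output:

The kth element: 4
[3 1 2 1 3 4 5 6 5 9 5]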

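Notes: np.partition is typically faster than a full sort when you only need the k smallest (or largest) elements, because selection runs in roughly linear time while sorting takes O(n log n). The trade-off is that the elements on either side of the kth position are left in no particular order, and their arrangement may differ between NumPy versions, so a full sort is still required when the complete ordering matters.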
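One way to turn this into a "k smallest" or "k largest" lookup is sketched below. It reuses the array from the example above; the choice of k = 3 and the final np.sort calls on the small slices are assumptions made for illustration.

```python
import numpy as np

data = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])
k = 3

# Partitioning at index k - 1 puts the k smallest values in the
# first k slots, in no guaranteed order.
smallest_k = np.partition(data, k - 1)[:k]

# Sorting just those k values is cheap compared with sorting everything.
smallest_k_sorted = np.sort(smallest_k)
print(smallest_k_sorted)  # [1 1 2]

# A negative index counts from the end: partitioning at -k puts
# the k largest values in the last k slots.
largest_k = np.sort(np.partition(data, -k)[-k:])
print(largest_k)          # [5 6 9]
```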

Conclusion

Sorting is a versatile tool in NumPy that supports various scenarios ranging from simple complete sorts to complex structured data sorts and even partial sorting for performance gains. Depending on your requirements, you can choose the most suitable function. Operations like np.sort and np.ndarray.sort offer full-array sorting, while np.argsort provides sorted indices, and np.partition offers a performance advantage when you only need to know a subset of sorted elements.