Using DataFrame.sort_values() method in Pandas (5 examples)

Updated: February 20, 2024 By: Guest Contributor

Introduction

Pandas is a versatile and widely used Python library for data manipulation and analysis. One of its core features is sorting the data in a DataFrame. In this tutorial, we’ll explore how to use the sort_values() method in Pandas, illustrated with five practical examples. By the end, you should be comfortable applying this method to sort your data according to specific requirements.

What is sort_values()?

The sort_values() method in Pandas sorts a DataFrame by the values of one or more columns. It supports ascending and descending order, and it can sort by several columns at once. By default it returns a new, sorted DataFrame and leaves the original unchanged.

Basic Example

Let’s begin with a straightforward example to sort a DataFrame based on a single column:

import pandas as pd

# Sample DataFrame
data = {'Name': ['Alice', 'Bob', 'David', 'Carla'],
    'Age': [24, 42, 35, 28]}
df = pd.DataFrame(data)

# Sorting by age
sorted_df = df.sort_values(by='Age')
print(sorted_df)

This will output:

    Name  Age
0  Alice   24
3  Carla   28
2  David   35
1    Bob   42
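
Two details of this basic call are easy to miss: the original DataFrame is left untouched, and each row keeps its original index label. The short sketch below rebuilds the same sample data to show both, and also uses the ignore_index parameter (this assumes pandas 1.0 or later) to renumber the sorted rows. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'David', 'Carla'],
                   'Age': [24, 42, 35, 28]})

# sort_values() returns a new DataFrame; df itself keeps its original order
sorted_df = df.sort_values(by='Age')
original_order = df['Name'].tolist()     # still Alice, Bob, David, Carla
sorted_index = sorted_df.index.tolist()  # original labels travel with the rows

# ignore_index=True relabels the sorted rows 0..n-1
renumbered = df.sort_values(by='Age', ignore_index=True)
renumbered_index = renumbered.index.tolist()
print(renumbered)
```

Renumbering is mostly useful when the old index carries no meaning, for example before exporting the sorted rows.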

Sorting in Descending Order

Now, let’s look at how to sort the DataFrame in descending order:

sorted_df = df.sort_values(by='Age', ascending=False)
print(sorted_df)

Output:

    Name  Age
1    Bob   42
2  David   35
3  Carla   28
0  Alice   24

Sorting by Multiple Columns

When you sort by more than one column, the columns form a hierarchy: rows are ordered by the first column, and the second column is used only to break ties among rows that share the same value in the first. Here’s how:

data = {'Name': ['Alice', 'Bob', 'David', 'Carla'],
    'Age': [24, 28, 35, 28],
    'Score': [88, 92, 67, 95]}
df = pd.DataFrame(data)

# Sorting by Age then Score
sorted_df = df.sort_values(by=['Age', 'Score'])
print(sorted_df)

Output:

    Name  Age  Score
0  Alice   24     88
1    Bob   28     92
3  Carla   28     95
2  David   35     67
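
The sort direction can also differ per column. As a small sketch on the same sample data, passing a list to ascending (one boolean per entry in by) sorts Age in ascending order while ranking the higher Score first within each age:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'David', 'Carla'],
                   'Age': [24, 28, 35, 28],
                   'Score': [88, 92, 67, 95]})

# Age ascending; within equal ages, Score descending
mixed = df.sort_values(by=['Age', 'Score'], ascending=[True, False])
mixed_names = mixed['Name'].tolist()
print(mixed)
```

Here Carla (score 95) is placed ahead of Bob (score 92), because both are 28 and the tie is broken by the descending Score.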

Using the inplace Parameter

The inplace parameter sorts the existing DataFrame instead of returning a new one, so there is no result to assign (with inplace=True the call returns None):

df.sort_values(by='Age', inplace=True)
print(df)
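
A common slip with inplace=True is assigning the return value back to the variable, which replaces the DataFrame with None. The sketch below, on illustrative sample data, shows the return value and the equivalent reassignment style, which many codebases prefer because it chains with other methods:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'David', 'Carla'],
                   'Age': [24, 42, 35, 28]})

# With inplace=True the method mutates df and returns None
result = df.sort_values(by='Age', inplace=True)
names_after_inplace = df['Name'].tolist()

# Equivalent without inplace: rebind the name to the sorted copy
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'David', 'Carla'],
                    'Age': [24, 42, 35, 28]})
df2 = df2.sort_values(by='Age')
names_after_reassign = df2['Name'].tolist()

print(result)               # None
print(names_after_inplace)
```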

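Sorting with a Custom Key Function

For more advanced scenarios, you may want to sort by a value derived from a column rather than by the column itself. The key parameter, available in Pandas 1.1.0 and later, handles this. Note that key is not a comparator: it is called once with the whole column (a Series) and must return a Series or array of the same length, and the rows are then sorted by those returned values. Here’s an example:

data = {'Name': ['Alice', 'Bob', 'David', 'Carla'],
    'Age': [24, 28, 35, 23],
    'City': ['NY', 'LA', 'NY', 'LA']}
df = pd.DataFrame(data)

# Key function: receives the whole column as a Series and
# returns one sort value per row (here, the string length)
def custom_sort(col):
    return col.str.len()

# Sort by name length; kind='stable' keeps tied rows in their original order
sorted_df = df.sort_values(by='Name', key=custom_sort, kind='stable')
print(sorted_df)

Output:

    Name  Age City
1    Bob   28   LA
0  Alice   24   NY
2  David   35   NY
3  Carla   23   LA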
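
Another frequent use of key is case-insensitive sorting. The sketch below uses illustrative names with mixed capitalization (not the data from the example above) and assumes pandas 1.1.0 or later. By default, uppercase letters sort before lowercase ones; lowercasing the column inside the key function removes that effect without changing the stored values.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['alice', 'Bob', 'david', 'Carla'],
                   'Age': [24, 28, 35, 23]})

# Default string ordering puts uppercase initials first
default_order = df.sort_values(by='Name')['Name'].tolist()

# The key function receives the whole column and returns the values to sort by
case_insensitive = df.sort_values(by='Name', key=lambda col: col.str.lower())
ci_order = case_insensitive['Name'].tolist()

print(default_order)
print(ci_order)
```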

Conclusion

Sorting is a foundational part of data analysis, and Pandas’ sort_values() method covers the common cases: sorting by one or several columns, controlling the direction of the sort, and sorting by derived values through a key function. Applying these patterns in your own projects makes it easier to order data exactly as an analysis requires.