Pandas: How to get N smallest elements of a Series

Last updated: February 18, 2024

Overview

In data analysis, you often need only the smallest values in a dataset, for example for comparison, summary statistics, or ranking. Pandas, a widely used Python library for data manipulation and analysis, provides the nsmallest() method for retrieving the N smallest elements from a Series. This tutorial walks through the method and its options, including how it handles ties.

Preparation

Before diving into extracting elements, it’s important to understand the basics of Pandas Series. A Series is a one-dimensional labeled array capable of holding any data type. It’s one of the core data structures in Pandas. You can create a Series from a list, array, or a Python dictionary. Understanding Series is fundamental for effectively using Pandas for data manipulation.

Creating a Simple Series

import pandas as pd

# Creating a simple Series from a list
s = pd.Series([20, 35, 10, 15, 30, 45])
print(s)

The output will look something like this:

0    20
1    35
2    10
3    15
4    30
5    45
dtype: int64
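
The Preparation paragraph also mentions dictionaries and arrays as sources for a Series. As a minimal sketch (the labels and values below are made up for illustration), dictionary keys become the index labels, while a NumPy array gets the default integer index:

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
prices = pd.Series({'apple': 20, 'banana': 35, 'cherry': 10})
arr_series = pd.Series(np.array([20, 35, 10]))

print(prices.index.tolist())      # dictionary keys become the labels
print(arr_series.index.tolist())  # default positional labels
```

These labels matter later: nsmallest() returns each value together with its original label.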

Retrieving N Smallest Elements

Now, let’s learn how to get the N smallest elements from a Series. Pandas offers a straightforward method called nsmallest(). This method returns the smallest N elements from the Series, sorted in ascending order.

Basic Example

print(s.nsmallest(3))

The output will be:

2    10
3    15
0    20
dtype: int64
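
A quick way to sanity-check the call: for a Series without ties, nsmallest(n) should return the same values and index labels as sorting the whole Series and taking the first n rows. The sketch below rebuilds the Series from the earlier example and compares the two approaches. The pandas documentation notes that nsmallest() is faster than sort_values().head(n) when n is small relative to the size of the Series.

```python
import pandas as pd

s = pd.Series([20, 35, 10, 15, 30, 45])

via_nsmallest = s.nsmallest(3)
via_sort = s.sort_values().head(3)

# Both results keep the original index labels next to the values
print(via_nsmallest.equals(via_sort))
```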

Understanding the nsmallest() Method

The nsmallest() method takes two parameters. The n parameter specifies how many of the smallest elements to retrieve (the default is 5). The keep parameter decides how ties are handled: 'first' (the default) keeps the earliest occurrences, 'last' keeps the latest occurrences, and 'all' keeps every tied value, even if the result then contains more than n elements. Let us explore how this works in different scenarios.

Handling Ties with the keep Parameter

import pandas as pd

# Consider a Series with ties
s = pd.Series([20, 10, 10, 15, 30, 45, 10])

# Using keep='first' (default)
print(s.nsmallest(3))

# Using keep='last'
print(s.nsmallest(3, keep='last'))

# Using keep='all'
print(s.nsmallest(3, keep='all'))

The above examples will respectively output:

1    10
2    10
6    10
dtype: int64

6    10
2    10
1    10
dtype: int64

1    10
2    10
6    10
dtype: int64
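
In the example above exactly three values tie for the smallest, so keep='all' returns the same rows as keep='first'. The difference shows up when n cuts through a group of ties. A small sketch, reusing the same Series with n=2:

```python
import pandas as pd

s = pd.Series([20, 10, 10, 15, 30, 45, 10])

first_two = s.nsmallest(2)             # stops at n elements
all_ties = s.nsmallest(2, keep='all')  # keeps every value tied with the n-th smallest

print(len(first_two), len(all_ties))
```

With keep='all', the result can therefore be longer than n.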

Advanced Usage

For more advanced data manipulation needs, you might want to explore using the nsmallest() method on DataFrames or applying it after filtering or transforming your Series. Here, the concepts of indexing, boolean masking, and function chaining come into play, allowing for more sophisticated data analyses.
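
As a rough sketch of the filtering and chaining mentioned above (the threshold of 12 is an arbitrary choice for illustration), a boolean mask can be applied before nsmallest(), and the result still carries the original index labels:

```python
import pandas as pd

s = pd.Series([20, 35, 10, 15, 30, 45])

# Keep only values above an arbitrary threshold, then take the 2 smallest
result = s[s > 12].nsmallest(2)
print(result)
```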

Applying nsmallest() to a DataFrame

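import pandas as pd

# Creating a DataFrame (illustrative data; column names are placeholders)
df = pd.DataFrame({'A': [20, 35, 10, 15], 'B': [4, 1, 3, 2]})

# Getting the N smallest values from a specific column
print(df['A'].nsmallest(2))

# Getting the full rows that hold the N smallest values in column 'A'
print(df.nsmallest(2, 'A'))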
Conclusion

Retrieving the N smallest elements from a Series in Pandas is straightforward with the nsmallest() method. Its n and keep parameters cover both simple cases and data with ties, and the same method is available on DataFrames for selecting rows by the values in a column.
