Pandas DataFrame.truncate() method: Explained with examples

Updated: February 24, 2024 By: Guest Contributor

Introduction

The DataFrame.truncate() method in Pandas slices a DataFrame or Series down to the rows (or columns) whose index labels fall between two bounds, such as a start and end date or a pair of row labels. It is particularly useful in time series analysis or when working with large datasets where you need to focus on a specific interval. This tutorial walks through basic and more advanced uses of truncate(), with examples to help you incorporate it into your data manipulation toolkit.

Syntax & Parameters

The truncate() method drops every entry of a Series or DataFrame that comes before one index value or after another. It is mainly used for slicing time series data, but it works with any sorted index. Its syntax is straightforward:

pandas.DataFrame.truncate(before=None, after=None, axis=None, copy=True)

Where:

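  • before and after denote the truncation limits: entries before the first and after the second are dropped, and both limits themselves are kept. Either one may be left as None.
  • axis specifies the truncation direction (0 for rows, which is the default, and 1 for columns).
  • copy indicates whether to return a copy of the truncated data. truncate() always returns a new object and never modifies the original in place.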
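
As a minimal sketch of how these parameters behave, the snippet below uses a small Series named s with a sorted integer index. The data and variable names are illustrative assumptions and do not come from the examples that follow.

```python
import pandas as pd

# Illustrative data: a Series with the default sorted integer index 0..5
s = pd.Series([10, 20, 30, 40, 50, 60])

# Both bounds are inclusive: labels 2, 3 and 4 are kept
middle = s.truncate(before=2, after=4)

# Omitting a bound leaves that side of the data untouched
tail = s.truncate(before=4)  # labels 4 and 5
head = s.truncate(after=1)   # labels 0 and 1

# truncate() returns a new object; the original Series still has 6 entries
print(middle.tolist(), tail.tolist(), head.tolist(), len(s))
# [30, 40, 50] [50, 60] [10, 20] 6
```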

Basic Example

Let’s start with a basic example, where we have a DataFrame representing daily sales of a store:

import pandas as pd
from datetime import datetime

df = pd.DataFrame({
  'date': pd.date_range(start='2023-01-01', periods=10),
  'sales': [234, 456, 324, 456, 678, 234, 590, 789, 456, 123]
})

df.set_index('date', inplace=True)
print(df)

The output will be (middle rows elided):

            sales
date
2023-01-01    234
2023-01-02    456
2023-01-03    324
...           ...
2023-01-09    456
2023-01-10    123

To truncate this DataFrame so that it only includes sales from January 3 to January 8, pass both dates to truncate():

df_truncated = df.truncate(before='2023-01-03', after='2023-01-08')
print(df_truncated)

The truncated DataFrame (middle rows elided):

            sales
date
2023-01-03    324
2023-01-04    456
...           ...
2023-01-08    789
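
For comparison, the same window can also be taken with label slicing through .loc. The sketch below rebuilds the sales data from the example above so that it runs on its own; the names by_truncate and by_loc are illustrative.

```python
import pandas as pd

# Rebuild the daily sales frame from the example above
df = pd.DataFrame(
    {"sales": [234, 456, 324, 456, 678, 234, 590, 789, 456, 123]},
    index=pd.date_range(start="2023-01-01", periods=10, name="date"),
)

by_truncate = df.truncate(before="2023-01-03", after="2023-01-08")
by_loc = df.loc["2023-01-03":"2023-01-08"]

# Both approaches keep the six rows from January 3 to January 8, inclusive
print(by_truncate.equals(by_loc))  # True
print(len(by_truncate))            # 6
```

On daily data like this the two calls agree. They can differ when the index carries a time of day and the bound is a date-only string.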

Truncating Columns

Truncation is not limited to rows. You can also truncate columns by specifying the axis=1 argument. Consider a DataFrame with multiple columns:

df = pd.DataFrame({
  'A': range(1, 11),
  'B': range(11, 21),
  'C': range(21, 31),
  'D': range(31, 41)
})

df_truncated = df.truncate(before='B', after='C', axis=1)
print(df_truncated)

The resulting DataFrame will show only columns B and C:

    B   C
0  11  21
1  12  22
... ...
8  19  29
9  20  30
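
The column example works because the labels A to D are already in sorted order. As a hedged sketch of what happens otherwise, the hypothetical frame below has unsorted column labels: truncate() raises a ValueError, and sorting the columns first with sort_index(axis=1) makes the call succeed.

```python
import pandas as pd

# Hypothetical frame whose column labels are NOT in sorted order
df = pd.DataFrame({"C": [1, 2], "A": [3, 4], "B": [5, 6]})

try:
    df.truncate(before="A", after="B", axis=1)
    raised = False
except ValueError:
    # pandas refuses to truncate along an unsorted axis
    raised = True

# Sorting the columns first makes the label range well defined
result = df.sort_index(axis=1).truncate(before="A", after="B", axis=1)
print(raised, list(result.columns))
# True ['A', 'B']
```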

Working with Time Series Data

When dealing with time series data, the truncate() method is especially convenient. On a DatetimeIndex, date strings passed as before and after are parsed into timestamps, so you can cut the dataset to exactly the time window you are interested in. Let’s work with a more detailed dataset, a time series of hourly temperature readings:

temperature_df = pd.DataFrame({
  'datetime': pd.date_range(start='2023-01-01', periods=24, freq='h'),  # 'h' = hourly; the uppercase alias 'H' is deprecated in recent pandas
  'temperature': [22, 21, 23, 24, 22, 20, 19, 18, 17, 22, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 28, 27, 26, 25]
})

temperature_df.set_index('datetime', inplace=True)

df_truncated = temperature_df.truncate(before='2023-01-01 09:00:00', after='2023-01-01 17:00:00')
print(df_truncated)

This results in a DataFrame that contains the temperature readings from 9 AM to 5 PM, inclusive (middle rows elided):

                     temperature
datetime
2023-01-01 09:00:00           22
2023-01-01 10:00:00           20
...                          ...
2023-01-01 17:00:00           27
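
One detail worth knowing with timestamps: truncate() treats a date-only bound as midnight of that day, whereas partial string slicing with .loc matches the whole day. The sketch below illustrates the difference on made-up hourly data spanning two days; the names readings, cut and day are assumptions for this example.

```python
import pandas as pd

# Illustrative data: 48 hourly readings covering January 1 and January 2
idx = pd.date_range(start="2023-01-01", periods=48, freq="h")
readings = pd.DataFrame({"temperature": range(48)}, index=idx)

# truncate() reads the date-only string as 2023-01-01 00:00:00,
# so only the midnight row survives
cut = readings.truncate(after="2023-01-01")

# Partial string slicing with .loc keeps every row that falls on that day
day = readings.loc[:"2023-01-01"]

print(len(cut), len(day))
# 1 24
```

To keep a full day with truncate(), spell out the time component, for example after='2023-01-01 23:00:00'.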

Advanced Uses: Truncating Based on Custom Indices

In addition to regular numeric and date indices, truncate() can also be applied to DataFrames with other index types, as long as the index is sorted; an unsorted index raises a ValueError. Suppose you have a DataFrame indexed by some category with an inherent order, such as business stages in a pipeline. The stage names on their own are not in alphabetical order, so this example prefixes each one with its position to keep the index sorted. You can then truncate the DataFrame to focus on a particular stage range:

df = pd.DataFrame({
  'stage': ['1-Lead', '2-Opportunity', '3-Negotiation', '4-Closure'],
  'value': [345, 810, 675, 935]
}).set_index('stage')

df_truncated = df.truncate(before='2-Opportunity', after='3-Negotiation')
print(df_truncated)

This will output:

               value
stage
2-Opportunity    810
3-Negotiation    675
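
Because truncate() requires a sorted index, stage names kept in their plain business order (Lead, Opportunity, Negotiation, Closure) are rejected. The following sketch shows that behaviour and one possible alternative, label slicing with .loc, which follows row position when the labels are unique. The frame stages and the other variable names are illustrative.

```python
import pandas as pd

# Pipeline stages in business order, which is not alphabetical order
stages = pd.DataFrame(
    {"value": [345, 810, 675, 935]},
    index=pd.Index(["Lead", "Opportunity", "Negotiation", "Closure"], name="stage"),
)

try:
    stages.truncate(before="Opportunity", after="Negotiation")
    error = None
except ValueError as exc:
    # The index is not monotonic, so truncate() refuses to slice it
    error = str(exc)

# .loc label slicing works on unique labels regardless of sort order and
# keeps the rows from "Opportunity" through "Negotiation", inclusive
subset = stages.loc["Opportunity":"Negotiation"]
print(error is not None, subset["value"].tolist())
# True [810, 675]
```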

Conclusion

Through this guide, we have seen how the truncate() method in Pandas can be a powerful tool for data slicing, especially when working with time series data. The examples provided here span from the basic to more advanced applications, showcasing its flexibility and utility across different types of data. Armed with truncate(), you’re now better equipped to handle data slicing tasks with precision.