Pandas: Generate fixed frequency DatetimeIndex with business day

Updated: February 21, 2024 By: Guest Contributor Post a comment

Overview

Pandas is a powerful data manipulation and analysis library for Python, widely used in the field of data science and analytics. Among its numerous functionalities, it provides robust tools for time series data manipulation. A common requirement in financial analysis and other business applications is to generate a range of dates restricted to business days, ignoring weekends and possibly holidays. In this tutorial, we will explore how to generate a fixed frequency DatetimeIndex with business days using Pandas.

Introduction to DatetimeIndex

DatetimeIndex is a type of index used in Pandas to handle datetime objects. It is crucial for time series analysis as it allows for easy slicing, filtering, and aggregation operations based on time intervals.

Creating a Basic DatetimeIndex

To start, let’s create a simple DatetimeIndex. This will not yet filter for business days but is a starting point for understanding how to work with date ranges in Pandas.

import pandas as pd
pd.date_range(start='2023-01-01', end='2023-01-07')

This example generates a DatetimeIndex from January 1, 2023, to January 7, 2023. The output will be a range of consecutive dates, including the weekend:

DatetimeIndex(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05', '2023-01-06', '2023-01-07'], dtype='datetime64[ns]', freq='D')

Generating Business Days with Pandas

Now that we are familiar with the basics of generating date ranges with Pandas, let’s move on to generating ranges that only include business days. Pandas provides the Bday frequency alias for this purpose.

Example 1: Basic Business Days

Here’s how to generate a range that includes only business days, excluding weekends:

import pandas as pd
pd.date_range(start='2023-01-01', end='2023-01-31', freq='B')

The output will be a DatetimeIndex that includes only the weekdays within the indicated range:

DatetimeIndex(['2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05', '2023-01-06', '2023-01-09', '2023-01-10', '2023-01-11', '2023-01-12', '2023-01-13', '2023-01-16', '2023-01-17', '2023-01-18', '2023-01-19', '2023-01-20', '2023-01-23', '2023-01-24', '2023-01-25', '2023-01-26', '2023-01-27', '2023-01-30', '2023-01-31'], dtype='datetime64[ns]', freq='B')

Example 2: Custom Business Days

Pandas allows for further customization, such as defining custom business days that can exclude specific holidays. This is done by creating a CustomBusinessDay object.

To illustrate, let’s create a DatetimeIndex that includes a set range but excludes certain holidays. For this example, we will consider New Year’s Day and Independence Day as holidays.

from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

holidays = ['2023-01-01', '2023-07-04']
custom_bday = CustomBusinessDay(holidays=holidays)

pd.date_range(start='2023-01-01', end='2023-07-10', freq=custom_bday)

The resulting DatetimeIndex will cover the range but exclude January 1st and July 4th:

DatetimeIndex(['2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05', '2023-01-06', '2023-01-09', '2023-01-10', '2023-01-11', '2023-01-12', '2023-01-13', '2023-01-16', '2023-01-17', '2023-01-18', '2023-01-19', '2023-01-20', '2023-01-23', '2023-01-24', '2023-01-30', '2023-01-31', '2023-07-03', '2023-07-05', '2023-07-06', '2023-07-07', '2023-07-10'], dtype='datetime64[ns]', freq='C')

Advanced Usage: Modifying Business Day Frequency

Pandas’ flexibility doesn’t end with simple date ranges. You can also modify the frequency of business days, for example, to have a date range where the business day occurs every other day.

Example: Every Other Business Day

By modifying the frequency parameter within pd.date_range, you can easily create a range that follows a less conventional pattern. Here’s an example:

import pandas as pd
pd.date_range(start='2023-01-01', end='2023-01-31', freq='2B')

This range will include every second business day:

DatetimeIndex(['2023-01-02', '2023-01-04', '2023-01-06', '2023-01-10', '2023-01-12', '2023-01-16', '2023-01-18', '2023-01-20', '2023-01-24', '2023-01-26', '2023-01-30'], dtype='datetime64[ns]', freq='2B')

Conclusion

Throughout this tutorial, we explored various ways to generate and customize business day ranges using Pandas. From simple weekday ranges to complex schedules excluding holidays and custom frequencies, Pandas provides powerful capabilities for working with time series data in a business context. These tools can greatly aid in the analysis and presentation of time-sensitive data, improving the efficiency and accuracy of data-driven decisions.