Pandas: Generate a Time Series between 2 Given Dates

Updated: February 18, 2024 By: Guest Contributor

Introduction

When working with time series data in Python, generating sequences of dates can be an essential task for various applications such as financial analyses, weather forecasting, or even for setting up calendars for events. In this tutorial, we will explore how to use the Pandas library to generate a series of dates between two specified dates. Pandas, well-known for its powerful data manipulation capabilities, provides a straightforward approach for this purpose.

Getting Started

First, ensure you have Pandas installed. If not, you can install it using pip:

pip install pandas

With Pandas installed, you can now start by importing it in your Python script or Jupyter notebook.

import pandas as pd

Generating a Basic Time Series

To generate a basic time series between two dates, you can use the pd.date_range() function. Let’s say you want to generate a list of dates from January 1, 2021, to January 10, 2021.

start_date = '2021-01-01'
end_date = '2021-01-10'
dates = pd.date_range(start=start_date, end=end_date)
print(dates)

Because freq defaults to 'D' (calendar days), this code outputs a DatetimeIndex of ten daily timestamps, with both the start and end dates included.
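A DatetimeIndex is usually only the first step; to get an actual time series, the index is attached to some values. The sketch below assumes placeholder values (the integers 0 to 9) standing in for real measurements, and the name 'value' is purely illustrative.

```python
import pandas as pd

# Daily index for the same ten-day window used above.
dates = pd.date_range(start='2021-01-01', end='2021-01-10')

# Hypothetical data: 0..9 stands in for real measurements.
series = pd.Series(range(len(dates)), index=dates, name='value')

print(series.head(3))

# Look up a single day by its timestamp label.
jan_5 = series.loc[pd.Timestamp('2021-01-05')]
print(jan_5)
```

Building the index first and the values second keeps the two concerns separate, which tends to make it easier to swap in real data later.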

Customizing Frequency

Pandas lets you customize the frequency of the generated dates through the freq parameter. For example, 'W' produces weekly dates anchored on Sunday and is equivalent to the more explicit 'W-SUN'. Let's create a series of Sundays between two dates.

dates = pd.date_range(start='2021-01-01', end='2021-03-01', freq='W-SUN')
print(dates)

This command produces a DatetimeIndex containing every Sunday within the specified range, from 2021-01-03 through 2021-02-28.
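The weekly alias can be anchored on other weekdays as well. As a small sketch, the same range with 'W-MON' yields Mondays instead; the variable name mondays is just for illustration.

```python
import pandas as pd

# Weekly frequency anchored on Monday instead of Sunday.
mondays = pd.date_range(start='2021-01-01', end='2021-03-01', freq='W-MON')

print(mondays)
# Confirm that every generated date falls on a Monday.
print(mondays.day_name().unique())
```

Since March 1, 2021 is itself a Monday and the end date is included, it appears as the last entry.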

Including Time

You can also include time in your range. If you’re generating a series for a single day but at different times, you can set the frequency to minutes or hours, for instance. Creating a sequence for January 1, 2021, at hourly intervals can be done as follows:

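dates = pd.date_range(start='2021-01-01', end='2021-01-01 23:59:59', freq='h')
print(dates)

This generates 24 timestamps, one for every hour of January 1, 2021, from 00:00 to 23:00. The lowercase 'h' alias is used here because the uppercase 'H' has been deprecated since Pandas 2.2.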
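Frequency aliases can also carry a multiplier for finer steps. The following sketch assumes a one-hour window chosen only for illustration and uses '15min' to step through it in quarter hours.

```python
import pandas as pd

# Quarter-hour steps between 09:00 and 10:00 on a single day.
quarter_hours = pd.date_range(
    start='2021-01-01 09:00',
    end='2021-01-01 10:00',
    freq='15min',
)

print(quarter_hours)
```

Both endpoints land exactly on the 15-minute grid, so the result contains five timestamps: 09:00, 09:15, 09:30, 09:45, and 10:00.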

Inclusive End Date

By default, pd.date_range() includes both the start and end dates whenever they fall on the chosen frequency, so there is no need to extend the end date. Since Pandas 1.4.0, the inclusive parameter makes this behavior explicit: it accepts 'both' (the default), 'left', 'right', or 'neither', and it replaces the older closed parameter, which was deprecated in the same release.

dates = pd.date_range(start='2021-01-01', end='2021-01-10', inclusive='both')
print(dates)
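To see how the boundary options differ, the sketch below loops over the four values the inclusive parameter accepts and records the length of each result. It assumes a Pandas version that supports inclusive (1.4 or later, as far as I know); the lengths dictionary is only a helper for the demonstration.

```python
import pandas as pd

lengths = {}
for option in ('both', 'left', 'right', 'neither'):
    # Same ten-day window each time; only the boundary handling changes.
    idx = pd.date_range(start='2021-01-01', end='2021-01-10', inclusive=option)
    lengths[option] = len(idx)
    print(option, len(idx), idx[0].date(), idx[-1].date())
```

With 'left' the end date is dropped, with 'right' the start date is dropped, and with 'neither' both are dropped.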

Advanced Use Cases

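For more complex scenarios, Pandas relies on offset objects and helper functions rather than extra pd.date_range() parameters. Business-day offsets such as BDay skip weekends, while CustomBusinessDay and pd.bdate_range() additionally accept a custom week mask and a list of holidays to exclude. A common example is generating a sequence that excludes weekends.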

from pandas.tseries.offsets import BDay
dates = pd.date_range(start='2021-01-01', end='2021-01-31', freq=BDay())
print(dates)

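This generates a DatetimeIndex that excludes Saturdays and Sundays, representing the typical business days in a month. Note that BDay knows nothing about public holidays, so Friday, January 1, 2021 is still included.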
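When specific holidays should be skipped as well, one option is the CustomBusinessDay offset. The sketch below assumes a hand-written, illustrative holiday list (New Year's Day and January 18, 2021); a real project would supply its own list or calendar.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Illustrative holiday list only; not an authoritative calendar.
holidays = ['2021-01-01', '2021-01-18']

# Business days (Mon-Fri) that also skip the listed holidays.
custom_bday = CustomBusinessDay(holidays=holidays)
workdays = pd.date_range(start='2021-01-01', end='2021-01-31', freq=custom_bday)

print(workdays)
print(len(workdays))
```

January 2021 has 21 weekdays, so removing the two listed holidays leaves 19 dates.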

Generating Series with Periods

You can also generate a sequence by specifying the number of periods instead of an end date. This can be particularly useful when you know how many days you need but not the exact end date.

dates = pd.date_range(start='2021-01-01', freq='D', periods=10)
print(dates)

This command will generate the first 10 days of January 2021.
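The periods argument can be combined with an end date as well. As a brief sketch, counting backwards from an end date gives the last N days, and passing start, end, and periods together (without freq) gives evenly spaced timestamps; the variable names are illustrative.

```python
import pandas as pd

# Ten daily dates ending on January 10, 2021.
last_ten = pd.date_range(end='2021-01-10', periods=10, freq='D')
print(last_ten)

# Four evenly spaced timestamps between the two endpoints.
evenly_spaced = pd.date_range(start='2021-01-01', end='2021-01-10', periods=4)
print(evenly_spaced)
```

In the second call the nine-day span is divided into three equal steps, so the timestamps fall on January 1, 4, 7, and 10.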

Conclusion

Throughout this tutorial, we have seen various ways to generate date ranges using the Pandas library. Whether your use case is simple or complex, Pandas provides a robust set of tools for efficiently creating and manipulating time series data. Understanding these capabilities allows for more flexible data analysis and preprocessing tasks, catering to a wide range of applications.