Pandas to_timedelta() function: Explained with examples

Updated: February 23, 2024 By: Guest Contributor

Introduction

The to_timedelta() function in Pandas converts a scalar, array, list, or Series from a recognized timedelta format into a Timedelta type. A Timedelta represents a duration, that is, the difference between two points in time, expressed in days, hours, minutes, seconds, and fractions of a second down to nanoseconds. This is especially useful in time series and financial analysis, where precise time calculations matter. In this tutorial, we’ll explore the to_timedelta() function through a series of examples, starting from the basics and moving towards more complex applications.

Syntax & Parameters

The to_timedelta() function converts a variety of inputs (such as strings, integers, floats, lists, or Series) into a Timedelta type. Its basic syntax is as follows:

pandas.to_timedelta(arg, unit=None, errors='raise')
  • arg: the input to be converted. Can be a list, scalar, array, or Series.
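  • unit: the unit of arg when arg is numeric. If omitted, numbers are treated as nanoseconds (‘ns’). Other accepted values include ‘W’, ‘D’ (or ‘days’), ‘h’ (or ‘hours’), ‘min’ (or ‘minutes’), ‘s’ (or ‘seconds’), ‘ms’ (or ‘milliseconds’), and ‘us’ (or ‘microseconds’).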
  • errors: how invalid input is handled. ‘raise’ (the default) raises an exception, ‘coerce’ converts invalid values to NaT (Not a Time), and ‘ignore’ returns the input unchanged. Note that ‘ignore’ is deprecated as of pandas 2.2; catching the exception explicitly is the recommended alternative.
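As a minimal sketch of how the unit parameter behaves, the snippet below writes the same 90-minute duration with three different unit aliases and checks that they are equal. The variable names are illustrative only.

```python
import pandas as pd

# The same duration expressed with different unit aliases.
ninety_minutes = pd.to_timedelta(90, unit="min")
one_and_half_hours = pd.to_timedelta(1.5, unit="h")
in_seconds = pd.to_timedelta(5400, unit="s")

# All three describe 1 hour 30 minutes, so they compare equal.
all_equal = ninety_minutes == one_and_half_hours == in_seconds
print(all_equal)

# With no unit given, a bare number is read as nanoseconds.
one_microsecond = pd.to_timedelta(1000)
print(one_microsecond)
```

Passing the unit explicitly for numeric input is usually the safer habit, since the nanosecond default is rarely what raw numbers actually mean.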

Basic Examples

Let’s start with the simplest application, converting a single string:

import pandas as pd

# Convert a string to timedelta
result = pd.to_timedelta('1 days 06:05:01.00003')
print(result)

Output:

1 days 06:05:01.000030

This code converts the string ‘1 days 06:05:01.00003’ into a Timedelta object.
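To see what the parsed value contains, the resulting Timedelta can be inspected. The following is a small sketch using the components attribute and the total_seconds() method; the variable names are chosen for illustration.

```python
import pandas as pd

td = pd.to_timedelta("1 days 06:05:01.00003")

# components breaks the duration into named fields.
parts = td.components
print(parts.days, parts.hours, parts.minutes, parts.seconds, parts.microseconds)

# total_seconds() gives the whole duration as a float.
total = td.total_seconds()
print(total)

# The same duration can also be spelled out with unit words.
spelled_out = pd.to_timedelta("1 days 6 hours 5 minutes 1 seconds")
print(td - spelled_out)  # only the fractional part remains
```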

Converting Lists and Series

Now, let’s convert a list of time strings into Timedelta objects:

import pandas as pd

# List of time strings
list_of_times = ['1 days', '2 hours', '30 minutes']

# Convert list to Timedelta objects
result = pd.to_timedelta(list_of_times)
print(result)

Output:

TimedeltaIndex(['1 days 00:00:00', '0 days 02:00:00', '0 days 00:30:00'], dtype='timedelta64[ns]', freq=None)

This converts a list of various time strings into a TimedeltaIndex, which can be used for time calculations and manipulations.
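The same call also accepts a Series, in which case a Series comes back rather than a TimedeltaIndex, with the original index and name kept. A minimal sketch, with a made-up Series name and index labels:

```python
import pandas as pd

# Hypothetical Series of duration strings with custom index labels.
durations = pd.Series(
    ["1 days", "2 hours", "30 minutes"],
    index=["a", "b", "c"],
    name="duration",
)

converted = pd.to_timedelta(durations)
print(converted)

# Timedelta Series support aggregation like any numeric column.
total = converted.sum()
print(total)
```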

Applying Unit Parameter

Specifying units allows for the conversion of integers or floats into Timedeltas. This is useful for cases where time durations are not explicitly stated in a standard timedelta format:

import pandas as pd

# Convert integer to Timedelta with 'hours' unit
result = pd.to_timedelta(5, unit='h')
print(result)

Output:

0 days 05:00:00

In this example, the integer 5 is converted into a Timedelta representing 5 hours.
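The unit parameter applies to every element when the input is a list of numbers, and floats are allowed. A short sketch, assuming the values are durations in minutes:

```python
import pandas as pd

# Hypothetical durations in minutes; 45.5 means 45 minutes 30 seconds.
minutes = [15, 45.5, 120]

result = pd.to_timedelta(minutes, unit="min")
print(result)
```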

Handling Errors

Understanding how to handle errors is crucial while using to_timedelta(). Let’s see an example of using the ‘coerce’ option:

import pandas as pd

# Example of error handling with 'coerce'
result = pd.to_timedelta('Not a time string', errors='coerce')
print(result)

Output:

NaT

In cases where input cannot be converted to a Timedelta, using ‘coerce’ returns NaT, avoiding a crash in the program due to invalid inputs.
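The difference between ‘coerce’ and the default ‘raise’ is easier to see on a list that mixes valid and invalid entries. The sketch below uses made-up input; with ‘raise’, pandas raises a ValueError for the unparseable string, which can be caught explicitly.

```python
import pandas as pd

raw = ["1 days", "not a duration", "45 minutes"]

# errors='coerce' keeps the valid entries and marks the rest as NaT.
cleaned = pd.to_timedelta(raw, errors="coerce")
missing = cleaned.isna().tolist()
print(missing)  # [False, True, False]

# With the default errors='raise', the invalid entry raises ValueError.
try:
    pd.to_timedelta(raw)
    raised = False
except ValueError as exc:
    raised = True
    print(f"Conversion failed: {exc}")
```

Coercing is convenient for messy data, but it silently hides bad rows, so checking isna() afterwards is a reasonable precaution.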

Advanced Examples

Moving on to more sophisticated uses of to_timedelta(), let’s explore arithmetic operations on Timedelta objects:

import pandas as pd

# Converting strings to Timedelta
start = pd.to_timedelta('1 days')
end = pd.to_timedelta('2 days')

# Arithmetic operations
duration = end - start
print(duration)

Output:

1 days 00:00:00

This example demonstrates that we can perform arithmetic operations between Timedeltas, such as finding the difference between two durations.
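Timedeltas also combine with timestamps and plain numbers. The following sketch, with illustrative values, shifts a Timestamp by a duration, scales a duration, and divides one duration by another to get a plain ratio.

```python
import pandas as pd

shift = pd.to_timedelta("2 hours 30 minutes")
started = pd.Timestamp("2021-01-01 08:00:00")

# Timestamp + Timedelta gives a new Timestamp.
finished = started + shift
print(finished)  # 2021-01-01 10:30:00

# Timedelta * number gives a scaled Timedelta.
doubled = shift * 2
print(doubled)  # 0 days 05:00:00

# Timedelta / Timedelta gives a float (here, the duration in hours).
hours = shift / pd.to_timedelta("1 hour")
print(hours)  # 2.5
```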

Application in DataFrames

Finally, let’s look at timedeltas in a DataFrame context. Here the Start and End columns are parsed with to_datetime(), and subtracting one from the other produces a timedelta column holding the duration of each event:

import pandas as pd

df = pd.DataFrame({
    'Start': ['2021-01-01 08:00:00', '2021-01-02 08:00:00'],
    'End': ['2021-01-01 09:00:00', '2021-01-02 10:30:00']
})
df['Start'] = pd.to_datetime(df['Start'])
df['End'] = pd.to_datetime(df['End'])
df['Duration'] = df['End'] - df['Start']
print(df)

Output:

                Start                 End        Duration
0 2021-01-01 08:00:00 2021-01-01 09:00:00 0 days 01:00:00
1 2021-01-02 08:00:00 2021-01-02 10:30:00 0 days 02:30:00

In this DataFrame example, subtracting the start times from the end times yields a Duration column of Timedelta values, showing how timedeltas arise in a practical context.
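The reverse direction is where to_timedelta() itself is useful in a DataFrame: when durations arrive as plain numbers, they can be converted and added to a start time. The sketch below assumes a hypothetical Minutes column that reproduces the same two events.

```python
import pandas as pd

df = pd.DataFrame({
    "Start": pd.to_datetime(["2021-01-01 08:00:00", "2021-01-02 08:00:00"]),
    "Minutes": [60, 150],  # hypothetical column: event length in minutes
})

# Convert the numeric column to timedeltas, then derive the end times.
df["Duration"] = pd.to_timedelta(df["Minutes"], unit="min")
df["End"] = df["Start"] + df["Duration"]

# The .dt accessor exposes timedelta methods column-wise.
df["Hours"] = df["Duration"].dt.total_seconds() / 3600
print(df)
```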

Conclusion

The to_timedelta() function in Pandas is a versatile tool for time-based calculations. It converts strings and numbers into Timedelta values, handles invalid input through the errors parameter, and produces results that support arithmetic directly. With these pieces in place, you can work with durations in time series data reliably.