Pandas + FastAPI: How to serve a DataFrame as a REST API (with pagination)

Updated: February 19, 2024 By: Guest Contributor

Overview

FastAPI and Pandas together let Python developers build REST APIs over tabular data with very little code. This tutorial walks through a project that serves a Pandas DataFrame through FastAPI, with pagination so that large datasets are returned in manageable pages rather than in a single response.

Introduction to FastAPI and Pandas

FastAPI is a modern, fast web framework for building APIs with Python, based on standard Python type hints. Pandas is a fast, flexible, open-source data analysis and manipulation library for Python. Combining the two lets you develop data-driven APIs quickly.

Setting Up Your Environment

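Ensure you have Python 3 and pip installed. Python 3.6 is the historical minimum for FastAPI; current releases of FastAPI and Pandas require a newer interpreter, so check each project's installation notes. Then install FastAPI, Uvicorn, and Pandas with: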

pip install fastapi uvicorn pandas

Creating a Basic FastAPI App with a Pandas DataFrame

Let’s start by creating a basic API that uses a Pandas DataFrame. You’ll need to import FastAPI and Pandas, then initialize your app and DataFrame.

from fastapi import FastAPI
import pandas as pd

app = FastAPI()

data = {'Name': ['John', 'Emily', 'Charles', 'Diana'],
        'Age': [28, 22, 35, 46]}
df = pd.DataFrame(data)

@app.get("/data")
def read_data():
    return df.to_dict(orient='records')

Save the code as main.py, then launch the app with uvicorn main:app --reload. You should now be able to access your API at http://127.0.0.1:8000/data and see the DataFrame converted to a list of dictionaries, one per row.
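To see what the endpoint returns without starting the server, the conversion can be tried on its own. This is a minimal sketch that reuses the sample data from the app; the variable name records is only illustrative.

```python
import pandas as pd

# Same sample data as in the app above
data = {'Name': ['John', 'Emily', 'Charles', 'Diana'],
        'Age': [28, 22, 35, 46]}
df = pd.DataFrame(data)

# orient='records' yields one dictionary per row, keyed by column name
records = df.to_dict(orient='records')
print(len(records))   # 4
print(records[0])     # {'Name': 'John', 'Age': 28}
```

FastAPI serializes this list to a JSON array of objects, which is what the browser shows at /data.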

Adding Pagination

Returning an entire large DataFrame in one response is slow and wastes memory and bandwidth. Pagination lets clients fetch the data in fixed-size pages instead. Here’s how to implement it in your FastAPI app.

from fastapi import FastAPI, Query
import pandas as pd

app = FastAPI()

data = {...}  # Your DataFrame data

df = pd.DataFrame(data)

@app.get("/data")
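def read_data(
    page: int = Query(1, ge=1),            # 1-based page number; 0 and negatives are rejected
    limit: int = Query(10, ge=1, le=100),  # rows per page; the cap of 100 is an arbitrary choice
):
    # Translate the page number into row positions for iloc
    start = (page - 1) * limit
    end = start + limit  # iloc slices exclude the end position
    return df.iloc[start:end].to_dict(orient='records')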

This code will allow clients to request specific pages of your dataset, using query parameters like ?page=3&limit=5.
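Clients usually also need to know how many pages exist. The following is a minimal sketch of the same slicing arithmetic wrapped in a helper that adds that metadata. The function name paginate and the response keys (total, total_pages, items) are assumptions made for illustration, not part of FastAPI or Pandas.

```python
import math
import pandas as pd

def paginate(df: pd.DataFrame, page: int = 1, limit: int = 10) -> dict:
    """Return one page of rows plus metadata (hypothetical response shape)."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    total = len(df)
    start = (page - 1) * limit   # position of the first row on this page
    end = start + limit          # iloc slices exclude the end position
    return {
        "page": page,
        "limit": limit,
        "total": total,
        "total_pages": math.ceil(total / limit),
        "items": df.iloc[start:end].to_dict(orient="records"),
    }

# 12 sample rows, so ?page=3&limit=5 lands on a short final page
sample = pd.DataFrame({"id": list(range(1, 13))})
result = paginate(sample, page=3, limit=5)
print(result["total_pages"])                    # 3
print([row["id"] for row in result["items"]])   # [11, 12]
```

Inside the endpoint, the body could then become return paginate(df, page, limit). A page number past the end yields an empty items list rather than an error, because iloc slices tolerate out-of-range bounds.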

More Sample Data

You can create the data by hand (as in the preceding code examples, which build the DataFrame from a dictionary) or load a CSV dataset of your own to practice.
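Loading CSV data into a DataFrame might look like the sketch below. To keep it self-contained, an in-memory string stands in for a file; the CSV contents are simply the sample rows from earlier, and with a real file you would pass its path to pd.read_csv instead.

```python
import io
import pandas as pd

# Stand-in for a file on disk (assumption: a header row plus comma-separated values)
csv_text = """Name,Age
John,28
Emily,22
Charles,35
Diana,46
"""

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)   # (4, 2)
```

Loading the data once at module level, as here, keeps request handling fast, though changes to the underlying file are then only picked up when the app is restarted.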

Advanced Techniques

For more complex datasets or additional functionality, consider these enhancements:

  • Filtering: Allow users to filter data based on specific fields.
  • Sorting: Enable sorting of the DataFrame before serving.
  • Aggregation: Apply statistical or aggregation operations to the data before serving.
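The three enhancements above can be sketched with plain Pandas operations. This is one possible approach under stated assumptions, not a definitive design: the helper names select_rows and age_summary and the parameters min_age, sort_by, and descending are hypothetical, and the data is the sample DataFrame from earlier.

```python
from typing import Optional
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily', 'Charles', 'Diana'],
                   'Age': [28, 22, 35, 46]})

def select_rows(frame: pd.DataFrame,
                min_age: Optional[int] = None,
                sort_by: Optional[str] = None,
                descending: bool = False) -> pd.DataFrame:
    """Return the rows after optional filtering and sorting (hypothetical helper)."""
    result = frame
    if min_age is not None:
        # Filtering: keep rows whose Age is at least min_age
        result = result[result['Age'] >= min_age]
    if sort_by is not None:
        # Sorting: reject unknown column names with a clear error
        if sort_by not in result.columns:
            raise ValueError(f"unknown column: {sort_by}")
        result = result.sort_values(sort_by, ascending=not descending)
    return result

def age_summary(frame: pd.DataFrame) -> dict:
    """Aggregation: convert NumPy scalars to plain Python types for JSON."""
    return {
        'count': int(frame['Age'].count()),
        'mean': float(frame['Age'].mean()),
        'max': int(frame['Age'].max()),
    }

rows = select_rows(df, min_age=25, sort_by='Age', descending=True)
print(rows['Name'].tolist())   # ['Diana', 'Charles', 'John']
print(age_summary(df))         # {'count': 4, 'mean': 32.75, 'max': 46}
```

In an endpoint, min_age and sort_by would typically arrive as optional query parameters, with filtering and sorting applied before the pagination slice so that page boundaries reflect the filtered, ordered data.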

Conclusion

Combining Pandas with FastAPI to serve a DataFrame as a REST API, particularly with pagination, provides a powerful solution for data-driven projects. By starting with the basics and gradually incorporating advanced features, developers can craft efficient, flexible, and maintainable APIs.