Pandas + FastAPI: How to serve a DataFrame as a REST API (with pagination)

Last updated: February 19, 2024

Overview

FastAPI and Pandas work well together for building REST APIs on top of tabular data. This tutorial walks through a small project that serves a Pandas DataFrame through a FastAPI endpoint, then adds pagination so that large datasets can be returned one page at a time instead of in a single response.

Introduction to FastAPI and Pandas

FastAPI is a modern, fast web framework for building APIs with Python, based on standard Python type hints. Pandas is a fast, flexible, open-source data analysis and manipulation library for Python. Combined, they let you expose the contents of a DataFrame over HTTP with very little code.

Setting Up Your Environment

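Ensure you have a recent version of Python 3 and pip installed (Python 3.6 has reached end of life and is no longer supported by current FastAPI or Pandas releases). Install FastAPI, Uvicorn, and Pandas with: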

pip install fastapi uvicorn pandas
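
To confirm that the installation worked, a quick check along these lines can help. This is a minimal sketch that uses only the standard library; the helper name installed_versions is made up for illustration and is not part of any of the three packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing.

    Hypothetical helper for this tutorial, not part of FastAPI or Pandas.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(["fastapi", "uvicorn", "pandas"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

If any package is reported as not installed, rerun the pip command above in the same environment you use to run the script.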

Creating a Basic FastAPI App with a Pandas DataFrame

Let’s start by creating a basic API that uses a Pandas DataFrame. You’ll need to import FastAPI and Pandas, then initialize your app and DataFrame.

from fastapi import FastAPI
import pandas as pd

app = FastAPI()

data = {'Name': ['John', 'Emily', 'Charles', 'Diana'],
        'Age': [28, 22, 35, 46]}
df = pd.DataFrame(data)

@app.get("/data")
def read_data():
    return df.to_dict(orient='records')

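Save the code as main.py, then launch the app with uvicorn main:app --reload (main refers to the file name and app to the FastAPI instance). You should now be able to access your API at http://127.0.0.1:8000/data and see the DataFrame converted to a list of dictionaries, one per row.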
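
To see what the endpoint returns without starting the server, you can run the conversion on its own. The sketch below reuses the sample data from the example above and only needs Pandas; FastAPI serializes the same list to a JSON array in the HTTP response.

```python
import pandas as pd

# Same sample data as in the basic app above
data = {'Name': ['John', 'Emily', 'Charles', 'Diana'],
        'Age': [28, 22, 35, 46]}
df = pd.DataFrame(data)

# orient='records' produces one dictionary per row, keyed by column name
records = df.to_dict(orient='records')

for row in records:
    print(row)
# {'Name': 'John', 'Age': 28}
# {'Name': 'Emily', 'Age': 22}
# {'Name': 'Charles', 'Age': 35}
# {'Name': 'Diana', 'Age': 46}
```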

Adding Pagination

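Returning every row in a single response becomes slow and memory-hungry as the dataset grows. Pagination lets clients fetch the data in fixed-size pages instead. Here’s how to implement it in your FastAPI app.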

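from fastapi import FastAPI, Query
import pandas as pd

app = FastAPI()

data = {...}  # Placeholder: replace with your own DataFrame data

df = pd.DataFrame(data)

@app.get("/data")
def read_data(
    # 1-based page number; ge=1 rejects 0 and negative values,
    # which would otherwise slice from the end of the DataFrame
    page: int = Query(1, ge=1),
    # Rows per page; the upper bound of 100 is an arbitrary cap, adjust as needed
    limit: int = Query(10, ge=1, le=100),
):
    start = (page - 1) * limit  # position of the first row on this page
    end = start + limit         # slice end (exclusive)
    return df.iloc[start:end].to_dict(orient='records')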

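This code allows clients to request specific pages of your dataset using query parameters like ?page=3&limit=5. With those values, start is (3 - 1) * 5 = 10 and end is 15, so the response contains the 11th through 15th rows. Requesting a page beyond the end of the data returns an empty list.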
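
Clients usually also need to know how many rows and pages exist in total. One possible approach, sketched below, is to wrap the slicing logic in a small function that returns the rows together with some metadata. The function name paginate and the response keys (page, limit, total, pages, items) are assumptions made for illustration, not a FastAPI or Pandas convention.

```python
import math

import pandas as pd


def paginate(df: pd.DataFrame, page: int = 1, limit: int = 10) -> dict:
    """Return one page of rows plus metadata.

    Hypothetical helper for this tutorial, not part of FastAPI or Pandas.
    """
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    total = len(df)
    start = (page - 1) * limit  # position of the first row on this page
    end = start + limit         # slice end (exclusive)
    return {
        "page": page,
        "limit": limit,
        "total": total,
        "pages": math.ceil(total / limit),
        "items": df.iloc[start:end].to_dict(orient="records"),
    }


# Made-up example data: 12 rows, so limit=5 gives pages of 5, 5, and 2 rows
df = pd.DataFrame({"id": range(1, 13)})

result = paginate(df, page=3, limit=5)
print(result["pages"])  # 3
print(result["items"])  # [{'id': 11}, {'id': 12}]
```

An endpoint could return the result of such a function directly, since it is a plain dictionary. Keeping the logic outside the route function also makes it easy to test without running a server.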

More Sample Data

You can create sample data by hand (as in the preceding code examples) or load a larger dataset, for example from a CSV file, to practice pagination on more rows.

Advanced Techniques

For more complex datasets or additional functionality, consider these enhancements:

  • Filtering: Allow users to filter data based on specific fields.
  • Sorting: Enable sorting of the DataFrame before serving.
  • Aggregation: Apply statistical or aggregation operations to the data before serving.
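
The three enhancements above can be sketched with plain Pandas operations. In the example below, the helper names query_frame and age_summary and the parameters min_age, sort_by, and descending are hypothetical choices for illustration; in a FastAPI app they would typically map to optional query parameters on the endpoint.

```python
from typing import Optional

import pandas as pd

# Same sample data as in the basic app
df = pd.DataFrame({
    "Name": ["John", "Emily", "Charles", "Diana"],
    "Age": [28, 22, 35, 46],
})


def query_frame(
    df: pd.DataFrame,
    min_age: Optional[int] = None,
    sort_by: Optional[str] = None,
    descending: bool = False,
) -> pd.DataFrame:
    """Filter and sort a DataFrame (hypothetical helper for this tutorial)."""
    result = df
    if min_age is not None:
        # Filtering: keep only rows whose Age is at least min_age
        result = result[result["Age"] >= min_age]
    if sort_by is not None:
        # Sorting: reject unknown column names instead of raising a KeyError
        if sort_by not in result.columns:
            raise ValueError(f"unknown column: {sort_by}")
        result = result.sort_values(sort_by, ascending=not descending)
    return result


def age_summary(df: pd.DataFrame) -> dict:
    """Aggregation: basic statistics, converted to plain Python numbers."""
    return {
        "count": int(df["Age"].count()),
        "mean": float(df["Age"].mean()),
        "max": int(df["Age"].max()),
    }


rows = query_frame(df, min_age=25, sort_by="Age", descending=True)
print(rows.to_dict(orient="records"))
# [{'Name': 'Diana', 'Age': 46}, {'Name': 'Charles', 'Age': 35}, {'Name': 'John', 'Age': 28}]

print(age_summary(df))
# {'count': 4, 'mean': 32.75, 'max': 46}
```

Applying filters and sorting before slicing the page keeps page numbers consistent with what the client asked for.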

Conclusion

Combining Pandas with FastAPI to serve a DataFrame as a REST API, particularly with pagination, provides a powerful solution for data-driven projects. By starting with the basics and gradually incorporating advanced features, developers can craft efficient, flexible, and maintainable APIs.
