Pandas: How to read and update Google Sheet files (2 examples)

Updated: February 22, 2024 By: Guest Contributor

Introduction

Google Sheets is a widely used tool for storing and sharing data collaboratively. Connecting it to Python, using Pandas together with the gspread library, lets you read, analyze, and update that data programmatically. This tutorial walks through reading and updating Google Sheets files with Pandas, illustrated with two practical examples.

Prerequisites

  • Python 3.x installed
  • Pandas library installed
  • Google account
  • Access to Google Sheets API
  • gspread and google-auth libraries installed (see Example 1, Step 1)

Setting up the Google Sheets API

Before diving into the code, you need to set up the Google Sheets API and create a project in the Google Developers Console:

  1. Go to the Google Developers Console, create a new project, and name it.
  2. Search for the Google Sheets API, enable it for your project, and create credentials (service account key). Also enable the Google Drive API, which gspread uses to look up a spreadsheet by its name.
  3. Download the JSON file containing your credentials and keep it safe as you will need it to authenticate your requests.
  4. Share your Google sheet with the email address provided in your JSON credentials, giving it editor rights.
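The email address needed in step 4 is stored in the client_email field of the downloaded JSON key. The following is a minimal sketch for reading it, assuming a standard service account key file; the function name service_account_email and the file path are hypothetical and not part of any library.

```python
import json


def service_account_email(key_path):
    """Return the client_email field from a service account JSON key file."""
    with open(key_path, encoding="utf-8") as fh:
        key = json.load(fh)
    # Service account keys store the account's address under "client_email".
    return key["client_email"]


# Usage (hypothetical path; point it at your own key file):
# print(service_account_email("path/to/your/credentials.json"))
```

Share the sheet with the printed address, and keep the key file itself out of version control.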

Example 1: Reading Google Sheets Data into Pandas DataFrame

Step 1: Install the necessary libraries by running:

pip install --upgrade gspread google-auth pandas

Step 2: Authenticate and connect to Google Sheets:

import pandas as pd
import gspread
from google.oauth2.service_account import Credentials

# Define the scope
scope = [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive'
]

# Add credentials
creds = Credentials.from_service_account_file('path/to/your/credentials.json', scopes=scope)

# Authenticate and create the client
client = gspread.authorize(creds)

# Open the spreadsheet by name and select its first worksheet
sheet = client.open('Your Sheet Name').sheet1

Step 3: Read data into a Pandas DataFrame:

data = sheet.get_all_records()
df = pd.DataFrame(data)
print(df)

This simple sequence of instructions allows you to load data from a Google Sheet into a pandas DataFrame, making it readily available for analysis, visualization, or further processing.
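get_all_records() returns a list of dictionaries, one per row, keyed by the header row. Blank cells typically come back as empty strings, so a numeric column with gaps may load with the object dtype. The sketch below shows one way to clean that up; the records list is made-up sample data standing in for the real return value, and the column names are hypothetical.

```python
import pandas as pd

# Made-up stand-in for the list of dicts returned by sheet.get_all_records().
records = [
    {"Name": "Alice", "Score": 90},
    {"Name": "Bob", "Score": ""},      # blank cell read as an empty string
    {"Name": "Carol", "Score": 75},
]

df = pd.DataFrame(records)

# Coerce the mixed column to numbers; values that cannot be parsed become NaN.
df["Score"] = pd.to_numeric(df["Score"], errors="coerce")

print(df.dtypes)
print(df["Score"].mean())  # NaN entries are skipped by default
```

Coercing explicitly keeps later aggregations from failing on mixed string and number values.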

Example 2: Updating Google Sheets from Pandas DataFrame

Once your data is processed and you want to update or append to your Google Sheet, follow these steps:

Step 1: Prepare the DataFrame df you wish to upload:

import pandas as pd
import gspread

# Sample DataFrame to upload
df = pd.DataFrame({
    'Column1': [1, 2, 3],
    'Column2': ['a', 'b', 'c']
})

Step 2: Authenticate and connect to Google Sheets (if not already done as per Example 1).

Step 3: Update your sheet:

sheet.update([df.columns.values.tolist()] + df.values.tolist())

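This writes the header row and the data starting at cell A1, overwriting whatever is in that range. Cells outside the range are left untouched, so if the sheet previously held more rows or columns than the DataFrame, call sheet.clear() first to replace the entire content. If you only need to append data, use sheet.append_rows(df.values.tolist()) instead.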
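The values are sent to the Sheets API as JSON, so NaN and other missing values in the DataFrame can cause the request to fail. Below is a minimal sketch of a helper that builds the header row plus data rows with missing values blanked out. The name dataframe_to_rows is hypothetical (it is not part of pandas or gspread), and the commented-out calls assume the sheet object from Example 1.

```python
import pandas as pd


def dataframe_to_rows(df):
    """Return [header] + data rows as plain lists, with missing values as ''."""
    # Cast to object first so NaN/None can be swapped for an empty string.
    cleaned = df.astype(object).where(df.notna(), "")
    return [cleaned.columns.tolist()] + cleaned.values.tolist()


df = pd.DataFrame({
    "Column1": [1, None, 3],
    "Column2": ["a", "b", None],
})
rows = dataframe_to_rows(df)
print(rows)

# Assuming `sheet` is the worksheet opened in Example 1:
# sheet.clear()        # remove old cells so no stale data is left behind
# sheet.update(rows)   # write the header and data starting at cell A1
```

Other types that are not JSON-serializable, such as timestamps, would need a similar conversion (for example to strings) before uploading.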

Conclusion

Integrating Pandas with Google Sheets lets you combine a shared, collaborative data store with programmatic analysis. By following these examples, you should now be able to read from and write to Google Sheets using Python and Pandas. From here, the rest of the Pandas toolkit for data analysis and manipulation is available to you, which makes it practical to automate many routine data handling tasks.

Keep exploring and experimenting with different functions both in Pandas and the Google Sheets API to fully leverage the potential of this integration.