Sling Academy
Home/Pandas/Pandas: How to access and modify Excel files in One Drive (3 examples)

Pandas: How to access and modify Excel files in One Drive (3 examples)

Last updated: February 22, 2024

Overview

In the age of cloud storage, accessing and modifying Excel files directly from platforms like OneDrive using Python’s Pandas library has become increasingly beneficial, especially for remote teams and automated data processing systems. This tutorial will walk you through three practical examples of how to access and modify Excel files stored in OneDrive using Pandas.

Preparation (Important)

Before we dive into the examples, ensure you have Pandas installed. If you haven’t, you can install it using pip:

pip install pandas

The first step is to access the Excel file stored in OneDrive. For this, we will use Microsoft Graph API, which requires authentication. Therefore, you need to register your application with Azure Active Directory (Azure AD) to obtain the necessary access tokens. Visit the Azure Portal, register your app, and get your client ID, secret, and tenant ID.

Example 1: Authenticating and Reading an Excel File

In this example, we’ll see how to authenticate using Microsoft Graph API and read an Excel file in OneDrive.

import requests
import pandas as pd
from io import BytesIO

# Your Azure AD app credentials
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'
TENANT_ID = 'YOUR_TENANT_ID'

# Construct the login URL
AUTH_URL = f'https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token'
RESOURCE_URL = 'https://graph.microsoft.com/v1.0'
REDIRECT_URI = 'YOUR_REDIRECT_URI'

# Prepare the authentication request
body = {
    'client_id': CLIENT_ID,
    'scope': 'Files.ReadWrite.All offline_access',
    'client_secret': CLIENT_SECRET,
    'grant_type': 'client_credentials'
}

# Request the token
response = requests.post(AUTH_URL, data=body)
token = response.json().get('access_token')

# Use the token to access an Excel file
file_url = f'{RESOURCE_URL}/me/drive/root:/YourExcelFile.xlsx:/content'
headers = {'Authorization': 'Bearer ' + token}
response = requests.get(file_url, headers=headers)

# Load the Excel file into a Pandas DataFrame
excel_file = BytesIO(response.content)
df = pd.read_excel(excel_file)

# Display the DataFrame
df.head()

This code authenticates your application with Azure AD, retrieves the access token, and uses it to access the Excel file. Panda’s read_excel function then loads the file into a DataFrame.

Example 2: Modifying an Excel File

Upon successfully reading an Excel file, you might want to modify its contents. This section demonstrates how to add a new row to an existing Excel file and then save it back to OneDrive.

import openpyxl

# Modify the DataFrame
df = df.append({'Column1': 'New Data', 'Column2': 'More New Data'}, ignore_index=True)

# Save the modified DataFrame to an Excel file
modified_excel = 'ModifiedExcelFile.xlsx'
df.to_excel(modified_excel, index=False)

# Now, upload it back to OneDrive
upload_url = f'{RESOURCE_URL}/me/drive/root:/ModifiedExcelFile.xlsx:/content'
files = {'file': open(modified_excel, 'rb')}
response = requests.put(upload_url, headers=headers, files=files)

# Check if the file was successfully uploaded
if response.status_code == 200:
    print('File successfully uploaded to OneDrive')

This example appends a new row to the DataFrame and then saves it as a new Excel file. Lastly, it uploads the modified file back to OneDrive.

Example 3: Creating a New Excel File

Lastly, let’s look at how to create a new Excel file from scratch and upload it to OneDrive. This method is especially useful for generating reports or exporting data automatically.

import dataframe_image as dfi

# Generate a new DataFrame
data = {'Column1': ['Value1', 'Value2'], 'Column2': ['Value3', 'Value4']}
new_df = pd.DataFrame(data)

# Save the DataFrame as an Excel file
new_excel_file = 'NewExcelFile.xlsx'
new_df.to_excel(new_excel_file, index=False)

# Upload the new Excel file to OneDrive
upload_url = f'{RESOURCE_URL}/me/drive/root:/NewExcelFile.xlsx:/content'
files = {'file': open(new_excel_file, 'rb')}
response = requests.put(upload_url, headers=headers, files=files)

# Verify the upload was successful
if response.status_code == 200:
    print('New Excel file successfully uploaded to OneDrive')

This code creates a new DataFrame, saves it as an Excel file, and then uploads it to OneDrive. Replace ‘Column1’, ‘Column2’, ‘Value1’, etc., with your actual data.

Conclusion

Integrating Pandas with OneDrive for accessing and modifying Excel files opens up a world of possibilities for data analysis and management. Whether you’re automating reports, managing datasets, or collaborating with remote teams, the combination of Pandas, Azure Active Directory, and Microsoft Graph API makes the process efficient and scalable.

Remember, while this tutorial demonstrated basic operations, the potential applications are vast. Explore more advanced features of the Microsoft Graph API and Pandas to fully harness the power of cloud-based Excel file management.

Next Article: How to Clean and Preprocess Text Data with Pandas (3 examples)

Previous Article: Pandas: How to read and update Google Sheet files (2 examples)

Series: DateFrames in Pandas

Pandas

You May Also Like

  • How to Use Pandas Profiling for Data Analysis (4 examples)
  • How to Handle Large Datasets with Pandas and Dask (4 examples)
  • Pandas – Using DataFrame.pivot() method (3 examples)
  • Pandas: How to ‘FULL JOIN’ 2 DataFrames (3 examples)
  • Pandas: Select columns whose names start/end with a specific string (4 examples)
  • 3 ways to turn off future warnings in Pandas
  • How to Integrate Pandas with Apache Spark
  • How to Use Pandas for Web Scraping and Saving Data (2 examples)
  • How to Clean and Preprocess Text Data with Pandas (3 examples)
  • Pandas – Using Series.replace() method (3 examples)
  • Pandas json_normalize() function: Explained with examples
  • Pandas: Reading CSV and Excel files from AWS S3 (4 examples)
  • Using pandas.Series.rank() method (4 examples)
  • Pandas: Dropping columns whose names contain a specific string (4 examples)
  • Pandas: How to print a DataFrame without index (3 ways)
  • Fixing Pandas NameError: name ‘df’ is not defined
  • Pandas – Using DataFrame idxmax() and idxmin() methods (4 examples)
  • Pandas FutureWarning: ‘M’ is deprecated and will be removed in a future version, please use ‘ME’ instead
  • Pandas: Checking equality of 2 DataFrames (element-wise)