Introduction
When working with data in a Python application, manipulating and storing CSV files is common. In this tutorial, we’ll walk through the steps to save CSV data into databases with SQLAlchemy, Python’s SQL toolkit and ORM.
Getting Started with SQLAlchemy
To begin, we need to set up SQLAlchemy. First, ensure that SQLAlchemy is installed in your Python environment. You can do this using pip:
pip install SQLAlchemy
Next, import the necessary modules and configure your database URI:
from sqlalchemy import create_engine, MetaData
engine = create_engine('sqlite:///your-database.db')
metadata = MetaData()
Defining the Model
SQLAlchemy models represent tables and columns in your database:
from sqlalchemy import Table, Column, String, Integer
users = Table('users', metadata,
Column('id', Integer, primary_key=True),
Column('name', String),
Column('age', Integer))
metadata.create_all(engine)
Reading from CSV
To handle CSV files, we’ll use Python’s built-in csv
module:
import csv
with open('users.csv', mode='r') as file:
csv_reader = csv.reader(file)
header = next(csv_reader) # Skip the header row
for row in csv_reader:
print(row)
Inserting Data into the Database
You can insert data into your database using SQLAlchemy’s insert()
function:
from sqlalchemy import insert
with open('users.csv', mode='r') as file:
csv_reader = csv.DictReader(file)
for row in csv_reader:
with engine.connect() as connection:
result = connection.execute(insert(users), row)
print(result.inserted_primary_key)
Using Sessions for Transactions
For a more advanced usage, incorporate sessions:
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
with open('users.csv', mode='r') as file:
session = Session()
csv_reader = csv.DictReader(file)
for row in csv_reader:
session.execute(insert(users), row)
session.commit()
session.close()
Handling Relationships
Here’s how you can define and use relationships with SQLAlchemy to import related CSV data:
from sqlalchemy.orm import relationship
# Assuming 'users' table is already defined
posts = Table('posts', metadata,
Column('id', Integer, primary_key=True),
Column('user_id', Integer, ForeignKey('users.id')),
Column('content', String),
relationship('User', back_populates='posts'))
users = Table('users', metadata,
Column('id', Integer, primary_key=True),
Column('name', String),
Column('posts', relationship('Post', back_populates='user')))
metadata.create_all(engine)
# Then follow similar steps to insert user posts into 'posts' table.
Using Pandas for Data Handling
Pandas can be a powerful ally when dealing with data frames and complex CSVs. After manipulating your data with Pandas, save it to the database:
import pandas as pd
# Load data into a pandas DataFrame
df = pd.read_csv('users.csv')
df.to_sql('users', con=engine, if_exists='append', index=False)
Conclusion
In this tutorial, we explored various methods to save CSV data in a database using SQLAlchemy. Mastering these techniques can significantly streamline the process of importing and exporting data between CSV files and databases within your Python applications.