Understanding Serialization and Deserialization
Serialization is the process of converting a data structure, such as a struct (short for structure), into a format that can be stored or transmitted and later reconstructed. Deserialization is the inverse process: the stored format is read and rebuilt into the original struct.
Basic Serialization and Deserialization with JSON
Let's start with a basic example using JSON in Python. JSON is a common format for data interchange as it's easy to read and write for both humans and machines.
import json
# Define a simple struct as a dictionary
data_struct = {
    "name": "Alice",
    "age": 30,
    "job": "Engineer"
}
# Serialize struct to JSON string
data_json = json.dumps(data_struct)
print("Serialized Data:", data_json)
# Deserialize JSON string back to struct
reconstructed_data = json.loads(data_json)
print("Deserialized Data:", reconstructed_data)
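A round trip through JSON does not always return an identical structure, because JSON has fewer types than Python. The following is a minimal sketch of that caveat; the `record` dictionary and its fields are hypothetical and chosen only to show the effect.

```python
import json

# Hypothetical record used only for illustration
record = {"name": "Alice", "scores": (90, 85), 1: "first"}

# Serialize and immediately deserialize
round_tripped = json.loads(json.dumps(record))

# Tuples come back as lists, and non-string keys come back as strings
print(round_tripped)
print(round_tripped == record)
```

If the exact types matter, convert them back explicitly after loading rather than relying on the round trip to preserve them.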
Intermediate Level: Custom Formats in Python
Sometimes, you might need to use a custom format for serialization. Below is an example of how to serialize and deserialize a struct to and from a custom colon-separated format.
# Define a more complex struct
class Employee:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job
# Custom serialization
employee = Employee('Bob', 28, 'Designer')
# Serialize (Convert to custom string format)
serialized_data = f"{employee.name}:{employee.age}:{employee.job}"
print("Custom Serialized Data:", serialized_data)
# Custom deserialization
name, age, job = serialized_data.split(":")
deserialized_employee = Employee(name, int(age), job)
print("Custom Deserialized Data:", vars(deserialized_employee))
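The colon-separated format above assumes that no field contains a colon. One possible way to tolerate that, sketched here under the assumption that quoted fields are acceptable in your format, is to let Python's csv module handle the delimiter. The helper names `serialize_employee` and `deserialize_employee` are hypothetical, and `Employee` is redeclared so the block runs on its own.

```python
import csv
import io

class Employee:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job

def serialize_employee(emp):
    # csv.writer quotes any field that contains the delimiter
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=":").writerow([emp.name, emp.age, emp.job])
    # Drop the line terminator that writerow appends
    return buffer.getvalue().rstrip("\r\n")

def deserialize_employee(text):
    # csv.reader accepts any iterable of lines, here a one-element list
    name, age, job = next(csv.reader([text], delimiter=":"))
    return Employee(name, int(age), job)

# A job title containing a colon would break a plain split(":")
tricky = Employee("Bob", 28, "Designer: UX")
encoded = serialize_employee(tricky)
decoded = deserialize_employee(encoded)
print(encoded)
print(vars(decoded))
```

The trade-off is that the output is no longer a bare colon-joined string: fields with special characters are wrapped in quotes, so any other program reading the format needs to understand the same quoting rules.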
Advanced Approaches: Implementing Serialization for Nested Structures
As you become adept at basic and intermediate serialization, handling complex data types such as nested structures becomes essential. Let's explore how you can serialize and deserialize structs containing nested fields.
from typing import List
class Department:
    def __init__(self, dept_name, employees: List[Employee]):
        self.dept_name = dept_name
        self.employees = employees
# Create an example department with nested employees
engineering_dept = Department(
    dept_name='Engineering',
    employees=[
        Employee('Charlie', 24, 'Developer'),
        Employee('Dave', 32, 'DevOps')
    ]
)
# Serialize (Handling nested structure manually)
def serialize_department(department: Department) -> str:
    # Convert each employee to the colon-separated format, joined by commas
    serialized_employees = ",".join(
        f"{emp.name}:{emp.age}:{emp.job}" for emp in department.employees
    )
    return f"Dept:{department.dept_name}|Employees:[{serialized_employees}]"
serialized_department = serialize_department(engineering_dept)
print("Serialized Complex Structure:", serialized_department)
# Deserialize
import re
def deserialize_department(data: str) -> Department:
    # Named groups capture the department name and the raw employee list
    dept_match = re.match(r'Dept:(?P<dept_name>.*)\|Employees:\[(?P<employees>.*)\]', data)
    if dept_match is None:
        raise ValueError(f"Unrecognized department format: {data!r}")
    dept_name = dept_match.group('dept_name')
    employees_data = dept_match.group('employees')
    employees = []
    # An empty department serializes to "[]", so there is nothing to split
    for emp in (employees_data.split(",") if employees_data else []):
        name, age, job = emp.split(":")
        employees.append(Employee(name, int(age), job))  # restore age as an int
    return Department(dept_name, employees)
# Convert serialized string back to object
reconstructed_department = deserialize_department(serialized_department)
print("Deserialized Complex Structure:", {
    'dept_name': reconstructed_department.dept_name,
    'employees': [vars(emp) for emp in reconstructed_department.employees]
})
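The hand-written format above has to define and parse its own delimiters. For comparison, here is a minimal sketch of the same nested structure going through JSON, which handles nesting natively. The helper names `department_to_json` and `department_from_json` are hypothetical, and both classes are redeclared so the block runs on its own.

```python
import json

class Employee:
    def __init__(self, name, age, job):
        self.name = name
        self.age = age
        self.job = job

class Department:
    def __init__(self, dept_name, employees):
        self.dept_name = dept_name
        self.employees = employees

def department_to_json(department):
    # Flatten the objects into plain dicts and lists that json can encode
    payload = {
        "dept_name": department.dept_name,
        "employees": [vars(emp) for emp in department.employees],
    }
    return json.dumps(payload)

def department_from_json(text):
    # Rebuild Employee objects from the decoded dicts
    payload = json.loads(text)
    employees = [Employee(**fields) for fields in payload["employees"]]
    return Department(payload["dept_name"], employees)

dept = Department("Engineering", [
    Employee("Charlie", 24, "Developer"),
    Employee("Dave", 32, "DevOps"),
])
text = department_to_json(dept)
restored = department_from_json(text)
print(text)
print([vars(emp) for emp in restored.employees])
```

Because JSON keeps numbers as numbers, the ages come back as integers without an explicit `int()` conversion, and field values may contain colons or commas without breaking the parse.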
You should now have a foundational understanding of serializing and deserializing structs, both with common formats like JSON and with custom serialization logic for more complex requirements.