This concise example-based article shows you how to validate data with Python dataclasses.
Data validation is an essential part of any data processing system. It ensures that the data received by the system is correct and in the expected format. Python’s dataclasses do not check field values or types on their own, but the __post_init__() hook gives you an easy place to validate data during object initialization. Let’s see how it’s done.
In the following example, we are going to define a dataclass named Person with 2 attributes: name and age. Our goal is to implement validation logic that ensures both fields have the expected types and that age stays within the range of 0 to 150.
Step 1 – Defining Dataclass
Define the class using the @dataclass decorator:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
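Keep in mind that the type hints on a dataclass are only annotations and are not enforced at runtime. The short sketch below illustrates this with the same Person class (the argument values are made up for the demonstration):

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


# Type hints are only annotations; the generated __init__ accepts any values.
person = Person(123, "not a number")
print(person.name, person.age)
```

No error is raised here, which is why explicit validation is worth adding.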
Step 2 – Validating Data during Initialization
You can add custom validation logic by adding a __post_init__() method in the class. This method is called after the object is initialized with the given values. You can raise an exception if the data is not in the expected format:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        if not isinstance(self.name, str):
            raise TypeError('Name should be of type str')

        if not isinstance(self.age, int):
            raise TypeError('Age should be of type int')

        if self.age < 0 or self.age > 150:
            raise ValueError('Age must be between 0 and 150')
Step 3 – Test It
Try to create a person whose age is 160:
person = Person('John', 160)
And you will get this error:
ValueError: Age must be between 0 and 150
Let’s try to initialize another person object with a non-string name:
person = Person(123, 123)
And you will run into this:
TypeError: Name should be of type str
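In application code, you would usually catch these exceptions rather than let them stop the program. Here is a minimal sketch of that idea, assuming the Person class from Step 2; the helper name try_create_person is hypothetical and used only for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        if not isinstance(self.name, str):
            raise TypeError('Name should be of type str')
        if not isinstance(self.age, int):
            raise TypeError('Age should be of type int')
        if self.age < 0 or self.age > 150:
            raise ValueError('Age must be between 0 and 150')


def try_create_person(name, age) -> Optional[Person]:
    """Hypothetical helper: return a Person, or None if validation fails."""
    try:
        return Person(name, age)
    except (TypeError, ValueError) as error:
        # Both kinds of validation failure are reported the same way here.
        print(f"Invalid input: {error}")
        return None


try_create_person('John', 160)  # age out of range
try_create_person(123, 123)     # name is not a string
print(try_create_person('John', 35))
```

One caveat: bool is a subclass of int in Python, so Person('John', True) passes the isinstance check and is accepted with an age equal to 1. Add an explicit bool check if that matters for your data.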
The next example is closer to what you might face when building registration and login systems. We will create a dataclass named User with 2 fields: email and password. Our goal is to make sure that:
- Email must have the correct format (we will use regular expressions for this)
- Password must be between 8 and 12 characters in length
from dataclasses import dataclass
import re

@dataclass
class User:
    email: str
    password: str

    def __post_init__(self):
        # Validate email; fullmatch requires the whole string to fit the pattern
        if not re.fullmatch(r"[^@]+@[^@]+\.[^@]+", self.email):
            raise ValueError("Invalid email address.")

        # Validate password length
        if not 8 <= len(self.password) <= 12:
            raise ValueError("Password length should be between 8 and 12 characters.")
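To get a feel for what the email pattern accepts, it can help to try it on a few strings. The sketch below is an illustration only: the name EMAIL_PATTERN and the third sample address are made up for this demo. It also compares re.match, which anchors only at the start of the string, with re.fullmatch, which requires the whole string to match:

```python
import re

# Same pattern as in the User class.
EMAIL_PATTERN = r"[^@]+@[^@]+\.[^@]+"

samples = [
    "firstname.lastname@example.org",  # well formed
    "test@slingacademy",               # no dot after the @
    "a@b.c@d",                         # stray second @
]

for address in samples:
    prefix_ok = re.match(EMAIL_PATTERN, address) is not None
    whole_ok = re.fullmatch(EMAIL_PATTERN, address) is not None
    print(f"{address!r}: match={prefix_ok}, fullmatch={whole_ok}")
```

The last sample passes re.match because its leading part "a@b.c" fits the pattern, while re.fullmatch rejects it. A simple pattern like this is only a rough format check, not a full email validator.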
Now, let’s create a User object with an invalid email address and see how the validation works:
u = User(email="test@slingacademy", password="password123")
ValueError: Invalid email address.
What about an invalid password?
u = User(email="firstname.lastname@example.org", password="1234")
You will get ValueError:
ValueError: Password length should be between 8 and 12 characters.
Let’s do the right thing:
u = User(email="firstname.lastname@example.org", password="password123")
print(u)
And we pass the validation:
User(email='firstname.lastname@example.org', password='password123')
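One thing to be aware of: __post_init__() runs only when the object is created, so assigning a new value to a field afterwards skips the validation. One possible way to guard against that, sketched here as a variant of the User class rather than the only option, is to make the dataclass immutable with frozen=True:

```python
from dataclasses import dataclass, FrozenInstanceError
import re


@dataclass(frozen=True)
class User:
    email: str
    password: str

    def __post_init__(self):
        # Reading fields is fine on a frozen dataclass; only assignment is blocked.
        if not re.fullmatch(r"[^@]+@[^@]+\.[^@]+", self.email):
            raise ValueError("Invalid email address.")
        if not 8 <= len(self.password) <= 12:
            raise ValueError("Password length should be between 8 and 12 characters.")


u = User(email="firstname.lastname@example.org", password="password123")

try:
    # On a non-frozen dataclass this assignment would succeed without validation.
    u.password = "1234"
except FrozenInstanceError as error:
    print(f"Rejected: {error}")
```

With this variant, changing a user means creating a new User object, which goes through the validation again.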
That’s it. Happy coding and have a nice day!