Pydantic: Parsing and Validating JSON Data

Updated: December 14, 2023 By: Wolf

First off, let’s understand what Pydantic is. Pydantic is a data validation and settings management library built on Python type annotations. It lets you define model classes whose annotated fields describe how data should be validated, transformed, and serialized/deserialized.

JSON, or JavaScript Object Notation, is a lightweight data-interchange format that is easy for humans to read and write. It’s used widely in web applications and APIs. This article walks through parsing and validating JSON data with Pydantic and points out a common mistake you might run into along the way.

Built-in JSON Parsing in Pydantic

Pydantic comes with built-in JSON parsing. It offers significant performance improvements without the cost of a third-party JSON library, and it supports custom errors and strict specifications. Let’s look at an example of Pydantic’s built-in JSON parsing.

from datetime import date
from typing import Tuple
from pydantic import BaseModel, ConfigDict, ValidationError


class User(BaseModel):
    model_config = ConfigDict(strict=True)
    name: str
    birth: date
    location: Tuple[float, float]


json_data = (
    '{"name": "Mr. Wolf", "birth": "1955-01-28", "location": [33.3333, 44.4444]}'
)

print(User.model_validate_json(json_data))

Output:

name='Mr. Wolf' birth=datetime.date(1955, 1, 28) location=(33.3333, 44.4444)

In the above code, we have a BaseModel subclass named User with three fields: name, birth, and location. name is of type str, birth is of type date, and location is a tuple of two floats.

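The model_validate_json method parses the JSON data and returns the validated data as a Pydantic model object. Note that the birth string and the location array are accepted even though strict mode is enabled: JSON has no date or tuple type, so Pydantic’s JSON mode allows a date string and an array for these fields.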
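As a small follow-up sketch, the returned object exposes each field as a regular Python value. The round trip through model_dump_json is added here for illustration and is not part of the original example.

```python
from datetime import date
from typing import Tuple

from pydantic import BaseModel, ConfigDict


class User(BaseModel):
    model_config = ConfigDict(strict=True)
    name: str
    birth: date
    location: Tuple[float, float]


user = User.model_validate_json(
    '{"name": "Mr. Wolf", "birth": "1955-01-28", "location": [33.3333, 44.4444]}'
)

# Fields are typed Python objects, not raw JSON strings
print(type(user.birth).__name__)
print(user.birth.year)
print(user.location[0])

# Serialize back to JSON and parse again; the two models compare equal
restored = User.model_validate_json(user.model_dump_json())
print(restored == user)
```

Because birth is a real date object, you can use attributes such as year directly without any manual parsing.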

Validating JSON and Error Handling

One of the significant advantages of using Pydantic is its robust error handling. If the input data does not conform to the model’s specifications, Pydantic will raise a ValidationError.

Let’s extend the previous example and see how Pydantic handles invalid input data.

Example:

# Extend the previous example
json_data_2 = '{"name": "Mr. Wolf", "birth": "1955-01-28"}'
try:
    User.model_validate(json_data_2)
except ValidationError as e:
    print(e)

Output:

1 validation error for User
  Input should be a valid dictionary or instance of User [type=model_type, input_value='{"name": "Mr. Wolf", "birth": "1955-01-28"}', input_type=str]

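In the above code, we pass a JSON string (this time without the location field) to the model_validate method instead of model_validate_json. model_validate expects a dictionary or a User instance, so Pydantic raises a ValidationError of type model_type before it looks at any individual field. This happens whether or not strict mode is enabled. Passing the same string to model_validate_json would parse it and report the missing location field instead.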
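To compare the two entry points, here is a minimal sketch that sends the same incomplete JSON string through model_validate_json, and then a complete Python dictionary through model_validate. The python_data dictionary is a hypothetical input added for illustration. The errors() method returns the error details as a list of dictionaries, which is handy when you need them programmatically.

```python
from datetime import date
from typing import Tuple

from pydantic import BaseModel, ConfigDict, ValidationError


class User(BaseModel):
    model_config = ConfigDict(strict=True)
    name: str
    birth: date
    location: Tuple[float, float]


json_data_2 = '{"name": "Mr. Wolf", "birth": "1955-01-28"}'

# 1) JSON string through the JSON entry point:
#    the string is parsed and the missing field is reported
try:
    User.model_validate_json(json_data_2)
except ValidationError as e:
    json_errors = e.errors()
    for err in json_errors:
        print(err["loc"], err["type"])

# 2) Python dict through model_validate (hypothetical input):
#    in strict mode the date string and the list are not coerced
python_data = {
    "name": "Mr. Wolf",
    "birth": "1955-01-28",
    "location": [33.3333, 44.4444],
}
try:
    User.model_validate(python_data)
except ValidationError as e:
    python_errors = e.errors()
    for err in python_errors:
        print(err["loc"], err["type"])
```

The second case shows where strict mode actually matters: for Python input, birth must already be a date and location must already be a tuple.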

Pydantic’s JSON Parser: Jiter

Starting from version 2.5.0, Pydantic uses jiter, a fast and iterable JSON parser, as its default JSON parser. Compared to serde, the parser it replaced, jiter brings modest performance improvements that are expected to grow in the future.

Jiter is almost entirely compatible with serde, with one noticeable enhancement being that jiter supports the deserialization of inf and NaN values.
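A quick way to see this is sketched below, assuming Pydantic 2.5.0 or later. The Reading model and its field names are hypothetical, and the NaN and Infinity tokens are not part of standard JSON, so other parsers may reject this input.

```python
import math

from pydantic import BaseModel


class Reading(BaseModel):  # hypothetical model for illustration
    value: float
    limit: float


# NaN and Infinity are non-standard JSON tokens that jiter accepts
reading = Reading.model_validate_json('{"value": NaN, "limit": Infinity}')

print(math.isnan(reading.value))
print(math.isinf(reading.limit))
```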

Conclusion

In this article, we explored how Pydantic parses and validates JSON data. We looked at its built-in JSON parsing, its error handling, and the jiter parser that powers it. From this point, you’re pretty good to go. Happy coding & have a nice day!