PyMongo: How to Define and Use Custom Types

Last updated: February 12, 2024

Overview

PyMongo, the official MongoDB driver for Python, makes working with MongoDB straightforward. Sooner or later, though, you may need to store and query values whose Python types MongoDB does not support natively. This tutorial walks through defining such a custom type, serializing it for storage, reading it back, and keeping its attributes queryable.

Getting Started

First, make sure PyMongo is installed in your Python environment. You can install it with pip:

pip install pymongo

With PyMongo installed, the next step is understanding how MongoDB handles custom types. MongoDB stores documents in BSON, which supports a fixed set of data types, including integers, strings, and dates. BSON has no notion of user-defined classes, so a custom Python object must be converted to a supported type before it is stored. This tutorial converts objects to BSON’s Binary type by serializing them, and deserializes them when they are read back.

Defining Custom Types

Let’s define a simple custom type. Imagine a Python application that manages books, including a field for book dimensions which isn’t a supported BSON type.

class BookDimension:
    def __init__(self, width, height, depth):
        self.width = width
        self.height = height
        self.depth = depth

    def __repr__(self):
        return f'BookDimension(width={self.width}, height={self.height}, depth={self.depth})'

PyMongo cannot encode a BookDimension instance as it stands, so the object must be serialized before it is stored in MongoDB. The following sections do this with a pair of helper functions: one that encodes the object and one that decodes it.

Serializing Custom Types

To serialize the BookDimension object into a BSON-friendly format, you can use the `bson.Binary` class along with Python’s `pickle` module. Keep in mind that unpickling can execute arbitrary code, so only use this approach when every writer to the database is trusted.

import pickle
from bson.binary import Binary

def serialize_book_dimension(book_dimension):
    return Binary(pickle.dumps(book_dimension))

Deserializing Custom Types

Equally important is the ability to recover the original Python object from the stored Binary data. This process is known as deserialization.

def deserialize_book_dimension(book_dimension_binary):
    return pickle.loads(book_dimension_binary)

With these functions in hand, you can now insert documents containing custom types into your MongoDB database.
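Pickled bytes can only be loaded where the original class is importable, and they are unsafe to load from untrusted sources. A common alternative is to map the object to a plain dictionary of BSON-native values. The sketch below assumes two hypothetical helpers, `dimension_to_document` and `dimension_from_document` (neither is part of PyMongo), and needs no database connection to try out:

```python
class BookDimension:
    def __init__(self, width, height, depth):
        self.width = width
        self.height = height
        self.depth = depth

    def __repr__(self):
        return f'BookDimension(width={self.width}, height={self.height}, depth={self.depth})'


def dimension_to_document(dimension):
    # Hypothetical helper: expose the attributes as a dict of plain numbers,
    # which BSON can store without any extra serialization step.
    return {
        'width': dimension.width,
        'height': dimension.height,
        'depth': dimension.depth,
    }


def dimension_from_document(document):
    # Hypothetical helper: rebuild the object from the stored sub-document.
    return BookDimension(document['width'], document['height'], document['depth'])


encoded = dimension_to_document(BookDimension(7, 10, 1.5))
restored = dimension_from_document(encoded)

print(encoded)   # {'width': 7, 'height': 10, 'depth': 1.5}
print(restored)  # BookDimension(width=7, height=10, depth=1.5)
```

A dictionary stored this way becomes an embedded document, so its fields remain visible to MongoDB queries, at the cost of writing the mapping by hand.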

Inserting Documents with Custom Types

To demonstrate, let’s consider a MongoDB collection named ‘books’ and insert a document including our custom type.

from pymongo import MongoClient

# Example URI for a local server; replace it with your own connection string.
client = MongoClient('mongodb://localhost:27017')
books_collection = client.mydatabase.books

document = {
    'title': 'Python Programming',
    'dimensions': serialize_book_dimension(BookDimension(7, 10, 1.5))
}

books_collection.insert_one(document)

The stored document now holds the ‘dimensions’ field as Binary data, which can be deserialized back into a BookDimension object upon retrieval.

Retrieving and Using Custom Types

To retrieve documents containing custom types, you’ll essentially reverse the serialization process.

document = books_collection.find_one({'title': 'Python Programming'})
deserialized_dimensions = deserialize_book_dimension(document['dimensions'])

print(deserialized_dimensions)

This will output something similar to:

BookDimension(width=7, height=10, depth=1.5)

Advanced Use Case: Custom Type with Query Support

For more advanced scenarios, such as querying on attributes of a custom type, serialization alone is not enough: the pickled Binary value is opaque to the server, so neither ordinary queries nor aggregation stages can see inside it. A common strategy is to store the serialized object alongside plain, queryable fields.

For instance, storing dimensions separately for querying could look like this:

document = {
    'title': 'Python Programming',
    'dimensions': serialize_book_dimension(BookDimension(7, 10, 1.5)),
    'width': 7,
    'height': 10,
    'depth': 1.5
}

books_collection.insert_one(document)

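With this structure, it’s possible to query the collection based on height, width, or depth directly. The trade-off is duplication: the plain fields and the serialized object must be updated together, or they will drift apart.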
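A query against those plain fields is an ordinary MongoDB filter document. The sketch below assumes a hypothetical helper, `dimension_range_filter`, that builds such a filter with the standard `$gte` and `$lte` operators. It only constructs dictionaries, so it runs without a server.

```python
QUERYABLE_FIELDS = ('width', 'height', 'depth')


def dimension_range_filter(field, minimum=None, maximum=None):
    # Hypothetical helper: build a filter such as {'height': {'$lte': 10}}.
    if field not in QUERYABLE_FIELDS:
        raise ValueError(f'unknown dimension field: {field!r}')
    bounds = {}
    if minimum is not None:
        bounds['$gte'] = minimum
    if maximum is not None:
        bounds['$lte'] = maximum
    if not bounds:
        raise ValueError('provide at least one of minimum or maximum')
    return {field: bounds}


# Books at most 10 units tall and at least 6 units wide.
query = {
    **dimension_range_filter('height', maximum=10),
    **dimension_range_filter('width', minimum=6),
}
print(query)  # {'height': {'$lte': 10}, 'width': {'$gte': 6}}

# With a live connection, the filter is passed straight to the driver:
# for book in books_collection.find(query):
#     print(book['title'])
```

If such queries run often, an index on the queried fields is worth considering so the server does not have to scan the whole collection.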

Conclusion

Storing custom types in a MongoDB-backed application with PyMongo comes down to serialization and deserialization: convert the object to a BSON-supported value on the way in, and rebuild it on the way out. Because serialized data is opaque to the server, keep any attributes you need to query as ordinary fields alongside it. Applied carefully, these techniques let you persist complex Python objects without giving up MongoDB’s query capabilities.
