PyMongo Upsert Examples: Update if exists, else insert

Updated: February 8, 2024 By: Guest Contributor

Overview

In this tutorial, we look at MongoDB operations using PyMongo, the Python distribution containing tools for working with MongoDB. One powerful feature offered by MongoDB is the upsert operation. In essence, an upsert updates a document if one matching the filter exists in the collection; otherwise, it inserts a new document. This lets you express "create or update" logic as a single call instead of a separate lookup followed by an insert or an update, which helps avoid accidental duplicates. We will explore how to execute upsert operations in PyMongo with a variety of examples, from basic to advanced.

Setting up PyMongo

Before we dive into upsert examples, ensure you have PyMongo installed. If not, you can install it using pip:

pip install pymongo

Also make sure you have a running MongoDB instance. You can either set up MongoDB locally or use a cloud service like MongoDB Atlas.

Basic Upsert Example

Let’s begin with the most straightforward example. Suppose you have a collection named users and you want to update a user’s email if the user exists, or insert a new document if the user doesn’t exist:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client['testdb']
users = db['users']

result = users.update_one({'username': 'johndoe'},
                          {'$set': {'email': 'johndoe@example.com'}},  # placeholder address
                          upsert=True)
print(f"Matched: {result.matched_count}, Modified: {result.modified_count}, Upserted ID: {result.upserted_id}")

This code connects to the MongoDB database and the users collection, attempts to update a document where the username is ‘johndoe’, and if no document is found, inserts a new document with the username and email address provided. The upsert parameter is set to True, enabling this behavior.
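The three values printed above tell you which branch the upsert took. As a minimal sketch, the helper below turns them into a readable message. The name describe_upsert is hypothetical; the attributes it reads (matched_count, modified_count, upserted_id) are the ones already used in the example above.

```python
def describe_upsert(result):
    """Summarise the result of update_one(..., upsert=True).

    `result` is assumed to be the object returned by update_one.
    """
    # upserted_id is only set when no document matched and one was inserted.
    if result.upserted_id is not None:
        return f"inserted new document with _id {result.upserted_id}"
    # A match whose fields actually changed.
    if result.modified_count:
        return "updated existing document"
    # A match that already held the requested values.
    if result.matched_count:
        return "matched a document that already had these values"
    return "nothing matched and nothing was inserted"
```

You could call it as print(describe_upsert(result)) right after the update_one call. Running the basic example twice should report an insert the first time and an unchanged match the second time.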

Handling Complex Documents

Moving to a more advanced example, suppose our documents contain nested structures or arrays. For example, updating a user’s preferences:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client['testdb']
collection = db['complex_users']

result = collection.update_one(
    {'username': 'janedoe'},
    {'$set': {'preferences.notifications.email': True}},
    upsert=True)
print(f"Upserted ID: {result.upserted_id}")

This operation uses dot notation to set a nested field (preferences.notifications.email) within a document. If a document with ‘janedoe’ as username does not exist, MongoDB inserts one containing the username plus a nested preferences.notifications object with email set to True, creating the intermediate objects as needed.
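To picture the document such an upsert creates, here is a pure-Python sketch that mimics how an equality filter and a dotted $set path combine on insert. It does not call PyMongo; preview_upserted_document is a hypothetical helper for illustration only, and it assumes the filter contains plain equality conditions.

```python
def preview_upserted_document(filter_doc, set_fields):
    """Approximate the document an upsert inserts when nothing matches.

    Illustration only: assumes `filter_doc` holds plain equality fields.
    The _id that the server generates is not included.
    """
    doc = dict(filter_doc)  # equality fields from the filter are copied over
    for path, value in set_fields.items():
        node = doc
        *parents, leaf = path.split(".")
        for key in parents:
            # Create each intermediate object if it is missing.
            node = node.setdefault(key, {})
        node[leaf] = value
    return doc


print(preview_upserted_document(
    {"username": "janedoe"},
    {"preferences.notifications.email": True},
))
# {'username': 'janedoe', 'preferences': {'notifications': {'email': True}}}
```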

Using update_many with Upsert

There’s more to updating documents in MongoDB than just using update_one. With the right conditions, you could also employ update_many for affecting multiple documents. Though less common in an upsert context, it holds potential for bulk operations. For demonstration, suppose we want to tag all users from a certain city:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client['testdb']
collection = db['users_tag']

result = collection.update_many(
    {'city': 'New York'},
    {'$set': {'tag': 'NY Resident'}},
    upsert=True)
print(f"Matched: {result.matched_count}, Modified: {result.modified_count}")

Remember, upsert with update_many is rare since the operation is designed to update multiple existing documents. An upsert here would only insert a single document if no matches were found, which might not be the expected outcome in a bulk operation scenario.
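Because an update_many upsert can quietly insert one placeholder document, it can help to make the choice explicit and to surface upserted_id in the return value. The sketch below wraps the call from the example above; tag_residents is a hypothetical name, and collection is assumed to be a PyMongo collection such as db['users_tag'].

```python
def tag_residents(collection, city, tag, upsert=False):
    """Tag every document for `city`; optionally upsert when none match.

    `collection` is assumed to be a PyMongo collection object.
    """
    result = collection.update_many(
        {"city": city},
        {"$set": {"tag": tag}},
        upsert=upsert,
    )
    return {
        "matched": result.matched_count,
        "modified": result.modified_count,
        # Not None only when upsert=True and no document matched,
        # in which case exactly one new document was inserted.
        "inserted_id": result.upserted_id,
    }
```

Defaulting upsert to False here is a design choice for the sketch: callers have to opt in before a bulk update is allowed to create a document.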

Upserting with Complex Queries

You can also perform upserts with filters that combine several conditions, including query operators such as $ne:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client['testdb']
collection = db['advanced_users']

result = collection.update_one(
    {'username': 'johndoe', 'status': {'$ne': 'inactive'}},
    {'$set': {'last_login': '2023-04-01'}},
    upsert=True)
print(f"Upserted ID: {result.upserted_id}")

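In this example, the filter requires the username to be ‘johndoe’ and the status to be anything other than ‘inactive’. Be careful with this pattern: if johndoe exists but is inactive, the filter matches nothing, so the upsert inserts a new document with username ‘johndoe’ and the last_login value rather than skipping the operation. The $ne condition is not copied into the new document, so it has no status field. A unique index on username would turn that insert into a DuplicateKeyError instead of a silent duplicate. This shows both the flexibility of combining MongoDB’s query operators with upserts and the need to think through what happens when the filter does not match.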
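One thing to watch with a conditional filter like this: when the user exists but fails the condition, nothing matches, and an upsert then inserts a second document for the same username. If the intent is only to update active users and never to create a document at this point, a minimal sketch is to leave upsert off and check matched_count. The function name record_login is hypothetical, and collection is assumed to be a PyMongo collection.

```python
def record_login(collection, username, when):
    """Set last_login for an active user; never create a document.

    Returns True if a matching active user was found, False otherwise.
    `collection` is assumed to be a PyMongo collection object.
    """
    result = collection.update_one(
        {"username": username, "status": {"$ne": "inactive"}},
        {"$set": {"last_login": when}},
        # upsert is left at its default (False), so a miss changes nothing
    )
    return result.matched_count == 1
```

A False return covers both "no such user" and "user is inactive"; the caller can then decide whether to create the user explicitly.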

Conclusion

PyMongo offers a versatile way to interact with MongoDB, with upsert being one of its most useful operations. Passing upsert=True to update_one or update_many lets a single call either update the matching documents or insert a new one, and the returned matched_count, modified_count, and upserted_id tell you which of those happened. Whether you’re managing simple collections or dealing with nested documents and conditional filters, upserts are worth having in your toolkit, as long as you keep in mind what gets inserted when the filter matches nothing.