Sling Academy
Home/MongoDB/Understanding ObjectId data type in MongoDB (with examples)

Understanding ObjectId data type in MongoDB (with examples)

Last updated: February 01, 2024

Introduction to ObjectId in MongoDB

The ObjectId is a special data type used by MongoDB to serve as the primary key, `_id`, for documents within a collection. It ensures document uniqueness and is automatically generated if not provided. Each ObjectId is 12 bytes, usually represented as a 24-character hex string.

The first four bytes of an ObjectId are a timestamp, reflecting the creation time of the document. This tutorial delves into the ObjectId type, illustrating its purpose, how to work with it in your MongoDB operations, and revealing some advanced concepts through examples.

Basic Usage of ObjectId

Creating a new ObjectId doesn’t require any arguments, although you can pass a timestamp.

const ObjectId = require('mongodb').ObjectId;

// Create a new ObjectId
document._id = new ObjectId();

The newly created ObjectId is not just random; it contains a 4-byte timestamp, a 5-byte random value, and a 3-byte incrementing counter, ensuring a unique identifier.

Querying by ObjectId

To find a document by its primary key, you must use ObjectId:

db.collection('users').findOne({ _id: ObjectId('507f191e810c19729de860ea') });

This operation returns the document with the specified ObjectId or null if none is found.

ObjectId Data Extraction

Since an ObjectId encodes the creation time, it’s possible to extract the timestamp without querying the database:

const timestamp = document._id.getTimestamp();

getTimestamp() will return a JavaScript Date object representing when the document was created.

Constructing ObjectId with Specific Time

Creating an ObjectId with a certain date:

const date = new Date('2023-01-01T00:00:00Z');
const objectIdFromDate = ObjectId(Math.floor(date.getTime() / 1000).toString(16) + '0000000000000000');

This generates an ObjectId with the specified time encoded within it.

Advantages of Using ObjectId

ObjectIds are small, fast to generate, and ordering them chronologically is straightforward because of the embedded timestamp.

ObjectIds and Indexing

As ObjectIds are the default `_id` field, Mongo ensures a unique index on them, which optimizes retrieval times.

Advanced Usage: ObjectId and Aggregation

Aggregating data by dates embedded in ObjectIds includes grouping by year, month, or day:

db.collection('documents').aggregate([
    {
        $group: {
            _id: {
                year: { $year: "$add_date" },
                month: { $month: "$add_date" },
                day: { $dayOfMonth: "$add_date" }
            },
            count: { $sum: 1 }
        }
    }
]);

The `$add_date` here would be computed from ObjectId, translating the embedded timestamp into a MongoDB date format.

Deconstructing ObjectIds in Aggregation:

The following example deconstructs the ObjectId within an aggregation pipeline to extract various elements like timestamp and counter:

db.collection('documents').aggregate({
    $project: {
        timestamp: {
            $toDate: {
                $multiply: ['$hexToDecimal', {
                    $substr: ['$_id', 0, 8]
                }], 1000
            }
        }
        // Other fields extracted here
    }
});

This uses the fact that the first 8 characters represent the timestamp in hexadecimal format.

Limitations

Even though ObjectIds are useful, they are not devoid of limitations. Collision probability in distributed systems must be considered if the ObjectId seeding process isn’t strictly managed; a case could lead to non-unique `_id`s.

Handling ObjectId in Different Programming Languages

In multi-language environments, it’s essential to handle ObjectId correctly. Languages like JavaScript, Python, and Java have libraries to work with MongoDB ObjectIds, allowing cross-platform consistency.

Example in Python

from bson.objectid import ObjectId

new_id = ObjectId()
print(new_id.binary)

Conclusion

The ObjectId type in MongoDB offers a versatile approach to unique document identification. Its time-sequential nature makes it valuable for ordered operations and sharding. By understanding its structure and tailoring it to your needs through advanced manipulations, you can enhance your MongoDB experience.

Next Article: MongoDB Date data type: A practical guide (with examples)

Previous Article: Understanding Materialized Views in MongoDB (through Examples)

Series: MongoDB Tutorials

MongoDB

You May Also Like

  • MongoDB: How to combine data from 2 collections into one
  • Hashed Indexes in MongoDB: A Practical Guide
  • Partitioning and Sharding in MongoDB: A Practical Guide (with Examples)
  • Geospatial Indexes in MongoDB: How to Speed Up Geospatial Queries
  • Understanding Partial Indexes in MongoDB
  • Exploring Sparse Indexes in MongoDB (with Examples)
  • Using Wildcard Indexes in MongoDB: An In-Depth Guide
  • Matching binary values in MongoDB: A practical guide (with examples)
  • Understanding $slice operator in MongoDB (with examples)
  • Caching in MongoDB: A practical guide (with examples)
  • CannotReuseObject Error: Attempted illegal reuse of a Mongo object in the same process space
  • How to perform cascade deletion in MongoDB (with examples)
  • MongoDB: Using $not and $nor operators to negate a query
  • MongoDB: Find SUM/MIN/MAX/AVG of each group in a collection
  • References (Manual Linking) in MongoDB: A Developer’s Guide (with Examples)
  • MongoDB: How to see all fields in a collection (with examples)
  • Type checking in MongoDB: A practical guide (with examples)
  • How to query an array of subdocuments in MongoDB (with examples)
  • MongoDB: How to compare 2 documents (with examples)