Sling Academy
The maximum size of a document in MongoDB

Last updated: February 02, 2024

Introduction

MongoDB, a leading NoSQL database, is widely known for its flexibility in handling large datasets and varied document structures. It stores data in BSON (Binary JSON), a format that allows efficient storage and retrieval of complex documents. However, like any database, MongoDB has limits, and one of them is the maximum size of a single document. In this article, we will look at how large a MongoDB document can be and at best practices for managing large documents, supported by relevant code examples.

Understanding MongoDB’s Document Size Limit

MongoDB imposes a size limit of 16 mebibytes (16,777,216 bytes, usually written as 16MB) on a single BSON document. This limit keeps any one document from using an excessive amount of RAM or, during transmission, bandwidth, which helps performance stay consistent as your dataset grows. Let’s start with basic operations in MongoDB, then move on to handling larger documents. We assume that you have MongoDB installed and a database called ‘testdb’.
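To make the number concrete, the limit of 16 mebibytes works out to 16 × 1024 × 1024 = 16,777,216 bytes. The plain JavaScript sketch below turns that into a small headroom check. The names `MAX_BSON_BYTES` and `headroom` are illustrative assumptions, not part of any MongoDB API.

```javascript
// 16 MiB expressed in bytes (the maximum BSON document size).
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: report how much room a document of `sizeInBytes` has left.
function headroom(sizeInBytes) {
  return {
    remainingBytes: MAX_BSON_BYTES - sizeInBytes,
    percentUsed: (sizeInBytes / MAX_BSON_BYTES) * 100,
  };
}

console.log(MAX_BSON_BYTES);            // 16777216
console.log(headroom(4 * 1024 * 1024)); // { remainingBytes: 12582912, percentUsed: 25 }
```

The size you pass in would typically come from a call such as Object.bsonsize() in the shell.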

To begin with, let’s insert a simple document into a collection called ‘simpledocs’. We’ll use the MongoDB shell:

use testdb
db.simpledocs.insertOne({name: 'Tutorial', description: 'Learn about MongoDB document sizes.'})

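Let’s check the average document size for the collection. Because the collection currently holds a single document, the average equals the size of that document.

db.simpledocs.stats().avgObjSize

This returns the average size, in bytes, of the documents in the collection.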
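To see where those bytes come from, here is a rough estimator that follows the BSON layout: a 4-byte length prefix, one element per field (a type byte, the field name as a null-terminated string, then the value), and a trailing null byte. It is a sketch that only handles strings, numbers, booleans, null, arrays, and nested plain objects, and the function names are hypothetical. It counts every number as an 8-byte double, whereas some drivers store small integers as 4-byte int32 values, so treat the result as an approximation.

```javascript
// Rough BSON size estimate for plain objects (sketch, limited type support).
function estimateBsonSize(doc) {
  let size = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (const [key, value] of Object.entries(doc)) {
    size += 1 + Buffer.byteLength(key, 'utf8') + 1; // type byte + key as C string
    size += valueSize(value);
  }
  return size;
}

function valueSize(value) {
  if (value === null) return 0;                  // null carries no payload
  if (typeof value === 'string') {
    return 4 + Buffer.byteLength(value, 'utf8') + 1; // length + bytes + 0x00
  }
  if (typeof value === 'number') return 8;       // assumed to be a double
  if (typeof value === 'boolean') return 1;
  if (typeof value === 'object') {
    // Arrays are encoded like documents whose keys are "0", "1", ...
    return estimateBsonSize(value);
  }
  throw new TypeError('Unsupported type in this sketch: ' + typeof value);
}

console.log(estimateBsonSize({
  name: 'Tutorial',
  description: 'Learn about MongoDB document sizes.',
})); // 77
```

The stored document also carries an automatically generated _id field. Assuming it is an ObjectId, that adds 17 bytes (one type byte, the 4-byte key including its terminator, and a 12-byte value), so the size reported by the server will be somewhat larger than the estimate for the two fields alone.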

Approaching the Limit

As your application evolves, your documents might grow in size. It’s essential to keep track of their sizes so that you don’t hit the limit unexpectedly. Here’s how to find the largest document in a collection (the $bsonSize operator requires MongoDB 4.4 or later):

db.simpledocs.aggregate([
  { $project: { size: { $bsonSize: '$$ROOT' } } },
  { $sort: { size: -1 } },
  { $limit: 1 }
]).forEach(function(doc) { printjson(doc); });

This pipeline computes the BSON size of every document, sorts by that size in descending order, and prints the _id and size of the largest one. Sorting by $natural would not work here: natural order is roughly insertion order and says nothing about size. Because the pipeline reads every document, run it sparingly on large collections.

Working with Large Documents

If your use case requires large documents, keep an eye on how they grow. Operations that append to an array are a common cause: each $push below makes the document a little larger, and the array keeps growing until an update would push the document past the limit, at which point the server rejects it:

db.largeDocs.updateOne(
  {_id: 'largeDocId'},
  {$push: { 'largeArrayField': 'A lot of content that could contribute to document size growth' }}
);

It’s also important to frequently check the size of such documents:

var docSize = Object.bsonsize(db.largeDocs.findOne({_id: 'largeDocId'}))
print("Document size in bytes: " + docSize);

If you find that a document’s size is approaching 16MB, consider restructuring the data. One option is a pattern like Bucketing, which splits the data across multiple smaller documents. Another, for file-like content, is GridFS, a MongoDB specification for storing and retrieving files that exceed the BSON document size limit.
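As a minimal sketch of the bucketing idea, the helper below splits one oversized array into several bucket documents, each holding at most a fixed number of items. It is plain JavaScript with no database calls, and the helper name `toBuckets` and the field names `bucketIndex`, `count`, and `items` are assumptions chosen for illustration.

```javascript
// Hypothetical helper: split one large array into bucket documents
// that each hold at most `bucketSize` items.
function toBuckets(parentId, items, bucketSize) {
  const buckets = [];
  for (let i = 0; i < items.length; i += bucketSize) {
    const slice = items.slice(i, i + bucketSize);
    buckets.push({
      parentId,                    // links the bucket back to its parent document
      bucketIndex: buckets.length, // preserves the original ordering
      count: slice.length,
      items: slice,
    });
  }
  return buckets;
}

const buckets = toBuckets('largeDocId', ['a', 'b', 'c', 'd', 'e'], 2);
console.log(buckets.map(b => b.items)); // [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```

The resulting array could be passed to insertMany(). Choosing the bucket size is a trade-off: larger buckets mean fewer documents to read, while smaller buckets keep each document further from the limit.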

GridFS and Large Files

For files larger than 16MB, MongoDB provides GridFS, which splits a file into chunks and stores each chunk as a separate document. The following snippet demonstrates how to store a large file using GridFS in a Node.js application.

const { MongoClient, GridFSBucket } = require('mongodb');
const fs = require('fs');
const { pipeline } = require('stream/promises');

async function main() {
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();
    const db = client.db('testdb');
    const bucket = new GridFSBucket(db, { bucketName: 'largeFiles' });

    // Stream the file from disk into GridFS under the name 'largeFile'.
    // pipeline() rejects if either the read or the upload stream fails.
    await pipeline(
      fs.createReadStream('/path/to/large/file'),
      bucket.openUploadStream('largeFile')
    );
    console.log('File uploaded successfully.');
  } finally {
    // Close the connection whether or not the upload succeeded.
    await client.close();
  }
}

main().catch(console.error);

This connects to the ‘testdb’ database, creates a GridFS bucket, and uses a Node.js stream to read a file from the system and pipe it into MongoDB under the provided file name.
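As a rough illustration of what the upload produces: GridFS stores one metadata document in the largeFiles.files collection and the file content as chunk documents in largeFiles.chunks. Assuming the default chunk size of 255 KiB (adjustable through the chunkSizeBytes bucket option), the number of chunks can be estimated as below. The helper name `chunkCount` is hypothetical.

```javascript
// Default GridFS chunk size: 255 KiB.
const DEFAULT_CHUNK_SIZE_BYTES = 255 * 1024; // 261120

// Hypothetical helper: how many chunk documents a file of a given size needs.
function chunkCount(fileSizeBytes, chunkSizeBytes = DEFAULT_CHUNK_SIZE_BYTES) {
  return Math.ceil(fileSizeBytes / chunkSizeBytes);
}

console.log(chunkCount(20 * 1024 * 1024)); // 81 chunks for a 20 MiB file
console.log(chunkCount(261120));           // 1 (exactly one full chunk)
```

Each chunk stays far below the document size limit, which is why GridFS can hold files of practically any size.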

Advanced Document Patterns

For complex applications, other schema design patterns may be suitable. With the Outlier Pattern, the few documents that would grow unusually large move their overflow data into separate documents. With the Extended Reference Pattern, a document stores a reference to another document along with a copy of the fields it reads most often, instead of embedding the whole thing. These patterns can make your data model more efficient while keeping documents within the size limit. When you use multiple documents to represent what would otherwise be one larger document, you employ references to link them:

db.extendedRef.insertMany([
  { _id: 'parentDoc', otherFields: '...'},
  { parentId: 'parentDoc', largeField: 'Content that necessitated extending into another document.' }
]);

You can then use the ‘parentId’ field to join the data when necessary, either with a second query or with a $lookup aggregation stage.
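The join itself can be sketched in plain JavaScript. The example below uses in-memory stand-ins for the two documents inserted above and a hypothetical helper, `joinByParentId`, to show the shape of the combined result. Against a real database, the same lookup would be done by the server or by two separate queries.

```javascript
// In-memory stand-ins for the two documents inserted above.
const docs = [
  { _id: 'parentDoc', otherFields: '...' },
  { parentId: 'parentDoc', largeField: 'Content that necessitated extending into another document.' },
];

// Hypothetical helper: attach every document that references `parentId` to its parent.
function joinByParentId(allDocs, parentId) {
  const parent = allDocs.find(d => d._id === parentId);
  if (!parent) return null; // no such parent
  const children = allDocs.filter(d => d.parentId === parentId);
  return { ...parent, children };
}

const joined = joinByParentId(docs, 'parentDoc');
console.log(joined.children.length);        // 1
console.log(joined.children[0].largeField); // the content held in the child document
```

Keep in mind that the joined result exists only in application memory here, so it is not subject to the per-document storage limit.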

Monitoring Document Growth

Continuous monitoring is crucial when working near MongoDB’s document size limit. A scheduled script that records the sizes of your largest documents, or a monitoring tool that does the same, can warn you before a write fails. The earlier you detect an approaching limit, the easier it is to refactor the structure or apply one of the patterns above.
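A monitoring script might boil down to something like the following sketch. It takes a list of { _id, size } records, such as those printed by the size checks earlier in this article, and returns the documents that have used at least a given share of the limit, largest first. The helper name `findAtRisk` and the 80% default threshold are assumptions, not recommendations from MongoDB.

```javascript
const LIMIT_BYTES = 16 * 1024 * 1024; // 16 MiB

// Hypothetical helper: list documents at or above `warnRatio` of the limit, largest first.
function findAtRisk(sizes, warnRatio = 0.8) {
  const threshold = LIMIT_BYTES * warnRatio;
  return sizes
    .filter(({ size }) => size >= threshold)
    .sort((a, b) => b.size - a.size)
    .map(({ _id, size }) => ({
      _id,
      size,
      percentUsed: Math.round((size / LIMIT_BYTES) * 100),
    }));
}

const report = findAtRisk([
  { _id: 'a', size: 1024 },
  { _id: 'b', size: 14 * 1024 * 1024 },
  { _id: 'c', size: 15 * 1024 * 1024 },
]);
console.log(report.map(r => r._id)); // [ 'c', 'b' ]
```

Lowering the threshold gives more warning time at the cost of more noise in the report.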

Best Practices Summarized

  • Regularly check the size of your documents and approach data architecture with the 16MB limit in mind.
  • Consider schema design patterns that break up large documents when possible.
  • Use GridFS for files that naturally exceed 16MB.
  • Implement monitoring to proactively handle document growth.

Conclusion

The 16MB document size limit in MongoDB is a hard limit, but it also encourages good data modeling and promotes consistent database performance. By understanding this limit and implementing strategies to deal with large documents, you can ensure that your MongoDB deployments remain scalable and efficient.
