
Using $group aggregation stage in MongoDB (with examples)

Last updated: February 03, 2024

Introduction

MongoDB is a document-oriented NoSQL database that offers flexibility and scalability for managing large amounts of semi-structured data. One of its most powerful features is the aggregation framework, which lets you process and analyze data by defining a pipeline of stages. At the heart of this framework lies the $group stage, which groups input documents by a specified key and applies accumulators to compute values for each group.

This tutorial will dive deep into the $group aggregation stage in MongoDB. We will begin with basic concepts, gradually moving to more complex operations, complete with code examples and their respective outputs.

Understanding the $group Stage

The $group stage is akin to the GROUP BY clause in SQL, used for grouping documents according to one or more fields. The grouped documents can then be processed using accumulator operators like $sum, $avg, $max, and more. The syntax of the $group stage is:

{
    $group: {
        _id: <expression>, // Group key: a field path or any other expression
        <field1>: { <accumulator1> : <expression1> },
        ...
        <fieldN>: { <accumulatorN> : <expressionN> }
    }
}

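Here, _id is mandatory and specifies the group key. It can be a field path such as '$category', a computed expression, or a constant such as null, which places every input document in a single group. Each field after _id applies an accumulator to the set of documents that share the same group key.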

Basic Usage of $group

Let’s start with a basic example of using the $group stage to group documents by a single field and count the number of documents in each group.

// Group by category and count products in each category
db.products.aggregate([
    {
        $group : {
            _id : '$category', // Group by 'category' field
            count: { $sum: 1 } // Count documents in each category
        }
    }
]);

In the example above, the documents in the products collection are grouped by the category field. The accumulator $sum is used to add 1 for each document encountered, effectively counting the number of documents in each category.
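To make the shape of the result concrete, here is a minimal in-memory sketch in plain JavaScript of what this count computes. It is not how MongoDB implements $group; the sample documents and the countByCategory helper are hypothetical and exist only for illustration.

```javascript
// Hypothetical sample documents; field names follow the tutorial's examples.
const products = [
  { name: 'Pen', category: 'Stationery' },
  { name: 'Notebook', category: 'Stationery' },
  { name: 'Mug', category: 'Kitchen' },
];

// In-memory stand-in for { $group: { _id: '$category', count: { $sum: 1 } } }.
function countByCategory(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // Add 1 for every document seen with this category.
    counts.set(doc.category, (counts.get(doc.category) ?? 0) + 1);
  }
  // One output document per distinct key, with the key stored under _id.
  return [...counts].map(([category, count]) => ({ _id: category, count }));
}

const categoryCounts = countByCategory(products);
console.log(categoryCounts);
// [ { _id: 'Stationery', count: 2 }, { _id: 'Kitchen', count: 1 } ]
```

The sketch returns groups in the order their keys were first seen. MongoDB's $group does not guarantee any output order, so add a $sort stage after it if the order matters.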

Grouping by Multiple Fields

You can also group documents by multiple fields by using compound keys in the _id field as shown in the example below:

// Group by category and tags
db.products.aggregate([
    {
        $group : {
            _id : { category: '$category', tags: '$tags' },
            count: { $sum: 1 }
        }
    }
]);

In the updated query, documents are now grouped by both category and tags, resulting in a count of documents for every unique combination of category and tags.
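One detail worth seeing in action: if tags is an array, the whole array is part of the key, so two documents with the same tags in a different order fall into different groups. The plain JavaScript sketch below imitates that behavior with hypothetical data; JSON.stringify is only a stand-in for MongoDB's value comparison, and countByCategoryAndTags is an illustrative helper, not a MongoDB API.

```javascript
// Hypothetical sample documents; 'tags' is assumed to be an array here.
const taggedProducts = [
  { category: 'Kitchen', tags: ['sale', 'new'] },
  { category: 'Kitchen', tags: ['sale', 'new'] },
  { category: 'Kitchen', tags: ['new', 'sale'] }, // same tags, different order
  { category: 'Garden', tags: ['sale', 'new'] },
];

// In-memory stand-in for grouping on { category: '$category', tags: '$tags' }.
function countByCategoryAndTags(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const id = { category: doc.category, tags: doc.tags };
    // Serialize the compound key so equal combinations map to the same entry.
    const key = JSON.stringify(id);
    const group = groups.get(key) ?? { _id: id, count: 0 };
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()];
}

const compoundCounts = countByCategoryAndTags(taggedProducts);
console.log(compoundCounts.map((g) => g.count)); // [ 2, 1, 1 ]
```

If the goal is a count per individual tag rather than per exact tag list, the usual approach is to place an $unwind stage on tags before $group.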

Using Accumulators in $group

Accumulators are operations that process data (e.g., sums, averages) for documents within a group. Here’s an example where we calculate the average price of products in each category:

// Calculate average price for each category
db.products.aggregate([
    {
        $group: {
            _id: '$category',
            averagePrice: { $avg: '$price'}
        }
    }
]);

The $avg accumulator calculates the average of all the price fields within each group identified by category.
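$avg ignores non-numeric values, and it returns null when a group contains no numeric values at all. The following plain JavaScript sketch mirrors that rule for a single group; the sample values and the averageNumeric helper are hypothetical.

```javascript
// Hypothetical price values for one category; two entries are not numbers.
const prices = [10, 20, 'n/a', undefined, 30];

// Stand-in for $avg: average the numeric values, null when there are none.
function averageNumeric(values) {
  const numbers = values.filter((v) => typeof v === 'number');
  if (numbers.length === 0) return null;
  return numbers.reduce((sum, v) => sum + v, 0) / numbers.length;
}

console.log(averageNumeric(prices));  // 20
console.log(averageNumeric(['n/a'])); // null
```

In practice this means a document with a missing or string-typed price does not drag the average toward zero; it is simply left out of the calculation.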

Combining Multiple Accumulators

You can also combine multiple accumulators within the same $group stage to produce multiple calculations. For example, you can calculate the average, maximum, and minimum price for each category:

// Combine multiple accumulators
db.products.aggregate([
    {
        $group: {
            _id: '$category',
            averagePrice: { $avg: '$price'},
            maxPrice: { $max: '$price'},
            minPrice: { $min: '$price'}
        }
    }
]);

This outputs one document per category. Each document holds the category value in its _id field, along with the average, maximum, and minimum prices calculated for that category.
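As a rough illustration, several accumulators can be thought of as running values that are updated together while each group's documents are visited. The plain JavaScript sketch below assumes every price is a number; the data and the priceStatsByCategory helper are hypothetical, and the code is not MongoDB's implementation.

```javascript
// Hypothetical sample documents with numeric prices.
const pricedProducts = [
  { category: 'Kitchen', price: 12 },
  { category: 'Kitchen', price: 30 },
  { category: 'Garden', price: 8 },
];

// Stand-in for a $group with $avg, $max, and $min on '$price'.
function priceStatsByCategory(docs) {
  const state = new Map();
  for (const { category, price } of docs) {
    // Running state for this category, created on first sight.
    const s = state.get(category) ?? { sum: 0, n: 0, max: -Infinity, min: Infinity };
    s.sum += price;
    s.n += 1;
    s.max = Math.max(s.max, price);
    s.min = Math.min(s.min, price);
    state.set(category, s);
  }
  return [...state].map(([category, s]) => ({
    _id: category,
    averagePrice: s.sum / s.n,
    maxPrice: s.max,
    minPrice: s.min,
  }));
}

console.log(priceStatsByCategory(pricedProducts));
// [ { _id: 'Kitchen', averagePrice: 21, maxPrice: 30, minPrice: 12 },
//   { _id: 'Garden', averagePrice: 8, maxPrice: 8, minPrice: 8 } ]
```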

Advanced Grouping with Arbitrary Expressions

Besides grouping by direct field references, $group can use expressions that transform the input values. For instance, you could create groups based on the length of a string field:

// Group by string length of the 'name' field
db.products.aggregate([
    {
        $group: {
            _id: { length: { $strLenCP: '$name' } }, // Key computed by a string expression
            count: { $sum: 1 }
        }
    }
]);

This aggregation groups the documents based on the computed length of the names of the products and counts how many products have names of the same length.
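Note that $strLenCP counts Unicode code points, which can differ from JavaScript's String length (a count of UTF-16 code units). The plain JavaScript sketch below shows the difference and groups a few hypothetical names by code point length; codePointLength and countByNameLength are illustrative helpers.

```javascript
// Hypothetical product names; the first starts with an emoji (U+1F680).
const names = ['\u{1F680} Kit', 'Spoon', 'Mug'];

// Spreading a string iterates by code point, unlike String.prototype.length.
const codePointLength = (s) => [...s].length;

// Stand-in for grouping on { length: { $strLenCP: '$name' } } with a count.
function countByNameLength(values) {
  const counts = new Map();
  for (const name of values) {
    const len = codePointLength(name);
    counts.set(len, (counts.get(len) ?? 0) + 1);
  }
  return [...counts].map(([length, count]) => ({ _id: { length }, count }));
}

console.log(names[0].length);           // 6 (UTF-16 code units)
console.log(codePointLength(names[0])); // 5 (code points)
console.log(countByNameLength(names));
// [ { _id: { length: 5 }, count: 2 }, { _id: { length: 3 }, count: 1 } ]
```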

Chaining $group Stages

In some cases, you might need to perform multiple grouping stages consecutively. This can be useful for performing sub-aggregations within groups. Here’s an example:

// Chain two $group stages
db.products.aggregate([
    {
        $group: {
            _id: '$category',
            totalSales: { $sum: '$quantitySold' }
        }
    },
    {
        $group: {
            _id: null,
            totalRevenue: { $sum: '$totalSales' }
        }
    }
]);

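The first $group stage computes the total quantity sold for each category. The second stage sets _id to null (the _id field cannot be omitted), which places all of the first stage's output documents in a single group and sums their totalSales values. Note that although the result field is named totalRevenue, it is a sum of quantitySold, so it holds the total number of units sold across all categories rather than a monetary amount.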
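The key idea is that the second stage sees only the documents produced by the first. A minimal plain JavaScript sketch of this chaining follows; the sample data and the groupSum helper are hypothetical, and the helper supports just one $sum-style accumulator.

```javascript
// Hypothetical sample documents; quantitySold follows the tutorial's example.
const sales = [
  { category: 'Kitchen', quantitySold: 5 },
  { category: 'Kitchen', quantitySold: 7 },
  { category: 'Garden', quantitySold: 3 },
];

// Minimal stand-in for a $group stage with a single $sum accumulator.
// keyOf picks the group key, valueOf picks the number to add.
function groupSum(docs, keyOf, outField, valueOf) {
  const totals = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    totals.set(key, (totals.get(key) ?? 0) + valueOf(doc));
  }
  return [...totals].map(([key, total]) => ({ _id: key, [outField]: total }));
}

// Stage 1: one document per category.
const perCategory = groupSum(sales, (d) => d.category, 'totalSales', (d) => d.quantitySold);
// Stage 2: a constant key (null) folds the stage 1 output into one document.
const overall = groupSum(perCategory, () => null, 'totalRevenue', (d) => d.totalSales);

console.log(perCategory); // [ { _id: 'Kitchen', totalSales: 12 }, { _id: 'Garden', totalSales: 3 } ]
console.log(overall);     // [ { _id: null, totalRevenue: 15 } ]
```

Because stage 2 reads totalSales, a field that exists only in the stage 1 output, swapping the order of the two stages would not produce the same result.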

Conclusion

MongoDB’s $group aggregation stage is a versatile tool for summarizing the documents in a collection. Whether you’re counting documents, calculating averages, or chaining several grouping stages, $group provides a convenient and powerful way to get the job done. The examples in this tutorial cover only a fraction of what’s possible, but they give you a foundation to build on for more complex data analysis and manipulation.
