What is the equivalent of SQL ‘GROUP BY’ in Mongoose?

Updated: December 30, 2023 By: Guest Contributor

Overview

In relational databases, the SQL GROUP BY clause combines rows that share the same values in specified columns into summary rows. When working with MongoDB through Mongoose, a popular ODM (Object Document Mapper) for Node.js, developers often need similar functionality. In MongoDB, this is achieved through the aggregation framework, which is more verbose than SQL’s GROUP BY clause but offers a great deal of flexibility and control over data processing.

In this tutorial, I will guide you through using Mongoose to perform group operations analogous to SQL’s GROUP BY, including aggregate functions and some advanced patterns. This guide assumes that you are comfortable with the basics of Mongoose, such as defining schemas and models.

Basic Grouping

First, let’s see how to group documents in Mongoose. Here’s a simple example:

const mongoose = require('mongoose');
const { Schema } = mongoose;

const orderSchema = new Schema({
  productId: Number,
  quantity: Number,
  date: Date // used by the date-based examples later in this tutorial
});

const Order = mongoose.model('Order', orderSchema);

async function groupOrders() {
  const groupedOrders = await Order.aggregate([
    {
      $group: {
        _id: '$productId',
        totalQuantity: { $sum: '$quantity' }
      }
    }
  ]);

  console.log(groupedOrders);
}

// Assumes a connection has already been opened with mongoose.connect().
groupOrders().catch(console.error);

In the code above, we’re using Mongoose’s aggregate method to group orders by productId, then summing the quantity of each grouped item into a field named totalQuantity. The _id field of the $group stage specifies the criteria for grouping, and every other field in the stage defines a value computed per group.
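To make the shape of the result concrete, here is a plain JavaScript sketch of what this $group stage computes, run over an in-memory array instead of a database. The sample orders and the groupByProduct helper are assumptions made for illustration and are not part of Mongoose; the SQL in the comment is only a rough equivalent.

```javascript
// Roughly equivalent SQL:
//   SELECT productId, SUM(quantity) AS totalQuantity
//   FROM orders GROUP BY productId;

// Hypothetical sample data standing in for documents in the orders collection.
const sampleOrders = [
  { productId: 1, quantity: 2 },
  { productId: 2, quantity: 5 },
  { productId: 1, quantity: 3 }
];

// In-memory emulation of:
//   { $group: { _id: '$productId', totalQuantity: { $sum: '$quantity' } } }
function groupByProduct(orders) {
  const totals = new Map();
  for (const { productId, quantity } of orders) {
    totals.set(productId, (totals.get(productId) || 0) + quantity);
  }
  // Each group becomes one result document whose _id holds the group key.
  return [...totals].map(([_id, totalQuantity]) => ({ _id, totalQuantity }));
}

console.log(groupByProduct(sampleOrders));
```

The real aggregation returns documents of the same shape, one per distinct productId. MongoDB does not guarantee the order in which $group emits them, so add a $sort stage if the order matters.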

Nesting and More Complex Grouping

If you want to group by a combination of fields, or by values derived from a field, make the _id of the $group stage an object built from MongoDB’s aggregation expressions:

...

async function groupOrdersByDateAndProduct() {
  const groupedOrders = await Order.aggregate([
    {
      $group: {
        _id: {
          day: { $dayOfMonth: '$date' },
          month: { $month: '$date' },
          year: { $year: '$date' },
          productId: '$productId'
        },
        totalQuantity: { $sum: '$quantity' }
      }
    }
  ]);

  console.log(groupedOrders);
}

groupOrdersByDateAndProduct();

In this expanded example, the group key is now an object consisting of the day, month, and year extracted from a date field, as well as the product ID. This way, you’re grouping not just by product but by calendar day as well. Keep in mind that $dayOfMonth, $month, and $year work in UTC unless you pass them a timezone.
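With a compound key, every result document carries its group key as a nested _id object, which can be awkward to display or export. The sketch below flattens such results in application code. The sample documents and the flattenGroup helper are hypothetical; they only mirror the shape that the pipeline above produces.

```javascript
// Hypothetical result documents, shaped like the output of the pipeline above.
const sampleGroups = [
  { _id: { day: 5, month: 3, year: 2023, productId: 1 }, totalQuantity: 7 },
  { _id: { day: 6, month: 3, year: 2023, productId: 2 }, totalQuantity: 4 }
];

// Spread the compound _id into top-level fields so each row is flat.
function flattenGroup({ _id, ...computed }) {
  return { ..._id, ...computed };
}

const rows = sampleGroups.map(flattenGroup);
console.log(rows);
```

In a real application you would pass the array returned by Order.aggregate() to map(flattenGroup) instead of the sample data.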

Adding Match and Sort Stages

If you only want to group a subset of your documents or you need them to be sorted, you can use the $match and $sort stages:

...

async function groupAndSortOrders() {
  const groupedOrders = await Order.aggregate([
    { $match: { date: { $gte: new Date('2021-01-01') } } },
    { $sort: { date: -1 } },
    {
      $group: {
        _id: '$productId',
        totalQuantity: { $sum: '$quantity' },
        latestOrder: { $first: '$date' }
      }
    }
  ]);

  console.log(groupedOrders);
}

groupAndSortOrders();

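In this pipeline, $match first keeps only orders dated January 1, 2021 or later, and $sort arranges them from newest to oldest. In the $group stage, besides summing quantities, $first takes the date from the first document it sees in each group; because the documents arrive sorted in descending order, that is the date of the most recent order. Note that the sort applies to the input documents only: the order of the groups that $group returns is not guaranteed, so add another $sort after $group if the output order matters.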
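SQL queries often combine GROUP BY with WHERE, HAVING, and ORDER BY. As a rough sketch, the same roles can be played by a $match before $group, a $match after $group, and a $sort at the end. The buildTotalsPipeline function and its since and minTotal parameters are hypothetical names chosen for this example; the pipeline is built as plain data so it can be inspected without a database.

```javascript
// Builds an aggregation pipeline as a plain array of stage objects.
// `since` and `minTotal` are hypothetical parameters for this sketch.
function buildTotalsPipeline(since, minTotal) {
  return [
    // WHERE: filter individual orders before they are grouped.
    { $match: { date: { $gte: since } } },
    // GROUP BY productId with SUM(quantity).
    {
      $group: {
        _id: '$productId',
        totalQuantity: { $sum: '$quantity' }
      }
    },
    // HAVING: filter the groups themselves by their computed total.
    { $match: { totalQuantity: { $gte: minTotal } } },
    // ORDER BY: sort the grouped results, largest total first.
    { $sort: { totalQuantity: -1 } }
  ];
}

const pipeline = buildTotalsPipeline(new Date('2021-01-01'), 10);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array can be passed directly to Order.aggregate(pipeline).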

Using Aggregate Functions

MongoDB provides a variety of accumulator operators for the $group stage, such as $avg, $max, and $min, and Mongoose passes them through to the database unchanged. Here’s how you might use these:

...

async function calculateStats() {
  const productStats = await Order.aggregate([
    {
      $group: {
        _id: '$productId',
        maxQuantity: { $max: '$quantity' },
        minQuantity: { $min: '$quantity' },
        avgQuantity: { $avg: '$quantity' }
      }
    }
  ]);

  console.log(productStats);
}

calculateStats();

For each product ID, this pipeline calculates the maximum, minimum, and average quantity.
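SQL’s COUNT(*) has a common counterpart here as well: summing the constant 1 adds one per document in each group. The sketch below shows such a stage as plain data; the output field names orderCount and avgQuantity are arbitrary choices for this example.

```javascript
// Roughly equivalent SQL:
//   SELECT productId, COUNT(*) AS orderCount, AVG(quantity) AS avgQuantity
//   FROM orders GROUP BY productId;
const countPipeline = [
  {
    $group: {
      _id: '$productId',
      // Adding 1 for every document counts the documents in each group.
      orderCount: { $sum: 1 },
      avgQuantity: { $avg: '$quantity' }
    }
  }
];

console.log(JSON.stringify(countPipeline, null, 2));
```

As with the other examples, the array can be passed to Order.aggregate(countPipeline).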

Advanced Aggregation Patterns

For more complex scenarios, such as performing the equivalent of a SQL JOIN, you’ll likely need the $lookup stage combined with other stages. This requires careful ordering of your pipeline stages, because each stage operates on the output of the one before it.

Let’s look at an example where we might join ‘orders’ with a ‘products’ collection:

...

async function joinCollections() {
  const orderProductInfo = await Order.aggregate([
    {
      $lookup: {
        from: 'products', // This should be the MongoDB collection name
        localField: 'productId',
        foreignField: '_id',
        as: 'productInfo'
      }
    },
    {
      $unwind: '$productInfo'
    }
  ]);

  console.log(orderProductInfo);
}

joinCollections();

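This code enriches each order with information from the corresponding product. $lookup always stores its matches in an array (here, productInfo), so the $unwind stage is used to replace that array with the single matched product document, which makes further operations such as grouping easier. The join only matches when productId has the same value and type as the product’s _id. Also note that, by default, $unwind drops orders with no matching product; set preserveNullAndEmptyArrays: true in the $unwind stage to keep them.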
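Once the product document is attached to each order, grouping by one of its fields is a matter of appending a $group stage. The sketch below assumes that product documents have a name field, which is a hypothetical field not defined anywhere in this tutorial; adjust it to your own schema.

```javascript
// Join orders to products, then total the quantity per product name.
// ASSUMPTION: documents in the products collection have a `name` field.
const salesByProductName = [
  {
    $lookup: {
      from: 'products',
      localField: 'productId',
      foreignField: '_id',
      as: 'productInfo'
    }
  },
  // Replace the productInfo array with the single matched product document.
  { $unwind: '$productInfo' },
  {
    $group: {
      // Dot notation reaches into the joined product document.
      _id: '$productInfo.name',
      totalQuantity: { $sum: '$quantity' }
    }
  }
];

console.log(JSON.stringify(salesByProductName, null, 2));
```

Passing this array to Order.aggregate(salesByProductName) would return one document per product name, which is roughly what a SQL JOIN followed by GROUP BY would produce.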

Conclusion

To sum up, MongoDB’s aggregation framework, accessed via Mongoose in a Node.js application, offers a rich and versatile toolkit for grouping and summarizing data. Although it differs from SQL’s GROUP BY in both syntax and semantics, it fulfills the same core need and goes beyond it, because a pipeline can chain any number of processing stages. As with any complex system, there is a learning curve, but with practice, you’ll likely find MongoDB’s aggregation capabilities as powerful as SQL’s, if not more so.

While this tutorial covered the basics and some more complex uses, exploring the remaining aggregation stages and operators, such as $project, $addFields, and $reduce, is a natural next step in learning Mongoose and MongoDB. Be sure to consult the official Mongoose and MongoDB documentation for the latest features and syntactic nuances as you expand your aggregation toolkit.