Overview
In relational databases using SQL, the GROUP BY clause is a powerful tool for combining rows that have the same values in specified columns into summary rows. When working with MongoDB through Mongoose, a popular ODM (Object Document Mapper) for Node.js, developers often need similar functionality. In MongoDB, this is achieved through the aggregation framework, which can be somewhat more complex than SQL’s GROUP BY clause but offers vast flexibility and control over data processing. In this tutorial, I will guide you through using Mongoose to perform group operations analogous to SQL’s GROUP BY, including aggregate functions and some advanced patterns. This guide assumes that you are comfortable with the basics of Mongoose and understand how aggregation in a NoSQL context is different from SQL.
Basic Grouping
First, let’s see how to group documents in Mongoose. Here’s a simple example:
const mongoose = require('mongoose');
const { Schema } = mongoose;
const orderSchema = new Schema({
productId: Number,
quantity: Number
});
const Order = mongoose.model('Order', orderSchema);
async function groupOrders() {
const groupedOrders = await Order.aggregate([
{
$group: {
_id: '$productId',
totalQuantity: { $sum: '$quantity' }
}
}
]);
console.log(groupedOrders);
}
groupOrders();
In the code above, we’re using Mongoose’s aggregate
method to group orders by productId
, then summing the quantity of each grouped item into a field named totalQuantity
. The $group
stage in the pipeline specifies the criteriia for grouping.
Nesting and More Complex Grouping
If you want to group by a combination of fields or include a nested document, you’ll need to use MongoDB’s expression syntax:
...
async function groupOrdersByDateAndProduct() {
const groupedOrders = await Order.aggregate([
{
$group: {
_id: {
day: { $dayOfMonth: '$date' },
month: { $month: '$date' },
year: { $year: '$date' },
productId: '$productId'
},
totalQuantity: { $sum: '$quantity' }
}
}
]);
console.log(groupedOrders);
}
groupOrdersByDateAndProduct();
In this expanded example, the group key is now an object consisting of the day, month, and year extracted from a date
field, as well as the product ID. This way, you’re grouping not just by product but by the exact date as well.
Adding Match and Sort Stages
If you only want to group a subset of your documents or you need them to be sorted, you can use the $match
and $sort
stages:
...
async function groupAndSortOrders() {
const groupedOrders = await Order.aggregate([
{ $match: { date: { $gte: new Date('2021-01-01') } } },
{ $sort: { date: -1 } },
{
$group: {
_id: '$productId',
totalQuantity: { $sum: '$quantity' },
latestOrder: { $first: '$date' }
}
}
]);
console.log(groupedOrders);
}
groupAndSortOrders();
In this pipeline, we first filter orders from 2021 onwards, sort them in descending order by date, and during the group stage, aside from summing quantities, we pick the date of the latest order using the $first
operator – but since the documents are sorted, $first
will give us the most recent document’s date.
Using Aggregate Functions
Mongoose supports a variety of aggregate functions that you can use to calculate values, such as average, maximum, and minimum. Here’s how you might use these:
...
async function calculateStats() {
const productStats = await Order.aggregate([
{
$group: {
_id: '$productId',
maxQuantity: { $max: '$quantity' },
minQuantity: { $min: '$quantity' },
avgQuantity: { $avg: '$quantity' }
}
}
]);
console.log(productStats);
}
calculateStats();
For each product ID, this pipeline calculates the maximum, minimum, and average quantity.
Advanced Aggregation Patterns
For more complex scenarios, such as when needing to perform actions akin to SQL’s JOINs, you’ll likely need to make use of the $lookup
operator and multiple stages. This requires careful structuring of your pipeline stages.
Let’s look at an example where we might join ‘orders’ with a ‘products’ collection:
...
async function joinCollections() {
const orderProductInfo = await Order.aggregate([
{
$lookup: {
from: 'products', // This should be the MongoDB collection name
localField: 'productId',
foreignField: '_id',
as: 'productInfo'
}
},
{
$unwind: '$productInfo'
}
]);
console.log(orderProductInfo);
}
joinCollections();
This code enriches each order with information from corresponding products. The $unwind
stage is used to deconstruct the arrays resulting from the join to allow for further operations, such as grouping.
Conclusion
To sum up, MongoDB’s aggregation framework, accessed via Mongoose in a Node.js application, offers a rich and versatile toolkit for grouping and summarizing data. Although it differs from SQL’s GROUP BY semantically and syntactically, it fulfills the same core need and beyond, due to its capability to handle more sophisticated tasks through a series of processing stages called a pipeline. As with any complex system, there is a learning curve, but with practice, you’ll likely find MongoDB’s aggregation capabilities equally as powerful, if not more so.
While this tutorial covered basics to more complex uses, exploring the remaining aggregation operators like $project
, $addFields
, $reduce
, and many others is a beneficial continuation of learning Mongoose and MongoDB. Be sure to reference the official Mongoose documentation for the latest features and syntactic nuances as you expand your aggregation toolkit.