MongoDB: Find SUM/MIN/MAX/AVG of each group in a collection

Updated: February 4, 2024 By: Guest Contributor

When working with large datasets in MongoDB, you often need to compute statistics such as the sum, minimum, maximum, or average of a field across documents. This tutorial shows how to compute each of these per group with MongoDB’s aggregation framework, using the $group stage together with the $sum, $min, $max, and $avg accumulators.

Understanding Aggregation in MongoDB

MongoDB’s aggregation framework is a data processing pipeline for transforming and analyzing documents. A pipeline is an ordered array of stages: each stage receives the documents emitted by the previous stage, transforms them (for example by filtering, grouping, or sorting), and passes its output to the next stage until the final result is produced.
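As a rough illustration of how stages chain together, the sketch below builds a three-stage pipeline: group, then sort, then limit. It assumes a hypothetical sales collection whose documents have product and amount fields (the same shape used in the examples later in this tutorial); the variable name topProductsPipeline and the limit of 3 are arbitrary choices for illustration. The guard around db lets the snippet load in plain Node.js as well as in the MongoDB shell, where db is predefined.

```javascript
// A pipeline is just an ordered array of stage documents.
const topProductsPipeline = [
  // Stage 1: collapse all documents with the same product into one result document.
  { $group: { _id: '$product', totalSales: { $sum: '$amount' } } },
  // Stage 2: order the grouped results, largest total first.
  { $sort: { totalSales: -1 } },
  // Stage 3: keep only the first three results (arbitrary cutoff).
  { $limit: 3 }
];

// Only run against the database when executed inside the MongoDB shell.
if (typeof db !== 'undefined') {
  printjson(db.sales.aggregate(topProductsPipeline).toArray());
}
```

Stage order matters here: placing $limit before $group would cap the number of input documents rather than the number of groups.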

Setting Up Your Environment

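Ensure you have MongoDB installed and running on your machine. This guide assumes you’re using MongoDB version 4.4 or later, although the stages and operators used here ($group, $sum, $min, $max, $avg, and $year) are long-standing parts of the aggregation framework and are not specific to that release. The examples are written for the MongoDB shell.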
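If you do not yet have data to experiment with, a few sample documents are enough. The sketch below assumes a collection named sales with product, amount, and date fields, matching the field names used throughout this tutorial; the product names, amounts, and dates are invented purely for illustration. It is meant for the MongoDB shell, where db is predefined, and the guard lets it load without a connection elsewhere.

```javascript
// Hypothetical sample data: all values below are made up for illustration.
const sampleSales = [
  { product: 'apple',  amount: 10, date: new Date('2023-01-15T00:00:00Z') },
  { product: 'apple',  amount: 30, date: new Date('2024-03-02T00:00:00Z') },
  { product: 'banana', amount: 5,  date: new Date('2023-06-20T00:00:00Z') },
  { product: 'banana', amount: 15, date: new Date('2023-11-05T00:00:00Z') },
  { product: 'banana', amount: 40, date: new Date('2024-02-11T00:00:00Z') }
];

// In the MongoDB shell, `db` refers to the current database.
if (typeof db !== 'undefined') {
  db.sales.insertMany(sampleSales);
}
```

Storing date as a real Date value rather than a string matters for date operators such as $year.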

Calculating the SUM

Let’s start with a straightforward example. Suppose you have a collection named sales containing documents representing sales transactions. Here’s how you could find the total sales amount (sum) per product.

db.sales.aggregate([
  {
    $group: {
      _id: '$product',
      totalSales: { $sum: '$amount' }
    }
  }
])

In this example, $group is an aggregation pipeline stage that groups the documents by the product field. Each output document has an _id holding the product value and a totalSales field, calculated by summing the amount field of all documents that share that product.
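To make the grouping concrete, the following plain JavaScript sketch mimics in memory what the $group stage with $sum computes. It is only an illustration of the result, not how MongoDB implements the stage; the helper name sumByProduct and the sample values are hypothetical.

```javascript
// Hypothetical in-memory stand-in for:
//   { $group: { _id: '$product', totalSales: { $sum: '$amount' } } }
function sumByProduct(docs) {
  const totals = new Map();
  for (const doc of docs) {
    // Start each product at 0, then add every amount seen for it.
    totals.set(doc.product, (totals.get(doc.product) || 0) + doc.amount);
  }
  // Shape the output like $group does: the group key goes into _id.
  return Array.from(totals, ([product, total]) => ({ _id: product, totalSales: total }));
}

// Made-up documents for illustration.
const docsForSum = [
  { product: 'apple', amount: 10 },
  { product: 'apple', amount: 30 },
  { product: 'banana', amount: 5 },
  { product: 'banana', amount: 15 },
  { product: 'banana', amount: 40 }
];

const sumResult = sumByProduct(docsForSum);
console.log(sumResult);
// sumResult holds { _id: 'apple', totalSales: 40 } and { _id: 'banana', totalSales: 60 }
```

Unlike this sketch, which returns groups in first-seen order, $group makes no promise about the order of its output; add a $sort stage if order matters.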

Finding MIN/MAX within Groups

Next, let’s determine the minimum and maximum sale amount per product within the same collection.

db.sales.aggregate([
  {
    $group: {
      _id: '$product',
      minSale: { $min: '$amount' },
      maxSale: { $max: '$amount' }
    }
  }
])

This follows a similar pattern to the previous example but uses the $min and $max operators to find the lowest and highest sale amounts, respectively, for each product.

Calculating the Average

Computing the average sale amount per product follows the same pattern:

db.sales.aggregate([
  {
    $group: {
      _id: '$product',
      avgSale: { $avg: '$amount' }
    }
  }
])

The $avg operator calculates the average of the numeric amount values in each product group. Non-numeric and missing values are ignored rather than counted as zero.
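The three examples so far each scan the collection separately, but a single $group stage can hold several accumulators at once. The sketch below computes all four statistics in one pass, again assuming the sales collection with product and amount fields. The count field, built with the common { $sum: 1 } idiom, and the variable name statsPipeline are additions for illustration.

```javascript
// One $group stage, several accumulators: each is evaluated per product group.
const statsPipeline = [
  {
    $group: {
      _id: '$product',
      totalSales: { $sum: '$amount' },
      minSale: { $min: '$amount' },
      maxSale: { $max: '$amount' },
      avgSale: { $avg: '$amount' },
      // Adds 1 for every document in the group, giving a document count.
      count: { $sum: 1 }
    }
  },
  // Sort by the group key so the output order is predictable.
  { $sort: { _id: 1 } }
];

// Only run against the database when executed inside the MongoDB shell.
if (typeof db !== 'undefined') {
  printjson(db.sales.aggregate(statsPipeline).toArray());
}
```

Each result document then carries _id, totalSales, minSale, maxSale, avgSale, and count for one product.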

Advanced Grouping

Suppose your sales documents also include a date field, and you wish to aggregate sales not just by product, but also by year. MongoDB’s aggregation framework allows for this level of complexity.

db.sales.aggregate([
  {
    $group: {
      _id: { product: '$product', year: { $year: '$date' } },
      totalSales: { $sum: '$amount' }
    }
  }
])

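This example groups sales by both product and the year the sale was made, so each output document’s _id is an object with product and year fields. The $year operator extracts the year from the date field in each document; this requires date to be stored as a BSON date rather than a string, and the year is computed in UTC unless you specify a timezone.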
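Because the group key is now an object, later stages can refer to its parts with dot notation. The sketch below extends the product-and-year grouping with a $sort on those parts, assuming the same sales collection with product, amount, and date fields; the variable name yearlyPipeline is arbitrary.

```javascript
const yearlyPipeline = [
  {
    $group: {
      // Compound key: one group per (product, year) combination.
      // $year uses UTC by default; the longer form
      //   { $year: { date: '$date', timezone: '<IANA zone name>' } }
      // accepts a timezone if local calendar years are needed.
      _id: { product: '$product', year: { $year: '$date' } },
      totalSales: { $sum: '$amount' }
    }
  },
  // Order by year first, then by product name within each year.
  { $sort: { '_id.year': 1, '_id.product': 1 } }
];

// Only run against the database when executed inside the MongoDB shell.
if (typeof db !== 'undefined') {
  printjson(db.sales.aggregate(yearlyPipeline).toArray());
}
```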

Practical Application and Tips

Understanding how to perform these basic but powerful aggregations in MongoDB allows you to extract meaningful insights from your data. Here are a few tips to optimize your use of MongoDB’s aggregation framework:

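  • Index the fields used by $match and $sort stages at the start of the pipeline. A $group stage that computes accumulators such as $sum or $avg generally has to read every document that reaches it, so an index on the group key alone rarely speeds it up.
  • Use the $match stage early in your pipeline to filter documents and reduce the data set size.
  • Consider using $project to drop fields you do not need when documents are large, but measure first: the pipeline optimizer can often work out which fields are required on its own.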
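The first two tips can be sketched together as follows. The example assumes the sales collection has a date field and that only recent sales are of interest; the cutoff date, the index on date, and the names cutoffDate and filteredPipeline are hypothetical choices for illustration.

```javascript
// Hypothetical cutoff: only sales on or after this date are aggregated.
const cutoffDate = new Date('2024-01-01T00:00:00Z');

const filteredPipeline = [
  // Filter first, so $group sees fewer documents; a $match at the
  // start of a pipeline can use an index on the matched field.
  { $match: { date: { $gte: cutoffDate } } },
  { $group: { _id: '$product', totalSales: { $sum: '$amount' } } }
];

// Only run against the database when executed inside the MongoDB shell.
if (typeof db !== 'undefined') {
  // Index supporting the $match stage above.
  db.sales.createIndex({ date: 1 });
  printjson(db.sales.aggregate(filteredPipeline).toArray());
}
```

Whether the index is actually used depends on your data and query; the explain output for the aggregation is the place to check.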

With these techniques, you can compute per-group sums, minimums, maximums, and averages directly in MongoDB rather than in application code.