MongoDB: Select/count distinct values in multiple fields

Last updated: February 03, 2024

Introduction

MongoDB, as a leading NoSQL database, offers you the flexibility to work with data in documents that resemble JSON objects. One common task when dealing with databases is selecting and counting distinct values. This tutorial delves into how to perform these operations in MongoDB, focusing on distinct values in multiple fields, a task which can have varied levels of complexity depending on the structure of your data.

We’ll start with the basics and gradually move to more advanced queries, complete with code examples and potential outputs. Let’s dive in!

Prerequisites

  • A running MongoDB instance
  • Basic familiarity with the MongoDB shell or Compass UI
  • A dataset in your MongoDB where you can test the queries

Using the distinct method

The distinct method returns the unique values of a single field across a collection. It is roughly equivalent to the SELECT DISTINCT statement in SQL.

db.collection.distinct('fieldName')

Let’s say we have a collection products with the following documents:

[
  { "_id" : 1, "category": "electronics", "manufacturer": "ABC Corp" },
  { "_id" : 2, "category": "home appliances", "manufacturer": "XYZ Inc" },
  { "_id" : 3, "category": "electronics", "manufacturer": "ABC Corp" }
]

To get distinct categories, we’d use:

db.products.distinct('category')

The output would be:

[
  "electronics",
  "home appliances"
]
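
To make the behavior concrete, here is a minimal plain JavaScript sketch of what distinct computes for the sample data, run against an in-memory array rather than a real collection. The helper name distinctValues is hypothetical, and the sketch deliberately ignores details such as missing fields and array-valued fields.

```javascript
// In-memory stand-in for the products collection shown above.
const products = [
  { _id: 1, category: "electronics", manufacturer: "ABC Corp" },
  { _id: 2, category: "home appliances", manufacturer: "XYZ Inc" },
  { _id: 3, category: "electronics", manufacturer: "ABC Corp" },
];

// Hypothetical helper: collect each value of `field` once, in first-seen order.
function distinctValues(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

console.log(distinctValues(products, "category"));
console.log(distinctValues(products, "manufacturer"));
```

Calling the helper once per field mirrors running db.products.distinct once per field, which is often enough when you only need the lists and not a combined result.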

Combining $group and $addToSet

For combining distinct values across multiple fields, we can rely on the aggregation framework provided by MongoDB.
An aggregation pipeline transforms data into aggregated results using stages such as $group, $project, $match, and more.

Here’s a basic example to combine distinct manufacturers and categories:

db.products.aggregate([
  {
    $group: {
      _id: null,
      distinctiveCategories: { $addToSet: "$category" },
      distinctiveManufacturers: { $addToSet: "$manufacturer" }
    }
  }
])

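This pipeline would produce a single document similar to the following (the order of elements inside each $addToSet array is not guaranteed):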

{
  "_id" : null,
  "distinctiveCategories" : [ "electronics", "home appliances" ],
  "distinctiveManufacturers" : [ "ABC Corp", "XYZ Inc" ]
}
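
As a rough illustration of what this $group stage does, the sketch below reproduces it in plain JavaScript on an in-memory copy of the sample documents. The helper groupAllAddToSet is hypothetical and only models the _id: null case with $addToSet accumulators; it is not how MongoDB implements the stage.

```javascript
// In-memory stand-in for the products collection shown earlier.
const products = [
  { _id: 1, category: "electronics", manufacturer: "ABC Corp" },
  { _id: 2, category: "home appliances", manufacturer: "XYZ Inc" },
  { _id: 3, category: "electronics", manufacturer: "ABC Corp" },
];

// Hypothetical helper: puts every document in one group (like _id: null)
// and keeps one Set per output field (like $addToSet).
function groupAllAddToSet(docs, fieldsByOutputName) {
  const sets = {};
  for (const name of Object.keys(fieldsByOutputName)) {
    sets[name] = new Set();
  }
  for (const doc of docs) {
    for (const [name, field] of Object.entries(fieldsByOutputName)) {
      sets[name].add(doc[field]);
    }
  }
  const result = { _id: null };
  for (const [name, set] of Object.entries(sets)) {
    result[name] = [...set];
  }
  return result;
}

const grouped = groupAllAddToSet(products, {
  distinctiveCategories: "category",
  distinctiveManufacturers: "manufacturer",
});
console.log(grouped);
```

Because a single pass fills both sets, one aggregation can replace two separate distinct calls when you need several fields at once.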

Advanced Aggregation

If we need to perform a more complex operation, such as counting the distinct values, we can extend our aggregation pipeline with additional stages like $project or $unwind.

Here is an example that counts the number of distinct categories and manufacturers:

db.products.aggregate([
  {
    $group: {
      _id: null,
      distinctiveCategories: { $addToSet: "$category" },
      distinctiveManufacturers: { $addToSet: "$manufacturer" }
    }
  },
  {
    $project: {
      _id: 0,
      numberOfDistinctCategories: { $size: "$distinctiveCategories" },
      numberOfDistinctManufacturers: { $size: "$distinctiveManufacturers" }
    }
  }
])

This query will yield:

{
  "numberOfDistinctCategories" : 2,
  "numberOfDistinctManufacturers" : 2
}

The previous examples cover a general case. However, your dataset might require more complex querying. MongoDB allows the combination of various operators in different stages to create powerful aggregations.
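
One such variation is counting distinct combinations of fields rather than distinct values per field. The sketch below defines a pipeline that groups on a compound _id and then applies $count, followed by a plain JavaScript check of the same logic on in-memory data. The output name distinctPairs and the helper countDistinctPairs are arbitrary choices made for this illustration.

```javascript
// Pipeline intended for mongosh, e.g. db.products.aggregate(pipeline).
// "distinctPairs" is an arbitrary output field name.
const pipeline = [
  { $group: { _id: { category: "$category", manufacturer: "$manufacturer" } } },
  { $count: "distinctPairs" },
];

// In-memory stand-in for the products collection shown earlier.
const products = [
  { _id: 1, category: "electronics", manufacturer: "ABC Corp" },
  { _id: 2, category: "home appliances", manufacturer: "XYZ Inc" },
  { _id: 3, category: "electronics", manufacturer: "ABC Corp" },
];

// Hypothetical helper: build one key per (category, manufacturer) pair
// and count how many different keys exist.
function countDistinctPairs(docs) {
  const keys = new Set(
    docs.map((doc) => JSON.stringify([doc.category, doc.manufacturer]))
  );
  return keys.size;
}

console.log(countDistinctPairs(products));
```

For the three sample documents there are two different pairs, so running the pipeline in mongosh should yield a single document with distinctPairs set to 2.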

Working with Nested Documents

Sometimes the value you need is nested inside an array of subdocuments. In such cases, use the $unwind stage to output one document per array element before grouping the entries.

For example, suppose an orders collection contains the following documents:

[
  { "_id" : 1, "products": [ { "category": "electronics" }, { "category": "fitness" } ] },
  { "_id" : 2, "products": [ { "category": "home appliances" } ] },
  { "_id" : 3, "products": [ { "category": "gaming" } ] }
]

To count the distinct categories present in the nested products array, we would use an aggregation like this:

db.orders.aggregate([
  { $unwind: "$products" },
  { $group: { _id: null, distinctCategories: { $addToSet: "$products.category" } } },
  { $project: { _id: 0, count: { $size: "$distinctCategories" } } }
])

This would return:

{ "count" : 4 }
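
To see why the result is 4, the following plain JavaScript sketch walks through the same three stages on an in-memory copy of the documents. It is an approximation for illustration only: for instance, it assumes every document has a non-empty products array.

```javascript
// In-memory stand-in for the orders collection shown above.
const orders = [
  { _id: 1, products: [{ category: "electronics" }, { category: "fitness" }] },
  { _id: 2, products: [{ category: "home appliances" }] },
  { _id: 3, products: [{ category: "gaming" }] },
];

// Like $unwind: emit one document per element of the products array.
const unwound = orders.flatMap((order) =>
  order.products.map((product) => ({ ...order, products: product }))
);

// Like $group with $addToSet on "$products.category".
const distinctCategories = new Set(
  unwound.map((doc) => doc.products.category)
);

// Like $project with $size.
const result = { count: distinctCategories.size };
console.log(result);
```

The unwind step turns three orders into four single-product documents, and since all four categories differ, the set ends up with four entries.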

Conclusion

Throughout this guide, we’ve explored selecting and counting distinct values in MongoDB. We started with the simple distinct method, moved on to aggregation with $group and $addToSet, and then considered more complex scenarios involving nested documents and arrays. Whether you are dealing with flat or hierarchical data structures, MongoDB’s flexible aggregation framework can satisfy a wide range of querying needs.

In summary, MongoDB provides a robust set of tools that enable developers to efficiently handle unique value extraction and aggregation tasks across various complexities. These capabilities are crucial for data analysis, ensuring data integrity, and generating insightful reports.
