MongoDB: Using $setIntersection to find common elements in arrays

Updated: February 3, 2024 By: Guest Contributor

Introduction

MongoDB is a powerful NoSQL database that provides various operators to manipulate and analyze data. Among these operators, $setIntersection is a particularly useful one for finding common elements between arrays. This tutorial will cover the basics of using $setIntersection, offering a range of examples from simple to more complex use cases, to help you grasp how to apply it effectively in your MongoDB queries.

Working with $setIntersection

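The $setIntersection operator takes two or more arrays and returns an array containing the elements that appear in every input array. It treats its inputs as sets: duplicate values are ignored, the order of the output elements is unspecified, and nested arrays are compared as whole elements rather than searched. It is an aggregation expression operator, used within stages like $project and $addFields to reshape documents or add new fields based on the intersection of array elements.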
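To make these set semantics concrete, here is a minimal plain-JavaScript model of what the operator computes for arrays of scalar values. The helper name setIntersection is hypothetical, and this is not how the server implements the operator; it only mirrors the documented behavior that duplicates are dropped and that only top-level elements are compared. MongoDB does not guarantee the order of the output, whereas this sketch happens to keep the order of the first array.

```javascript
// Hypothetical helper: an in-memory model of $setIntersection for scalar elements.
function setIntersection(...arrays) {
  if (arrays.length === 0) return [];
  const [first, ...rest] = arrays;
  const restSets = rest.map((arr) => new Set(arr));
  // Deduplicate the first array, then keep values present in every other array.
  return [...new Set(first)].filter((value) => restSets.every((s) => s.has(value)));
}

console.log(setIntersection(["a", "b", "a"], ["b", "a", "c"])); // [ 'a', 'b' ]
// A nested array counts as a single element, so nothing matches here.
console.log(setIntersection(["a", "b"], [["a", "b"]])); // []
```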

Basic Usage

Let’s start with a simple example to illustrate how $setIntersection works. Imagine you have a document structure in a collection “friends” where each document represents a person and their list of hobbies:

{
  "name": "John",
  "hobbies": ["Cycling", "Reading", "Photography"]
},
{
  "name": "Jane",
  "hobbies": ["Reading", "Gardening", "Photography"]
}

To find which of each person’s hobbies also appear in Jane’s list, pass that list to $setIntersection as a literal array:

db.friends.aggregate([
  {
    $project: {
      commonHobbies: {
        $setIntersection: ["$hobbies", ["Reading", "Gardening", "Photography"]]
      }
    }
  }
]);

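The query returns one document per person. The _id values are generated by MongoDB and will differ on your system, and the order of elements within each array is not guaranteed:

[{
  "_id": ObjectId("..."),
  "commonHobbies": ["Reading", "Photography"]
},
{
  "_id": ObjectId("..."),
  "commonHobbies": ["Reading", "Gardening", "Photography"]
}]

The first result shows that John shares ‘Reading’ and ‘Photography’ with Jane. The second is Jane’s own document, so all of her hobbies match the list.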
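A common follow-up is to keep only the people who share at least one hobby with the reference list. The sketch below checks the size of the intersection inside $match with $expr. The variable names referenceHobbies and pipeline are illustrative, and the $ifNull guard assumes that some documents may lack a hobbies field: $setIntersection returns null when an argument is null or missing, which $size would reject.

```javascript
// Illustrative reference list (Jane's hobbies from the sample data).
const referenceHobbies = ["Reading", "Gardening", "Photography"];

const pipeline = [
  {
    $match: {
      $expr: {
        $gt: [
          {
            $size: {
              // Fall back to [] so a missing "hobbies" field does not yield null.
              $setIntersection: [{ $ifNull: ["$hobbies", []] }, referenceHobbies]
            }
          },
          0
        ]
      }
    }
  },
  { $project: { _id: 0, name: 1 } }
];

// Runs only where the mongosh global `db` exists; elsewhere this just defines the pipeline.
if (typeof db !== "undefined") {
  printjson(db.friends.aggregate(pipeline).toArray());
}
```

With the two sample documents, both John and Jane should pass the filter, since each shares at least one hobby with the list.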

More Complex Use Cases

Moving on to more complex scenarios, suppose each person also has a list of countries they’ve visited. You can use $setIntersection more than once in the same $project stage to compare several array fields at a time.

Consider documents formatted as follows:

{
  "name": "John",
  "hobbies": ["Cycling", "Reading", "Photography"],
  "visitedCountries": ["France", "Germany", "Italy"]
},
{
  "name": "Jane",
  "hobbies": ["Reading", "Gardening", "Photography"],
  "visitedCountries": ["France", "Spain", "Italy"]
}

A query that compares both the hobbies and the visited countries against literal lists might look like this:

db.friends.aggregate([
  {
    $project: {
      commonHobbies: {
        $setIntersection: ["$hobbies", ["Reading", "Gardening", "Photography"]]
      },
      commonCountries: {
        $setIntersection: ["$visitedCountries", ["France", "Germany", "Italy", "Spain"]]
      }
    }
  }
]);

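The output contains one document per person (the _id values will differ, and element order within each array is not guaranteed). Because the literal country list includes every country in the sample data, each person’s commonCountries is simply their full list:

[{
  "_id": ObjectId("..."),
  "commonHobbies": ["Reading", "Photography"],
  "commonCountries": ["France", "Germany", "Italy"]
},
{
  "_id": ObjectId("..."),
  "commonHobbies": ["Reading", "Gardening", "Photography"],
  "commonCountries": ["France", "Spain", "Italy"]
}]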

This example indicates how you can use $setIntersection to find intersections in multiple array fields within your documents.
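If you would rather keep the original fields alongside the computed ones, the same expression can be used in $addFields instead of $project. The following sketch also adds a count of matching countries. The names countryList, countriesInCommon, and commonCountryCount are illustrative, and the example assumes every document has a visitedCountries array, since $size fails on a null result.

```javascript
// Illustrative reference list, taken from the query above.
const countryList = ["France", "Germany", "Italy", "Spain"];

// Build the expression once so both new fields use the same intersection.
// Fields added in one $addFields stage cannot refer to each other.
const countriesInCommon = { $setIntersection: ["$visitedCountries", countryList] };

const pipeline = [
  {
    $addFields: {
      commonCountries: countriesInCommon,
      // Hypothetical extra field: how many countries matched.
      commonCountryCount: { $size: countriesInCommon }
    }
  }
];

// Runs only where the mongosh global `db` exists; elsewhere this just defines the pipeline.
if (typeof db !== "undefined") {
  printjson(db.friends.aggregate(pipeline).toArray());
}
```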

Advanced Use Cases

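For more advanced scenarios, $setIntersection can be combined with other operators for dynamic array comparison. Suppose you want to compare hobbies across all documents without listing them explicitly in the query. A $lookup with an empty pipeline attaches every document in the collection to each person. The path "$otherHobbies.hobbies" then resolves to an array of arrays, and because $setIntersection compares only top-level elements, passing that path to it directly would return an empty array. Folding the arrays one at a time with $reduce avoids this:

db.friends.aggregate([
  {
    // Uncorrelated self-join: attach every document in "friends" to each person.
    $lookup: {
      from: "friends",
      pipeline: [],
      as: "otherHobbies"
    }
  },
  {
    $project: {
      name: 1,
      commonHobbies: {
        // Start from this person's hobbies and intersect with each array in turn.
        $reduce: {
          input: "$otherHobbies.hobbies",
          initialValue: "$hobbies",
          in: { $setIntersection: ["$$value", "$$this"] }
        }
      }
    }
  }
]);

For each person, commonHobbies holds the hobbies shared by every document in the collection, which is ‘Reading’ and ‘Photography’ for the sample data (in no guaranteed order). The self-join attaches the whole collection to every document, so this approach is only practical for small collections.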
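To check what a comparison across all documents should return, the following plain-JavaScript sketch intersects the hobbies arrays of the sample documents in memory. It is a model of the intended result, not MongoDB code; the friends array and the intersect helper are illustrative names.

```javascript
// Sample documents from this tutorial, held in memory for illustration.
const friends = [
  { name: "John", hobbies: ["Cycling", "Reading", "Photography"] },
  { name: "Jane", hobbies: ["Reading", "Gardening", "Photography"] }
];

// Hypothetical helper: values of `a` that also occur in `b`, without duplicates.
const intersect = (a, b) => [...new Set(a)].filter((value) => b.includes(value));

// Fold every hobbies array into one running intersection.
const commonToAll = friends
  .map((doc) => doc.hobbies)
  .reduce((acc, hobbies) => intersect(acc, hobbies));

console.log(commonToAll); // [ 'Reading', 'Photography' ]
```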

Conclusion

$setIntersection is a powerful tool within MongoDB’s aggregation framework, enabling you to find common elements between arrays efficiently. Whether you’re dealing with simple or complex data structures, mastering this operator can significantly enhance your data querying and analysis capabilities. Hopefully, this tutorial has provided you with the knowledge and confidence to start incorporating $setIntersection into your MongoDB queries.