MongoDB: Using $lookup to merge reference relations

Updated: February 1, 2024 By: Guest Contributor

The $lookup Stage

The $lookup stage in the aggregation pipeline performs a left outer join with another collection in the same database. For each input document, it adds an array field holding the matching documents from the joined collection, so later stages can process them together. The basic $lookup syntax is as follows:

db.collection.aggregate([
  {
    $lookup:
    {
      from: "collection to join", 
      localField: "field from the input documents",
      foreignField: "field from the documents of the 'from' collection",
      as: "output array field"
    }
  }
]);

Let’s write a query to join books with authors:

db.books.aggregate([
  {
    $lookup:
    {
      from: "authors", 
      localField: "author_id",
      foreignField: "_id",
      as: "authorDetails"
    }
  }
]).pretty();

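This produces documents where each book now includes an array called ‘authorDetails’ containing the joined author document(s). Because the join is a left outer join, a book with no matching author is still returned, with an empty ‘authorDetails’ array.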
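To make the shape of the result concrete, here is a minimal in-memory sketch of what an equality $lookup does. It is plain JavaScript rather than MongoDB's implementation, it only models simple equality on scalar fields, and the sample documents and the lookupSketch helper are hypothetical names introduced for illustration:

```javascript
// Hypothetical sample data; field names follow the tutorial's example.
const authors = [
  { _id: 1, name: "Jane Austen" },
  { _id: 2, name: "Herman Melville" },
];
const books = [
  { _id: 101, title: "Pride and Prejudice", author_id: 1 },
  { _id: 102, title: "Moby-Dick", author_id: 2 },
  { _id: 103, title: "Anonymous Pamphlet", author_id: 99 }, // no such author
];

// Simplified model of an equality $lookup: every input document is kept,
// and the matching foreign documents are collected into the `as` array.
function lookupSketch(inputDocs, fromDocs, { localField, foreignField, as }) {
  return inputDocs.map((doc) => ({
    ...doc,
    [as]: fromDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookupSketch(books, authors, {
  localField: "author_id",
  foreignField: "_id",
  as: "authorDetails",
});

console.log(JSON.stringify(joined, null, 2));
```

In this sketch the third book keeps its place in the output with an empty authorDetails array, which is the left outer join behaviour.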

Handling Multiple Matches

What happens if our join condition matches multiple documents? In such cases, the $lookup stage appends all matching documents to the output array. In our example each book references a single author through a unique _id, so each array should contain exactly one element, provided every author_id points to an existing author. When the foreign field is not unique, for example when joining each author to their books, no special handling is needed: the output array simply holds one element per matching document.
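As an illustration of a join with several matches per document, the pipeline below reverses the direction and attaches books to authors. This is a sketch under the assumption that the same books and authors collections are used; the output field name books is an arbitrary choice, and the $unwind stage is optional, shown only as one common way to flatten the array afterwards:

```javascript
// Pipeline intended for db.authors.aggregate(...): each author document
// receives a `books` array with every book whose author_id equals its _id.
const authorsWithBooks = [
  {
    $lookup: {
      from: "books",
      localField: "_id",
      foreignField: "author_id",
      as: "books", // assumed output field name
    },
  },
  // Optional: emit one document per (author, book) pair instead of an array.
  // preserveNullAndEmptyArrays keeps authors that have no books at all.
  { $unwind: { path: "$books", preserveNullAndEmptyArrays: true } },
];

// In the mongo shell: db.authors.aggregate(authorsWithBooks)
console.log(JSON.stringify(authorsWithBooks, null, 2));
```

Without the $unwind stage, an author with three books yields a single document whose books array has three elements.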

Combining $lookup with Other Stages

The real power of the aggregation pipeline comes from combining various stages. Let’s use $lookup with $match to filter results:

db.books.aggregate([
  {
    $match: { title: 'Pride and Prejudice' }
  },
  {
    $lookup:
    {
      from: 'authors',
      localField: 'author_id',
      foreignField: '_id',
      as: 'authorDetails'
    }
  }
]).pretty();

We first filter books where the title is ‘Pride and Prejudice’ and then perform our $lookup.

Optimizing $lookup Performance

Although $lookup stages are powerful, they may cause performance hits, especially with large datasets or many lookups. Optimizations include:

  • Making sure that the foreign field (the field from the joined collection) is indexed. The _id field is always indexed; any other foreign field needs an index of its own.
  • Limiting the amount of data processed at each stage by using $match, $project, etc., before the $lookup stage.
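Both optimizations can be sketched together. The example assumes an authors-to-books join, where the foreign field is books.author_id and therefore not indexed by default; the name field on authors and the prepareAndRun helper are hypothetical and used only for illustration:

```javascript
// Meant to be called from the mongo shell as prepareAndRun(db),
// where `db` is the current database object.
function prepareAndRun(db) {
  // Index the foreign field so each lookup does not scan the books collection.
  db.books.createIndex({ author_id: 1 });

  return db.authors
    .aggregate([
      // Shrink the input before joining (assumes authors have a `name` field).
      { $match: { name: "Jane Austen" } },
      // Carry only the fields needed downstream.
      { $project: { name: 1 } },
      {
        $lookup: {
          from: "books",
          localField: "_id",
          foreignField: "author_id",
          as: "books",
        },
      },
    ])
    .toArray();
}
```

Placing $match first means the join runs only for the documents that survive the filter, rather than for every author in the collection.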

Conclusion

In this tutorial, we’ve seen how to use the $lookup aggregation stage in MongoDB to join documents from separate collections, effectively ‘merging’ reference relations. By combining $lookup with other pipeline stages such as $match and $project, you can perform complex data retrievals and transformations while keeping MongoDB’s flexible document model.