Sling Academy
Home/MongoDB/MongoDB: Using $pipeline in $lookup (for Complex Joins)

MongoDB: Using $pipeline in $lookup (for Complex Joins)

Last updated: February 03, 2024

Introduction

In the world of NoSQL databases, MongoDB offers an impressive set of features that enable developers to work with data in a flexible and expressive way. One such feature is the $lookup operator, which provides a mechanism for performing join operations analogous to those in relational databases. With MongoDB 3.6 and higher, the power of $lookup has been significantly expanded by the addition of sub-pipelines within $lookup stages, which allows for the implementation of complex join conditions and transformations. In this tutorial, we will delve into how to utilize the $pipeline field of the $lookup operator to execute dynamic lookups and perform complex joins in MongoDB.

Understanding the Basics of $lookup

Before jumping into the complexities, let’s understand the basic form of $lookup. In its simplest format, $lookup can perform an equality match between a field from the documents in the collection on which you’re aggregating (the local field) and a field from the documents in the collection you’re joining with (the foreign field). The result is a new array of matched document(s) from the foreign collection inside each of the original documents.

db.orders.aggregate([
    {
        $lookup: {
            from: "items",
            localField: "itemId",
            foreignField: "_id",
            as: "itemDetails"
        }
    }
]);

This query would match an ‘itemId’ in the ‘orders’ collection with the ‘_id’ in the ‘items’ collection, embedding the matching ‘items’ documents inside the ‘order’ documents under the field ‘itemDetails’. It’s the MongoDB equivalent of a SQL LEFT OUTER JOIN.

Creating More Complex Joins Using $pipeline

Now, let’s enter the $pipeline option within $lookup. This feature allows you to go beyond simple equality matches and specify an aggregation pipeline that executes on the foreign collection. This pipeline defines more complex filtering, projection, and joining conditions, making $lookup a much more versatile operator.

db.orders.aggregate([
    {
        $lookup: {
            from: "items",
            let: { orderItemId: "$itemId", orderQty: "$quantity" },
            pipeline: [
                { $match: {
                    $expr: {
                        $and: [
                            { $eq: ["$_id",  "$orderItemId"] },
                            { $gte: ["$stockQty", "$orderQty"] }
                        ]
                    }
                }},
                { $project: { itemName: 1, price: 1 } }
            ],
            as: "itemDetails"
        }
    }
]);

This aggregation performs a more complex join that not only matches ‘itemId’ but also ensures that the quantity of the item in stock (‘stockQty’) is enough to satisfy the order quantity amount. The results include only the ‘itemName’ and ‘price’ fields for matching items. The $let operator allows us to declare variables that represent fields in the input documents, which we can then use in our pipeline stages that execute against the foreign collection.

Hands-on Examples of Using $pipeline in $lookup

Example 1: Filtering Joined Documents

db.orders.aggregate([
    {
        $lookup: {
            from: "customers",
            let: { orderId: "$_id", customerAge: 25 },
            pipeline: [
                { $match: {
                    $expr: {
                        $and: [
                            { $eq: ["$orderId", "$orders_id"] },
                            { $gt: ["$age", "$customerAge"] }
                        ]
                    }
                }},
                { $project: { _id: 0, name: 1, age: 1 } }
            ],
            as: "customerDetails"
        }
    }
]);

In this first example, we are performing a lookup from the orders collection to the customers collection to match orders with customers older than 25. We are projecting only the relevant fields of name and age into the resulting customerDetails array, excluding the _id field.

Example 2: Chains of Lookups

db.orders.aggregate([
    {
        $lookup: {
            from: "items",
            let: { itemId: "$itemId" },
            pipeline: [
                { $match: { $expr: { $eq: ["$_id",  "$itemId"] } } },
                { $lookup: {
                    from: "suppliers",
                    let: { supplierId: "$supplierId" },
                    pipeline: [
                        { $match: { $expr: { $eq: ["$_id", "$supplierId"] } } },
                        { $project: { _id: 0, name: 1, rating: 1 } }
                    ],
                    as: "supplierDetails"
                }},
                { $project: { _id: 0, itemName: 1, price: 1, supplierDetails: 1 } }
            ],
            as: "itemDetails"
        }
    }
]);

In this sophisticated example, we incorporated a second lookup within the pipeline of the first. This allows us to not only join the items and orders but to further join the supplier details, effectively creating a chain of lookups. This is powerful for modeling relational data structures in a non-relational database like MongoDB.

Example 3: Grouping and Summarizing

db.orders.aggregate([
    {
        $lookup: {
            from: "items",
            let: { orderItemId: "$itemId" },
            pipeline: [
                { $match: { $expr: { $eq: ["$_id", "$orderItemId"] } } },
                { $group: {
                    _id: "$categoryId",
                    totalSales: { $sum: { $multiply: ["$price", "$quantityOrdered"] } }
                }}
            ],
                    as: "itemSummary"
        }
    }
]);

Here we see an example of aggregation within the lookup by categorizing items and calculating total sales for each category. The $group stage is applied after the lookup match, which allows us to summarize data at a level that is more granular than the order level.

Final Notes on Performance and Use Cases

The flexibility offered by the $lookup operator with $pipeline support significantly boosts the ability to answer complex queries in MongoDB. However, use this feature judiciously as each lookup incurs a performance cost, especially as the complexity of the pipelines and the size of the data increase. Ensure that you create appropriate indexes to support your queries.

Conclusion

This tutorial has provided a deep dive into MongoDB’s $lookup operator with a particular focus on the $pipeline sub-feature. Through step-by-step examples, we examined how to craft basic to advanced joins and aggregate data efficiently. Effective use of $lookup enhances the overall flexibility of MongoDB, illustrating its capabilities as a robust option for handling complex queries and relational data patterns.

Next Article: How to perform cascade deletion in MongoDB (with examples)

Previous Article: MongoDB: Grouping results by multiple fields (with examples)

Series: MongoDB Tutorials

MongoDB

You May Also Like

  • MongoDB: How to combine data from 2 collections into one
  • Hashed Indexes in MongoDB: A Practical Guide
  • Partitioning and Sharding in MongoDB: A Practical Guide (with Examples)
  • Geospatial Indexes in MongoDB: How to Speed Up Geospatial Queries
  • Understanding Partial Indexes in MongoDB
  • Exploring Sparse Indexes in MongoDB (with Examples)
  • Using Wildcard Indexes in MongoDB: An In-Depth Guide
  • Matching binary values in MongoDB: A practical guide (with examples)
  • Understanding $slice operator in MongoDB (with examples)
  • Caching in MongoDB: A practical guide (with examples)
  • CannotReuseObject Error: Attempted illegal reuse of a Mongo object in the same process space
  • How to perform cascade deletion in MongoDB (with examples)
  • MongoDB: Using $not and $nor operators to negate a query
  • MongoDB: Find SUM/MIN/MAX/AVG of each group in a collection
  • References (Manual Linking) in MongoDB: A Developer’s Guide (with Examples)
  • MongoDB: How to see all fields in a collection (with examples)
  • Type checking in MongoDB: A practical guide (with examples)
  • How to query an array of subdocuments in MongoDB (with examples)
  • MongoDB: How to compare 2 documents (with examples)