MongoDB: How to filter array in subdocument

Updated: February 3, 2024 By: Guest Contributor

Introduction

When working with MongoDB, a NoSQL database, you often need to deal with complex documents that contain arrays of subdocuments, which may in turn contain arrays of their own. Filtering such array elements is a common requirement when querying documents. This tutorial walks through several techniques for filtering arrays in subdocuments, progressing from basic to advanced examples.

Understanding the Data Model

First, let’s have a look at an example of a MongoDB document which we will be using throughout this tutorial:

{
    "_id": ObjectId("507f191e810c19729de860ea"),
    "title": "Your Favorite Book",
    "author": "John Doe",
    "categories": [
        {
            "name": "Fiction",
            "tags": ["adventure", "mystery"]
        },
        {
            "name": "Science", 
            "tags": ["education", "research"]
        }
    ]
}

Basic Filtering with $elemMatch

The $elemMatch operator matches documents in which at least one array element satisfies all of the given criteria. Here’s how you would find documents where the ‘categories’ array has at least one category with the tag ‘adventure’:

db.collection.find({
    "categories": {
        $elemMatch: {
            "tags": "adventure"
        }
    }
});

The above query would return documents that have any category with ‘adventure’ within their ‘tags’ array.
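$elemMatch matters most when several conditions must hold for the same array element. The sketch below imitates that rule in plain JavaScript on in-memory data; it is not MongoDB itself, and the second book ("Hypothetical Field Guide") is made up for illustration.

```javascript
// Plain-JavaScript imitation of the matching rules; not MongoDB itself.
const books = [
  {
    title: "Your Favorite Book",
    categories: [
      { name: "Fiction", tags: ["adventure", "mystery"] },
      { name: "Science", tags: ["education", "research"] }
    ]
  },
  {
    title: "Hypothetical Field Guide", // invented document for the example
    categories: [{ name: "Science", tags: ["adventure"] }]
  }
];

// Like { categories: { $elemMatch: { name: "Science", tags: "adventure" } } }:
// both conditions must hold for the same array element.
const sameElement = books.filter(book =>
  book.categories.some(c => c.name === "Science" && c.tags.includes("adventure"))
);

// Like { "categories.name": "Science", "categories.tags": "adventure" }:
// each condition may be satisfied by a different element.
const anyElement = books.filter(book =>
  book.categories.some(c => c.name === "Science") &&
  book.categories.some(c => c.tags.includes("adventure"))
);

console.log(sameElement.map(book => book.title)); // only the field guide
console.log(anyElement.map(book => book.title));  // both books
```

With a single condition, as in the query above, the two forms select the same documents; the difference only appears once a second condition is added.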

Projection to Filter Subdocument Arrays

If you’re only interested in the specific array elements that match your criteria, MongoDB’s projection can be used:

db.collection.find({
    "categories.name": "Fiction"
}, {
    "categories.$": 1
});

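For each matching document, this query returns only the first element of the ‘categories’ array that satisfies the query condition (here, the first ‘Fiction’ category). If a document contains several matching elements, the positional $ operator still returns just the first one; to keep all of them, use the $filter operator described in the next section.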
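The positional $ projection keeps at most one array element per document. As a rough illustration, here is a plain JavaScript imitation of that behavior; the document and the projectFirstMatch helper are hypothetical, and a real projection would also return _id by default.

```javascript
// Invented document with two "Fiction" categories.
const book = {
  title: "Hypothetical Anthology",
  categories: [
    { name: "Science", tags: ["research"] },
    { name: "Fiction", tags: ["adventure"] },
    { name: "Fiction", tags: ["mystery"] }
  ]
};

// Imitates { "categories.$": 1 } combined with a query condition on the array:
// only the first element that satisfies the condition is kept.
function projectFirstMatch(doc, predicate) {
  const first = doc.categories.find(predicate);
  return { categories: first === undefined ? [] : [first] };
}

const projected = projectFirstMatch(book, c => c.name === "Fiction");
console.log(JSON.stringify(projected));
// {"categories":[{"name":"Fiction","tags":["adventure"]}]}
```

The second ‘Fiction’ element is dropped even though it also matches.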

Aggregation Framework

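MongoDB’s Aggregation Framework offers more powerful capabilities for filtering and transforming documents. Let’s use the $filter operator in a $project stage to filter the array. Note that inside $filter, the variable declared with ‘as’ must be referenced with a double dollar sign ($$category).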

db.collection.aggregate([
    { 
        $match: { "title": "Your Favorite Book" }
    },
    {
        $project: {
            categories: {
                $filter: {
                    input: "$categories",
                    as: "category",
                    // "$$category" refers to the variable declared by "as"
                    cond: { $in: ["mystery", "$$category.tags"] }
                }
            }
        }
    }
]);

This will output a document that includes only the categories with ‘mystery’ in their tags.
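Conceptually, $filter behaves much like Array.prototype.filter in JavaScript. The following sketch applies the same condition to an in-memory copy of the sample document; it only mirrors the semantics and does not talk to a database.

```javascript
// In-memory copy of the sample document (without _id).
const book = {
  title: "Your Favorite Book",
  author: "John Doe",
  categories: [
    { name: "Fiction", tags: ["adventure", "mystery"] },
    { name: "Science", tags: ["education", "research"] }
  ]
};

// Same idea as $filter with cond: { $in: ["mystery", "$$category.tags"] }:
// keep each category whose tags array contains "mystery".
const filtered = {
  categories: book.categories.filter(category => category.tags.includes("mystery"))
};

console.log(JSON.stringify(filtered));
// {"categories":[{"name":"Fiction","tags":["adventure","mystery"]}]}
```

The kept element is returned whole, including its other tags.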

Nesting $filter for Deep Array Filtering

For more complex scenarios where the array to be filtered is nested within another array, you can combine $map with $filter: $map visits each element of the outer array, and $filter trims the inner array of that element.

db.collection.aggregate([
    {
        $project: {
            categories: {
                $map: {
                    input: "$categories",
                    as: "category",
                    in: {
                        name: "$$category.name",
                        tags: {
                            $filter: {
                                input: "$$category.tags",
                                as: "tag",
                                cond: { $eq: ["$$tag", "mystery"] }
                            }
                        }
                    }
                }
            }
        }
    }
]);

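This operation keeps every category but filters each ‘tags’ array within ‘categories’ so that it includes only the ‘mystery’ tag. Categories that do not have this tag end up with an empty ‘tags’ array.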
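The $map and $filter combination corresponds closely to chaining map and filter in JavaScript. Here is a minimal sketch on an in-memory copy of the sample document, intended only to show the shape of the result:

```javascript
// In-memory copy of the sample document's categories.
const book = {
  categories: [
    { name: "Fiction", tags: ["adventure", "mystery"] },
    { name: "Science", tags: ["education", "research"] }
  ]
};

// Outer map keeps every category; inner filter trims its tags.
const result = {
  categories: book.categories.map(category => ({
    name: category.name,
    tags: category.tags.filter(tag => tag === "mystery")
  }))
};

console.log(JSON.stringify(result));
// {"categories":[{"name":"Fiction","tags":["mystery"]},{"name":"Science","tags":[]}]}
```

If categories with an empty ‘tags’ array should be dropped as well, an additional filtering step on the outer array would be needed.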

Indexing for Performance

To improve the performance of your queries, consider indexing the array fields that you frequently query on.

db.collection.createIndex({"categories.tags": 1});

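Because ‘tags’ is an array, indexing it within the ‘categories’ subdocuments creates a multikey index, which lets find() queries and $match stages locate matching documents more efficiently. The index does not speed up $filter expressions themselves, since those are evaluated on documents that have already been retrieved.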
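To build some intuition for why such an index helps, the sketch below constructs a toy lookup table with one entry per tag value. This is a conceptual model only, not how MongoDB stores indexes internally, and the documents and _id values are invented.

```javascript
// Invented documents with simple numeric _id values.
const books = [
  {
    _id: 1,
    categories: [
      { name: "Fiction", tags: ["adventure", "mystery"] },
      { name: "Science", tags: ["education", "research"] }
    ]
  },
  { _id: 2, categories: [{ name: "Fiction", tags: ["mystery"] }] }
];

// Toy "index": each distinct tag maps to the set of document ids containing it.
const tagIndex = new Map();
for (const book of books) {
  for (const category of book.categories) {
    for (const tag of category.tags) {
      if (!tagIndex.has(tag)) tagIndex.set(tag, new Set());
      tagIndex.get(tag).add(book._id);
    }
  }
}

// A lookup by tag no longer needs to scan every document.
console.log([...tagIndex.get("mystery")]); // [ 1, 2 ]
console.log([...tagIndex.get("research")]); // [ 1 ]
```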

Dealing with NULL or Missing Fields

Sometimes, the fields you want to filter on may be missing or contain null values. You can handle this situation using the $exists and $ne operators.

db.collection.find({
    "categories.tags": {
        $exists: true,
        $ne: null
    }
});

This will ensure that you only get documents where the ‘tags’ array exists and is not null.
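Missing or null arrays also matter inside aggregation expressions: the aggregation $in operator raises an error when its second argument is not an array. One common guard is to substitute an empty array, for example with { $ifNull: ["$$category.tags", []] }. The plain JavaScript sketch below imitates that guard with the ?? operator; the ‘Untagged’ and ‘Draft’ categories are invented for illustration.

```javascript
// Invented document where some categories have no usable tags array.
const book = {
  categories: [
    { name: "Fiction", tags: ["adventure", "mystery"] },
    { name: "Untagged" },          // tags field is missing
    { name: "Draft", tags: null }  // tags field is null
  ]
};

// Fall back to an empty array before testing membership,
// similar in spirit to wrapping the field in $ifNull.
const withMystery = book.categories.filter(
  category => (category.tags ?? []).includes("mystery")
);

console.log(withMystery.map(category => category.name)); // [ 'Fiction' ]
```

Without the fallback, calling includes on a missing or null ‘tags’ value would throw, much as the unguarded pipeline expression would fail.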

Conclusion

Filtering arrays in subdocuments in MongoDB can be an intricate task, but with the querying and filtering strategies presented in this guide, you can handle the most common scenarios. Whether for simple queries or complex aggregations, understanding how to effectively manipulate arrays in subdocuments is a vital skill for any MongoDB user.