MongoDB: Find documents whose field contains a substring (with examples)

Updated: February 3, 2024 By: Guest Contributor

Introduction

MongoDB, a NoSQL document database, offers flexible querying. A common task is finding documents in which a specific field contains a given substring. This guide demonstrates several ways to do that, each suited to different use cases.

MongoDB offers several approaches, from regular-expression matching, which finds a substring anywhere in a field, to text indexes, which match whole words rather than arbitrary substrings.

Prerequisites

  • A running MongoDB server and the MongoDB shell (mongosh) or a MongoDB GUI tool
  • Basic understanding of MongoDB collections and documents
  • Familiarity with JavaScript as it pertains to MongoDB operations

Basic Substring Search Using Regular Expressions

One of the simpler ways to search for a string within a field in MongoDB is to use regular expressions (regex). Assume you have a collection named posts and you want to find all posts containing the word ‘MongoDB’ in their title field.

db.posts.find({ title: /MongoDB/i })

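The /MongoDB/i literal is a case-insensitive regular expression. Because it is not anchored, it matches ‘MongoDB’ anywhere in the title field, including inside a longer word. Note that unanchored and case-insensitive patterns generally cannot use an index on title efficiently, so on large collections this query may scan every document.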

Using the $regex Operator

The $regex operator expresses the same kind of query in operator form, which is convenient when the pattern is a string built at runtime or when you want to pass options separately.

db.posts.find({ title: { $regex: 'mongodb', $options: 'i' } })

This query matches the same documents as the previous example but spells out ‘$regex’ and ‘$options’ explicitly. Here, the ‘i’ option makes the search case-insensitive. Other options include ‘m’ (^ and $ match at the start and end of each line), ‘x’ (extended mode: unescaped whitespace and # comments in the pattern are ignored), and ‘s’ (the dot also matches newline characters).
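When the substring comes from user input, characters such as ‘.’, ‘+’ or ‘(’ would be interpreted as regex syntax rather than matched literally. The following is a minimal sketch of escaping them before building the filter. The helper names escapeRegex and containsFilter are illustrative assumptions, not part of MongoDB or mongosh.

```javascript
// Hypothetical helper (not a MongoDB API): escape regex metacharacters
// so that the input is matched as a literal substring.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive "contains" filter for one field.
function containsFilter(field, substring) {
  return { [field]: { $regex: escapeRegex(substring), $options: 'i' } };
}

const filter = containsFilter('title', 'C++ (beta)');
// In mongosh: db.posts.find(filter)
```

Without the escaping step, the same input would be rejected as an invalid pattern or would match unintended titles.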

Advanced Search Using Text Indexes

For more advanced text searches, MongoDB provides text indexes. These indexes enable you to perform complex text searches on strings in documents within a collection. Before performing a text search, you need to create a text index on the fields you’re interested in searching.

db.posts.createIndex({ title: 'text' })

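Now, with the text index in place, you can use the $text operator with its $search field to find documents. Note that text search matches whole words (after stemming), not arbitrary substrings: searching for ‘database’ also matches ‘databases’, but searching for ‘data’ does not match ‘database’. The following example finds documents whose title contains the word ‘database’.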

db.posts.find({ $text: { $search: 'database' } })

When using the $search operator within a $text query you can also use quotes to specify exact phrases, and the minus sign to exclude words.
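As a rough illustration of that syntax, the sketch below assembles a $search string from plain terms, exact phrases and excluded words. The helper name buildSearchString and its parameter names are assumptions made for this example only.

```javascript
// Hypothetical helper: build a $search string.
// Phrases are wrapped in double quotes; excluded words get a leading minus sign.
function buildSearchString({ terms = [], phrases = [], exclude = [] }) {
  return [
    ...terms,
    ...phrases.map((p) => `"${p}"`),
    ...exclude.map((w) => `-${w}`),
  ].join(' ');
}

const search = buildSearchString({
  terms: ['database'],
  phrases: ['full text'],
  exclude: ['sql'],
});
// search is: database "full text" -sql

const textFilter = { $text: { $search: search } };
// In mongosh: db.posts.find(textFilter)
```

Plain terms separated by spaces are combined with a logical OR, so adding a phrase or an exclusion is the usual way to narrow the result set.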

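Case and Diacritic Sensitivity in Text Searches

Text indexes (version 3, the default in current MongoDB releases) are case-insensitive and diacritic-insensitive by default, so no index option is needed for that. Sensitivity is controlled per query through the $caseSensitive and $diacriticSensitive fields of the $text operator, not through createIndex; caseSensitive is not a valid index option. What you can set on the index is default_language: the value 'none' disables language-specific stemming and stop words, so terms are matched as written. Because a collection can have only one text index, drop the earlier text index on title before creating this one:

db.posts.createIndex({ title: 'text' }, { default_language: 'none' })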
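Case and diacritic sensitivity can be requested per query through the $caseSensitive and $diacriticSensitive fields of $text, both of which default to false. A minimal sketch, assuming a text index on title already exists; the variable name strictFilter and the search term are illustrative only.

```javascript
// Per-query switches on $text; both default to false when omitted.
const strictFilter = {
  $text: {
    $search: 'Caf\u00e9',       // "Café", written with an explicit escape for é
    $caseSensitive: true,       // "café" in lower case would no longer match
    $diacriticSensitive: true,  // "Cafe" without the accent would no longer match
  },
};
// In mongosh: db.posts.find(strictFilter)
```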

Using Aggregation for String Searching

Aggregation is another way to search strings in MongoDB. A $match stage accepts the same query operators as find(), including $regex, so you can filter documents inside a pipeline, typically as the first stage so that later stages process fewer documents. The following example pipeline returns all posts whose ‘title’ field contains ‘MongoDB’, ignoring case.

db.posts.aggregate([
    { $match: { title: { $regex: 'mongodb', $options: 'i' } } }
])

Searching with Partial Indexes

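MongoDB allows the creation of partial indexes, which index only the documents that match a specified filter expression. This keeps the index small when you only ever search a subset of documents, for example those with a certain flag or status. Two caveats apply: a query can use a partial index only if its filter includes the partial filter condition (here, status: 'published'), and an ordinary index on title helps anchored, case-sensitive prefix patterns such as /^Mongo/ far more than unanchored substring patterns.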

db.posts.createIndex({ title: 1 }, { partialFilterExpression: { status: 'published' } })
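A query that is eligible for the partial index created above might look like the following sketch. The variable name publishedPrefixFilter is an assumption for illustration; whether the planner actually chooses the index depends on your data and should be checked with explain().

```javascript
// The filter repeats status: 'published' so the query falls inside the
// subset covered by the partial index, and anchors the case-sensitive
// pattern with ^ so it can be answered from a range of index keys.
const publishedPrefixFilter = { status: 'published', title: /^MongoDB/ };

// In mongosh:
//   db.posts.find(publishedPrefixFilter).explain('executionStats')
// An IXSCAN stage in the winning plan would indicate that an index was used.
```

Note that this matches titles that start with ‘MongoDB’, not titles that contain it anywhere, which is the trade-off for index-friendly matching.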

Utilizing the $indexOfBytes and $indexOfCP Aggregation Operators

For more precise control over string searches, the aggregation framework provides the $indexOfBytes and $indexOfCP operators. Both return the zero-based position of the first occurrence of a substring in a string; $indexOfBytes counts UTF-8 bytes, while $indexOfCP counts Unicode code points, so the two differ when non-ASCII characters precede the match. The comparison is case-sensitive. The following example uses $indexOfBytes to find the position of the substring ‘Mongo’ within the ‘title’ field.

db.posts.aggregate([
    { $project: { title: 1, indexOfMongo: { $indexOfBytes: ['$title', 'Mongo'] } } }
])

This adds a field ‘indexOfMongo’ to each document in the result set: the zero-based position of ‘Mongo’ in the ‘title’ field, -1 if ‘Mongo’ is not found, or null if ‘title’ is missing or null.
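To make the byte versus code point distinction concrete, the sketch below mimics both operators in plain JavaScript and then shows one way to turn the position into a filter with $expr. The functions indexOfBytes and indexOfCP here are local stand-ins written for this example, not the server implementations.

```javascript
// Local illustration only: approximate what $indexOfBytes and $indexOfCP
// report for a UTF-8 string, without needing a server.
function indexOfBytes(str, sub) {
  const i = str.indexOf(sub);
  return i === -1 ? -1 : new TextEncoder().encode(str.slice(0, i)).length;
}

function indexOfCP(str, sub) {
  const i = str.indexOf(sub);
  return i === -1 ? -1 : Array.from(str.slice(0, i)).length;
}

const title = 'Caf\u00e9 Mongo'; // "é" is one code point but two UTF-8 bytes
const byteIndex = indexOfBytes(title, 'Mongo'); // 6
const cpIndex = indexOfCP(title, 'Mongo');      // 5

// Keep only documents whose title contains 'Mongo' (case-sensitive).
const pipeline = [
  { $match: { $expr: { $gte: [{ $indexOfCP: ['$title', 'Mongo'] }, 0] } } },
];
// In mongosh: db.posts.aggregate(pipeline)
```

Documents with a missing or null title are filtered out by this pipeline, while a title that holds a non-string value causes the operator to raise an error, so a type check may be needed on mixed data.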

Conclusion

MongoDB offers several mechanisms for substring searches within document fields, from simple regex queries to word-based text search with text indexes and string operators in aggregation pipelines. Understanding the options and their trade-offs lets you choose the appropriate method for each task and build efficient search features in your MongoDB-based applications.