PyMongo: Find documents within a day/week/month/year

Updated: February 8, 2024 By: Guest Contributor Post a comment

Introduction

Working with dates in MongoDB can seem daunting at first, but it’s a crucial aspect of dealing with any kind of time-sensitive data. Whether you’re building a blogging platform, a scheduling app, or any application that relies on date and time, MongoDB offers powerful tools for querying documents based on dates. In this tutorial, we will learn how to use PyMongo, the Python driver for MongoDB, to find documents within specific time frames such as a day, week, month, or year.

Prerequisites

  • Python 3.6 or newer installed on your system.
  • MongoDB server running and accessible.
  • PyMongo installed in your Python environment. You can install it using pip install pymongo.

Finding Documents Within a Day

Let’s start with the most basic query: finding documents that were created within a specific day. We’ll assume you have a collection called events with a date field in each document.

from pymongo import MongoClient
from datetime import datetime, timedelta

def find_documents_within_a_day(collection, specific_day):
    start_of_day = specific_day.replace(hour=0, minute=0, second=0, microsecond=0)
    end_of_day = specific_day.replace(hour=23, minute=59, second=59, microsecond=999999)
    return list(collection.find({
        'date': {
            '$gte': start_of_day,
            '$lt': end_of_day
        }
    }))

# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
# Accessing the database
db = client['your_database_name']
# Selecting the collection
events = db.events

# Example usage
specific_day = datetime.now() - timedelta(days=1) # Yesterday
results = find_documents_within_a_day(events, specific_day)
print(results)

This function takes a collection and a specific_day datetime object as arguments. It constructs a query that looks for documents where the date field is greater than or equal to the start of the specified day and less than the end of that day. This effectively retrieves all documents from that specific day.

Finding Documents Within a Week/Month/Year

To query documents within a week, month, or year, we’ll need to adjust our approach slightly. The key here is to calculate the start and end dates of the period we’re interested in. Let’s expand our previous function to make it more versatile.

from pymongo import MongoClient
from datetime import datetime, timedelta
import calendar

def find_documents_within_period(collection, base_date, period='day'):
    start_date = base_date
    if period == 'week':
        start_date = start_date - timedelta(days=start_date.weekday())
    elif period == 'month':
        start_date = start_date.replace(day=1)
    elif period == 'year':
        start_date = start_date.replace(month=1, day=1)
    # setting the beginning of the period
    start_of_period = start_date.replace(hour=0, minute=0, second=0, microsecond=0)
    # calculating the end of the period
    if period == 'day':
        end_of_period = start_date.replace(hour=23, minute=59, second=59, microsecond=999999)
    elif period == 'week':
        end_of_period = start_date + timedelta(days=6, hours=23, minutes=59, seconds=59, microseconds=999999)
    elif period == 'month':
        month_last_day = calendar.monthrange(start_date.year, start_date.month)[1]
        end_of_period = start_date.replace(day=month_last_day, hour=23, minute=59, second=59, microsecond=999999)
    elif period is 'year':
        end_of_period = start_date.replace(month=12, day=31, hour=23, minute=59, second=59, microsecond=999999)
    return list(collection.find({
        'date': {
            '$gte': start_of_period,
            '$lte': end_of_period
        }
    }))

# Example usage
now = datetime.now()
client = MongoClient('mongodb://localhost:27017/')
 db = client['your_database_name']
 events = db.events
 
# Finding documents within the current month
results_month = find_documents_within_period(events, now, 'month')
print('Documents within the current month:', results_month)

This versatile function allows you to query documents within any given day, week, month, or year by simply providing a base date and specifying the period.

Advanced Queries

Using PyMongo, you can perform advanced queries to find documents within specific time periods (e.g., day, week, month, year) and based on additional criteria such as documents created by a certain user. For these types of queries, you’ll often use MongoDB’s $gte (greater than or equal) and $lt (less than) operators along with date manipulation to define the time range.

Here’s an advanced code example that demonstrates how to find documents within a day, week, month, and year, all filtered by additional conditions like a specific user. This example assumes you have a collection with documents that contain a createdAt timestamp and a username field:

from pymongo import MongoClient
from datetime import datetime, timedelta

# Connect to MongoDB (adjust connection string as needed)
client = MongoClient('mongodb://localhost:27017/')
db = client['your_database']
collection = db['your_collection']

# Define a base date for the example (e.g., now)
base_date = datetime.now()

# Helper function to find documents within a time period
def find_documents(time_period, additional_criteria):
    start_date, end_date = None, None

    if time_period == 'day':
        start_date = base_date.replace(hour=0, minute=0, second=0, microsecond=0)
        end_date = start_date + timedelta(days=1)
    elif time_period == 'week':
        start_week = base_date - timedelta(days=base_date.weekday())
        start_date = start_week.replace(hour=0, minute=0, second=0, microsecond=0)
        end_date = start_date + timedelta(weeks=1)
    elif time_period == 'month':
        start_date = base_date.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        if base_date.month == 12:
            end_date = datetime(base_date.year + 1, 1, 1)
        else:
            end_date = datetime(base_date.year, base_date.month + 1, 1)
    elif time_period == 'year':
        start_date = datetime(base_date.year, 1, 1)
        end_date = datetime(base_date.year + 1, 1, 1)

    query = {'createdAt': {'$gte': start_date, '$lt': end_date}, **additional_criteria}
    return list(collection.find(query))

# Example usage
documents_today = find_documents('day', {'username': 'john_doe'})
documents_this_week = find_documents('week', {'username': 'john_doe'})
documents_this_month = find_documents('month', {'username': 'john_doe'})
documents_this_year = find_documents('year', {'username': 'john_doe'})

# Print out the counts (for demonstration)
print(f"Today: {len(documents_today)}")
print(f"This Week: {len(documents_this_week)}")
print(f"This Month: {len(documents_this_month)}")
print(f"This Year: {len(documents_this_year)}")

In this example:

  • A helper function find_documents is defined to encapsulate the logic for computing the start and end dates for each time period (day, week, month, year) and constructing the query with both the time period and additional criteria (e.g., documents created by john_doe).
  • The function calculates the start and end dates for the specified time period and constructs a query that combines these dates with any additional criteria provided.
  • This approach makes it easy to extend the functionality to include more conditions or adjust the time periods as needed.

Conclusion

Through this tutorial, you should now have a good grasp on how to perform date-based queries in MongoDB using PyMongo. These techniques will allow you to create more dynamic and responsive applications by leveraging time-sensitive data. With practice, these queries will become an indispensable part of your MongoDB toolkit.