PostgreSQL: Group rows by day, week, month, and year

Updated: January 5, 2024 By: Guest Contributor Post a comment

Introduction

Organizing data by time intervals is a common task in data analysis and reporting. PostgreSQL, a powerful open-source relational database, offers various functions to group rows by different time frames like days, weeks, months, and years for insightful querying.

Data Setup

Before diving into the examples, let’s set up a sample table named sales that we’ll use throughout this tutorial:

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    product_id INT NOT NULL,
    quantity INT NOT NULL,
    sale_date TIMESTAMP NOT NULL
);

Now, populate the sales table with some hypothetical data:

INSERT INTO sales (product_id, quantity, sale_date)
VALUES
    (1, 10, '2023-01-01 14:30:00'),
    (2, 15, '2023-01-07 11:00:00'),
    ... -- (Add more rows as needed)

Grouping By Days

To group rows by days, use the DATE function to truncate the timestamp to a date:

SELECT
    DATE(sale_date) AS sale_day,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_day
ORDER BY
    sale_day;

Grouping By Weeks

You can group rows by weeks using the EXTRACT function in combination with the DATE_TRUNC function:

SELECT
    EXTRACT(WEEK FROM sale_date) AS sale_week,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_week
ORDER BY
    sale_week;

Alternatively, for a more precise grouping that respects ISO week dates, consider:

SELECT
    DATE_TRUNC('week', sale_date)::date AS sale_week_start,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_week_start
ORDER BY
    sale_week_start;

Grouping By Months

To group by months:

SELECT
    DATE_TRUNC('month', sale_date)::date AS sale_month,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_month
ORDER BY
    sale_month;

Grouping By Years

Similarly, to group by years:

SELECT
    DATE_TRUNC('year', sale_date)::date AS sale_year,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_year
ORDER BY
    sale_year;

Advanced Groupings

For more advanced scenarios, such as grouping sales by the last day of the month or combining multiple groupings:

-- Group by the last day of the month
SELECT
    (DATE_TRUNC('month', sale_date) + INTERVAL '1 MONTH - 1 day')::date AS sale_month_end,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_month_end;

-- Group by multiple time frames
SELECT
    EXTRACT(YEAR FROM sale_date) AS sale_year,
    EXTRACT(MONTH FROM sale_date) AS sale_month,
    COUNT(*) AS total_sales,
    SUM(quantity) AS total_quantity
FROM
    sales
GROUP BY
    sale_year, sale_month;

Using Timezone-Aware Queries

If your application requires timezone-specific queries, use the AT TIME ZONE:

SELECT
    DATE_TRUNC('day', sale_date AT TIME ZONE 'UTC') AT TIME ZONE 'America/New_York' AS sale_day_ny,
    COUNT(*) AS total_sales
FROM
    sales
GROUP BY
    sale_day_ny;

Performance Considerations

Grouping by time intervals could be performance-intensive on large datasets. Indexing the columns and potentially using partitioning strategies can mitigate performance issues.

Using GROUP BY with JOINs

When combining GROUP BY with JOIN, ensure the join keys are indexed and consider using subqueries or CTEs for complex queries.

Conclusion

Grouping rows by time intervals like days, weeks, months, and years is a fundamental yet powerful capability of PostgreSQL, enabling robust data analysis and reporting. Remember to consider performance in your designs and tailor your queries to your specific needs for the best results.