MySQL 8: Find the sum/average of values in each group

Updated: January 26, 2024 By: Guest Contributor Post a comment

Introduction

MySQL, as a leading open-source relational database management system (RDBMS), is a cornerstone of modern web development and data analysis. Grouping data and calculating aggregate functions such as the sum and the average within each group is a common requirement for data reporting and analysis. In this tutorial, we will explore how to apply these aggregation functions in MySQL 8 using the SUM() and AVG() functions with the GROUP BY clause. By the end of this guide, you will be equipped to efficiently summarize and interpret grouped data in your MySQL databases.

Basic Group Aggregations with SUM and AVG

To get started, let’s consider a scenario where we have a table named sales_data with the columns id, product_id, quantity_sold, and date. We want to find the total quantity sold of each product. We can achieve this by grouping the rows based on the product_id and then calculating the sum:

SELECT product_id,
       SUM(quantity_sold) AS total_quantity
FROM sales_data
GROUP BY product_id;

This query will group rows with the same product_id and calculate the sum of quantity_sold for each group. The result will look something like this:

product_id | total_quantity
------------+---------------
      1     |       150
      2     |        90

Similarly, to calculate the average quantity sold by product, we can use the following query:

SELECT product_id,
       AVG(quantity_sold) AS average_quantity
FROM sales_data
GROUP BY product_id;

Results may appear as follows:

product_id | average_quantity
------------+------------------
      1     |      50
      2     |      45

Incorporating WHERE Clause Filters

Sometimes, you may want to calculate aggregates only for certain rows that meet specific criteria. You can incorporate a WHERE clause in the query to filter rows before grouping:

SELECT product_id,
       SUM(quantity_sold) AS total_quantity
FROM sales_data
WHERE date >= '2023-01-01'
GROUP BY product_id;

This modification restricts the sum calculation to rows where the date is on or after January 1, 2023.

Advanced Grouping with ROLLUP

The ROLLUP modifier allows us to perform multiple levels of grouping, providing subtotals and a grand total within the result set:

SELECT product_id,
       SUM(quantity_sold) AS total_quantity
FROM sales_data
GROUP BY product_id WITH ROLLUP;

The resulting output includes an additional row representing the grand total:

product_id | total_quantity
------------+---------------
      1     |       150
      2     |        90
   NULL     |       240

Using JOINs with Group Aggregation

Complex reporting often requires joining multiple tables before computing aggregates. For example, consider a second table products with columns: id, name, and price. To calculate the total sales revenue for each product, you’d need to join the sales_data and products tables:

SELECT p.name,
       SUM(s.quantity_sold * p.price) AS total_revenue
FROM sales_data s
JOIN products p ON s.product_id = p.id
GROUP BY p.name;

This query calculates the revenue by multiplying the quantity sold by the product price and then grouping by product name.

Advanced Functionality: Filtering Group Aggregates with HAVING

The HAVING clause in MySQL is used to filter groups based on a condition related to an aggregate function. For example, to find products where the average quantity sold exceeds a certain threshold:

SELECT product_id,
       AVG(quantity_sold) AS average_quantity
FROM sales_data
GROUP BY product_id
HAVING AVG(quantity_sold) > 50;

This query only outputs groups that have an average quantity sold greater than 50.

Performance Tips

Computation of aggregates can be resource-intensive on large datasets. To improve performance, consider the following:

  • Indexing the columns that are frequently used in GROUP BY clauses.
  • Minimize the use of JOIN operations by denormalizing data where appropriate.
  • Analyze and fine-tune your queries using the EXPLAIN statement.

Conclusion

In conclusion, MySQL’s SUM() and AVG() functions paired with GROUP BY are powerful tools for querying grouped data. Whether you’re aggregating sales numbers, calculating averages, or summarizing large datasets, MySQL provides a robust set of functionalities to help you extract meaningful insights from your data.