MySQL 8: Using OVER() to define a window frame in a query

Updated: January 27, 2024 By: Guest Contributor Post a comment

Overview

When it comes to complex data analysis in SQL, window functions provide a powerful toolset for working with aggregated data in a more granular yet encompassing manner. MySQL introduced window functions in version 8.0, aligning with other major database management systems like PostgreSQL and SQL Server.

In this tutorial, we will explore the fundamentals of window functions with a particular focus on the OVER() clause to define the window frame. Our journey will take us from basic usage scenarios to more advanced applications, illustrating how window functions can be an integral part of your SQL toolkit.

What Are Window Functions?

Window functions operate on a set of rows and return a value for each row based on the calculation performed over the set, or ‘window’. These functions can perform ranking, aggregation, and running calculations while still returning the detail of each row within the partition of the data.

Basic Example of OVER() Clause

SELECT employee_id,
       department,
       salary,
       AVG(salary) OVER (PARTITION BY department) AS avg_department_salary
FROM employees;

This query will return the employee id, department, salary, and the average salary of the department for each employee. Each row retains individual salary information while also showing the computed average for the department it belongs to.

Defining a Window Frame

The following clause refines how the window function will behave:

OVER (
    PARTITION BY column_name1
    ORDER BY column_name2
    ROWS BETWEEN start_point AND end_point
)

Here ‘PARTITION BY’ divides the result set into partitions to which the window function is applied. ‘ORDER BY’ dictates the order in which rows are processed. ‘ROWS BETWEEN’ defines the set of rows in a ‘frame’ to perform the function.

Ranking Functions

SELECT employee_id,
       salary,
       RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

This query returns the employee id, the salary, and a rank for every row, indicating salary order from highest to lowest within the entire dataset.

Running Total Example

SELECT order_id,
       order_date,
       order_value,
       SUM(order_value) OVER (ORDER BY order_date
                               ROWS UNBOUNDED PRECEDING) AS running_total
FROM orders;

This gives us the running total of all order values to date for each row ordered by ‘order_date’.

Advanced Window Framing

Window functions reach their full potential when you start framing data with boundaries.

SELECT employee_id,
       department,
       salary,
       AVG(salary) OVER (PARTITION BY department
                         ORDER BY salary
                         ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg_nearby_salaries
FROM employees;

This uses two frames, ‘1 PRECEDING’ and ‘1 FOLLOWING’, to compute the average (window) for the salary of each employee against the nearest colleagues in the same department.

Window Frame Exclusion

MySQL 8 allows for excluding certain rows from the window frame calculation with the keyword ‘EXCLUDE’. For instance:

SELECT employee_id,
       department,
       salary,
       SUM(salary) OVER (
            PARTITION BY department
            ORDER BY salary
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
            EXCLUDE CURRENT ROW
       ) AS cumulative_salary_except_current
FROM employees;

In this example, the cumulative sum for salaries is calculated for each department, excluding the current row’s salary from the total.

Use of Multiple Windows

SELECT employee_id,
       department,
       salary,
       AVG(salary) OVER w AS avg_department_salary,
       MAX(salary) OVER w AS max_department_salary
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY salary);

‘WINDOW’ clause is used to define a named window that can be reused, making the query cleaner and easier to read.

Combining Aggregate and Non-Aggregate Queries

SELECT employee_id,
       First_Value(employee_id) OVER w AS first_employee_in_department,
       salary,
       SUM(salary) OVER (PARTITION BY department) AS department_total_salary
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY salary);

Here, the ‘First_Value’ window function is used along with an aggregate function, both over the same window.

Conclusion

In this tutorial, we have covered the essentials of using the OVER() clause to define a window frame in a query with MySQL 8. We explored various window functions, learned how to partition data, order it, and define the scope of our calculations. Mastering window functions will enhance your capabilities to perform complex calculations and analysis within your SQL queries effectively.