Using RANK() and DENSE_RANK() in MySQL 8: A Practical Guide

Updated: January 27, 2024 By: Guest Contributor Post a comment

Introduction

When working with data in MySQL, ranking rows within a result set is a common task. This is especially true when dealing with analytics and leaderboards. MySQL 8 introduces two very powerful functions to make this task easier: RANK() and DENSE_RANK(). In this guide, we will delve into how to use these functions to effortlessly rank data in various situations.

Understanding RANK()

The RANK() function is a window function that assigns a unique rank to each row within a partition of a result set, with the same rank for tie values, while leaving gaps between the ranks when there are tied ranks. It’s worth noting that RANK() will order rows over a specified column or columns.

SELECT 
    employee_name, 
    salary, 
    RANK() OVER (ORDER BY salary DESC) AS salary_rank 
FROM 
    employees;

In the SQL statement above, employees are ranked based on their salary in descending order. If two employees have the same salary, they receive the same rank, and the next rank is not sequential, reflecting the gap due to the tie.

Understanding DENSE_RANK()

Alternatively, the DENSE_RANK() function operates similarly to RANK(), but without gaps in the ranking sequence. This means it will give the same rank for tie values, but unlike RANK(), the next rank will be the consecutive integer.

SELECT 
    employee_name, 
    salary, 
    DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank_dense 
FROM 
    employees;

Using the same set of employee data, when ordered by salary in descending order, the DENSE_RANK() function will keep the ranks dense, with no gaps between the rank numbers even after ties.

Practical Examples

Example 1: Ranking Sales Data

Consider a table of sales data, we can rank the sales results to find the top-performing salespersons:

SELECT 
    salesperson_id, 
    total_sales, 
    RANK() OVER (ORDER BY total_sales DESC) AS sales_rank 
FROM 
    sales_data;

Example 2: Dense Ranking With Partitions

Sometimes data needs to be grouped, or partitioned, before being ranked. DENSE_RANK() can be particularly useful in partitioned data sets:

SELECT 
    department_id, 
    employee_name, 
    salary, 
    DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_salary_rank 
FROM 
    employees;

Here, each department’s employees are ranked by their salaries independently, allowing for a comparison strictly within departments.

When to Use RANK() vs. DENSE_RANK()

Deciding whether to use RANK() or DENSE_RANK() will depend on the result you are aiming to achieve. If it is important to reflect gaps due to tied ranks to, perhaps, signal clusters in the data, then RANK() is appropriate. However, if a continuous sequence of ranks is needed, DENSE_RANK() will fulfill this requirement without indicating the presence of ties through gaps.

Dealing with Ties

When ties occur, it may be necessary to define a secondary criterion to break them. This is done by adding additional columns to the ORDER BY clause:

SELECT 
    employee_name, 
    sales_region, 
    total_sales, 
    RANK() OVER (ORDER BY total_sales DESC, sales_region) AS sales_rank 
FROM 
    sales_data;

In this query, should there be a tie in total sales, the secondary column, sales_region, is used to break the tie.

Advanced Applications

RANK() and DENSE_RANK() functions can be a part of more complex analytical queries. They can be used alongside other window functions and aggregates to produce deep insights into your data.

Example:

SELECT 
    department_id,
    employee_name,
    salary,

    -- RANK() function to rank employees by salary within each department
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank,

    -- DENSE_RANK() function to provide a dense rank to each salary
    DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dense_salary_rank,

    -- AVG() as a window function to calculate average salary within each department
    AVG(salary) OVER (PARTITION BY department_id) AS avg_department_salary,

    -- SUM() as a window function to calculate total sales in each department
    SUM(sales) OVER (PARTITION BY department_id) AS total_department_sales

FROM 
    employees;

Code explained:

  • department_id, employee_name, salary: Basic columns selected from the employees table.
  • RANK() Function: Used to assign a rank to each employee based on their salary within their respective department. The highest salary gets the rank 1 in each department.
  • DENSE_RANK() Function: Similar to RANK(), but DENSE_RANK() does not create gaps in the ranking sequence when there are equal rank values.
  • AVG(salary) OVER: This calculates the average salary within each department, demonstrating how aggregate functions can be used as window functions.
  • SUM(sales) OVER: Calculates the total sales for each department.

This query is an example of how window functions like RANK() and DENSE_RANK() can be used in conjunction with other SQL features to perform complex data analysis and generate comprehensive insights from the data.

Note: Remember that the support for these functions depends heavily on the version of MySQL you are running. They are available from MySQL 8.0 onwards. Always ensure your database version supports these functions before using them.

Conclusion

Both RANK() and DENSE_RANK() are indispensable tools in a SQL developer’s toolkit when it comes to a ranking of results in MySQL. These functions not only ease the process of generating leaderboards and analytical data, but they also open up new possibilities for data comparison and insights. Mastery of these functionalities can significantly enhance the reporting and analytic capabilities of your SQL queries.