Sling Academy
Home/PostgreSQL/PostgreSQL: Ranking rows with RANK, DENSE_RANK, and ROW_NUMBER

PostgreSQL: Ranking rows with RANK, DENSE_RANK, and ROW_NUMBER

Last updated: January 05, 2024

Introduction

When working with data, it’s often necessary to order and rank rows based on specific criteria. PostgreSQL provides powerful ranking functions like RANK, DENSE_RANK, and ROW_NUMBER to make these tasks easier. In this tutorial, you’ll learn how to use these functions with progressively complex examples.

Understanding Ranking Functions

In PostgreSQL, ranking functions are a subset of window functions that assign a unique rank value to each row within a partition of a result set. Before diving into examples, let’s define each function:

  • RANK: Assigns a unique rank to each row, with gaps for ties.
  • DENSE_RANK: Similar to RANK, but without gaps in ranking numbers for tied values.
  • ROW_NUMBER: Assigns a unique number to each row, irrespective of ties, starting from 1.

Basic Usage

Let’s start with a simple example where we rank sales data:

SELECT salesperson_id,
       sales_amount,
       RANK() OVER (ORDER BY sales_amount DESC) AS sales_rank
FROM sales;

This query assigns a rank to each salesperson based on their sales_amount in descending order.

Ranking With Ties

Now let’s see how RANK handles ties:

SELECT salesperson_id,
       sales_amount,
       RANK() OVER (ORDER BY sales_amount DESC) AS sales_rank
FROM sales;

-- If two salespeople have the same sales_amount, they will receive the same rank, and the subsequent rank will be skipped.

DENSE_RANK, on the other hand, does not skip ranks:

SELECT salesperson_id,
       sales_amount,
       DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS dense_sales_rank
FROM sales;

Partitioning Data

Partitioning allows ranking within subsets of data. Here we partition by region:

SELECT region,
       salesperson_id,
       sales_amount,
       RANK() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS regional_rank
FROM sales;

This will rank salespeople within each region separately.

ROW_NUMBER and its uniqueness

ROW_NUMBER assigns a distinct number to each row, which can be useful for pagination or where a strict sequence is required:

SELECT salesperson_id,
       sales_amount,
       ROW_NUMBER() OVER (ORDER BY sales_amount DESC) AS row_number
FROM sales;

Each salesperson will have a unique row number, regardless of ties in sales_amount.

Advanced Use Cases

Let’s combine these functions with other SQL features for more advanced analyses:

WITH ranked_sales AS (
    SELECT region,
           salesperson_id,
           sales_amount,
           RANK() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS rank,
           DENSE_RANK() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS dense_rank,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS row_number
    FROM sales
)
SELECT *
FROM ranked_sales
WHERE rank <= 3;

This CTE (Common Table Expression) allows us to filter and retrieve only the top 3 ranked sales within each region.

Combining with JOINs

You can also join ranked results with other tables to enrich your data:

WITH ranked_products AS (
    SELECT product_id,
           sales_amount,
           RANK() OVER (ORDER BY sales_amount DESC) AS rank
    FROM sales
)
SELECT p.name, rp.sales_amount, rp.rank
FROM ranked_products rp
JOIN products p ON rp.product_id = p.id
WHERE rp.rank <= 5;

Here, we’re joining the top 5 ranked products with the products table to get product names alongside their sales rank.

Conclusion

RANK, DENSE_RANK, and ROW_NUMBER are incredibly versatile functions in PostgreSQL that help in ranking and distributing rows across various criteria. By incorporating these functions into your SQL queries, you can solve complex data problems with ease and deliver insightful business intelligence data.

Next Article: PostgreSQL: REGEXP_MATCHES, REGEXP_REPLACE, REGEXP_SPLIT_TO_ARRAY, REGEXP_SPLIT_TO_TABLE

Previous Article: PostgreSQL: Full text search with tsvector and tsquery

Series: PostgreSQL Tutorials: From Basic to Advanced

PostgreSQL

You May Also Like

  • PostgreSQL with TimescaleDB: Querying Time-Series Data with SQL
  • PostgreSQL Full-Text Search with Boolean Operators
  • Filtering Stop Words in PostgreSQL Full-Text Search
  • PostgreSQL command-line cheat sheet
  • How to Perform Efficient Rolling Aggregations with TimescaleDB
  • PostgreSQL with TimescaleDB: Migrating from Traditional Relational Models
  • Best Practices for Maintaining PostgreSQL and TimescaleDB Databases
  • PostgreSQL with TimescaleDB: Building a High-Performance Analytics Engine
  • Integrating PostgreSQL and TimescaleDB with Machine Learning Models
  • PostgreSQL with TimescaleDB: Implementing Temporal Data Analysis
  • Combining PostgreSQL, TimescaleDB, and Airflow for Data Workflows
  • PostgreSQL with TimescaleDB: Visualizing Real-Time Data with Superset
  • Using PostgreSQL with TimescaleDB for Energy Consumption Analysis
  • PostgreSQL with TimescaleDB: How to Query Massive Datasets Efficiently
  • Best Practices for Writing Time-Series Queries in PostgreSQL with TimescaleDB
  • PostgreSQL with TimescaleDB: Implementing Batch Data Processing
  • Using PostgreSQL with TimescaleDB for Network Traffic Analysis
  • PostgreSQL with TimescaleDB: Troubleshooting Common Performance Issues
  • Building an IoT Data Pipeline with PostgreSQL and TimescaleDB