PostgreSQL: FIRST_VALUE, LAST_VALUE, and NTH_VALUE

Updated: January 6, 2024 By: Guest Contributor Post a comment

Introduction

Exploring the intricacies of window functions in PostgreSQL, this tutorial delves into FIRST_VALUE, LAST_VALUE, and NTH_VALUE, detailing their uses and advantages in database querying with practical examples.

Understanding Window Functions

Window functions in PostgreSQL provide a way to perform calculations across rows that are related to the current row. These are particularly useful when it comes to analyzing data sets where the context of a row in relation to others in a set matters. Unlike aggregate functions, window functions do not cause rows to become grouped into a single output row — they maintain separate rows. Before diving into specifics, let’s get familiar with the basic syntax of a window function:

SELECT col1, col2, ..., window_function(colx) OVER ( [PARTITION BY col] [ORDER BY col] ) FROM table;

FIRST_VALUE

FIRST_VALUE is a function that allows you to return the first value in an ordered set of values. This can be useful for returning values from the beginning of a partition or a dataset.

Basic Usage of FIRST_VALUE

Let’s start with a simple example: finding the first salary in an employees table when ordered by hire date.

SELECT
  name,
  hire_date,
  salary,
  FIRST_VALUE(salary) OVER (ORDER BY hire_date) AS first_salary
FROM
  employees;

In this query, the FIRST_VALUE is used to fetch the earliest salary recorded in the hiring order. Moving on, these functions can be far more powerful when used with the PARTITION BY clause:

SELECT
  department,
  name,
  salary,
  FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY hire_date) AS first_department_salary
FROM
  employees;

Here, the first salary of each department is retrieved, effectively giving us the ability to compare an individual’s salary with the first salary in their department.

LAST_VALUE

The LAST_VALUE function retrieves the last value in an ordered set. Unlike FIRST_VALUE, LAST_VALUE can be a bit trickier to use correctly because of its sensitivity to the window frame’s specification.

Using LAST_VALUE

An initial attempt to get the most recent salary in the employees’ table might use a simple LAST_VALUE function:

SELECT
  name,
  hire_date,
  salary,
  LAST_VALUE(salary) OVER (ORDER BY hire_date) AS latest_salary
FROM
  employees;

This will not produce the expected result because, by default, the frame end is set to CURRENT ROW. To fetch the actual latest salary, the correct syntax requires specifying the frame:

SELECT
  name,
  hire_date,
  salary,
  LAST_VALUE(salary) OVER (ORDER BY hire_date RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS latest_salary
FROM
  employees;

This corrects the issue and we would obtain the most recent salary regardless of the hire_date record position.

NTH_VALUE

NTH_VALUE is a function that returns the value of the specified column at the nth row of the window frame (partition). It’s an extension of the concept applied in FIRST_VALUE and LAST_VALUE, granting specific offset selection.

Basic NTH_VALUE Example

To get the third highest salary from a set of records ordered by the salary amount, you’d write:

SELECT
  name,
  salary,
  NTH_VALUE(salary, 3) OVER (ORDER BY salary DESC) AS third_highest_salary
FROM
  employees;

But be cautious of framing with NTH_VALUE too. Just like with LAST_VALUE, you must define the frame to ensure that the nth value can be reached from any row:

SELECT
  name,
  salary,
  NTH_VALUE(salary, 3) OVER (ORDER BY salary DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS third_highest_salary
FROM
  employees;

Advanced Example: Using FIRST_VALUE, LAST_VALUE, and NTH_VALUE

Imagine a scenario where you need to report not just the first and last salary, but also the third highest salary in each department of a company. Combining all three functions, we can craft the following query:

SELECT
  department,
  name,
  salary,
  FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY hire_date) AS first_department_salary,
  LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY hire_date RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_department_salary,
  NTH_VALUE(salary, 3) OVER (PARTITION BY department ORDER BY salary DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS third_highest_department_salary
FROM
  employees;

This complex query gives us a comprehensive view of the salary structure within each department, using the power of window functions.

Conclusion

In conclusion, FIRST_VALUE, LAST_VALUE, and NTH_VALUE are extremely useful PostgreSQL window functions for accessing specific data point within partitions and ordered sets. Through careful usage and consideration of the window frame definition, one can extract valuable insights into data trends and anomalies.