PostgreSQL: Combine results with UNION, INTERSECT, and EXCEPT

Updated: January 5, 2024 By: Guest Contributor Post a comment

Introduction

Within SQL, specifically PostgreSQL, combining result sets allows for more efficient data analysis and manipulation. Understanding how to adeptly use UNION, INTERSECT, and EXCEPT can enhance your query capabilities, leading to more insightful and simplified results.

Understanding UNION

The UNION operator is used to combine the result sets of two or more SELECT statements. It removes duplicate rows between the sets. The basic structure is:

SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;

Here’s a simple example. Suppose you have two tables, employees and contractors, both containing a column name.

SELECT name FROM employees
UNION
SELECT name FROM contractors;

Remember, column names and data types must line up across the individual SELECT statements.

UNION ALL

If you wish to keep all duplicates, you can use UNION ALL like so:

SELECT name FROM employees
UNION ALL
SELECT name FROM contractors;

Diving into INTERSECT

The INTERSECT operator produces the set of records that are common in both SELECT statements. Here’s the syntax:

SELECT column_name(s) FROM table1
INTERSECT
SELECT column_name(s) FROM table2;

Using our previous example of employees and contractors:

SELECT name FROM employees
INTERSECT
SELECT name FROM contractors;

This will return names that appear in both the employees and contractors tables.

Exploring EXCEPT

The EXCEPT operator retrieves records from the first SELECT statement that are not present in the second SELECT statement. Here it is in action:

SELECT name FROM employees
EXCEPT
SELECT name FROM contractors;

This query returns names from the employees table that are not found in the contractors table.

Combining Operators

It’s also possible to combine these operators within a single query to perform more complex data retrievals:

SELECT name FROM employees
UNION
SELECT name FROM contractors
EXCEPT
SELECT name FROM former_employees;

This query combines three result sets. It fetches all unique names from employees and contractors, then removes any names found in the former_employees table.

Using WITH Clauses

For more sophisticated queries, you can use WITH clause (Common Table Expressions) to simplify your operations:

WITH employee_names AS (
    SELECT name FROM employees
),
contractor_names AS (
    SELECT name FROM contractors
)
SELECT name FROM employee_names
UNION ALL
SELECT name FROM contractor_names;

This example creates two CTEs for employees and contractors names, and then combines them with a UNION ALL. This can clean up complex queries and improve readability.

Sorting and Filtering

After combining result sets, the ORDER BY and WHERE clauses can be used for sorting and filtering. Keep in mind that these should come after all combinations are made:

SELECT name FROM employees
UNION
SELECT name FROM contractors
ORDER BY name;

You may also nest combinations within a subquery when applying additional WHERE filters:

SELECT * FROM (
  SELECT name FROM employees
  UNION
  SELECT name FROM contractors
) AS combined_names
WHERE combined_names.name LIKE 'J%';

This filters the combined list for names starting with ‘J’.

Conclusion

Mastering the use of UNION, INTERSECT, and EXCEPT in PostgreSQL empowers you to create versatile and powerful queries. They help in combining, comparing, and contrasting datasets efficiently, enabling advanced data analysis.