PostgreSQL: Using json_agg to aggregate data

Updated: January 6, 2024 By: Guest Contributor Post a comment

Introduction

JSON aggregation in PostgreSQL is a powerful tool to structure query results in JSON format directly, often useful for API responses. With json_agg, we can summarize data as a JSON array of records.

Getting Started with json_agg

The json_agg function in PostgreSQL takes a record or a value as its argument and aggregates the results into a JSON array. Here’s a simple example:

SELECT json_agg(name) FROM customers;

Assuming customers is a table with a column name, this query would return all customer names as a JSON array.

Aggregating Multiple Columns

Not limited to a single column, json_agg can encapsulate multiple columns:

SELECT json_agg(row(name, email)) AS customer_info FROM customers;

With this, each customer’s name and email are aggregated into a JSON array of records.

Nested JSON Structures

You’re not confined to basic arrays. Nested JSON objects can be achieved like this:

SELECT json_agg(
    JSON_BUILD_OBJECT(
        'name', name,
        'email', email,
        'orders', (SELECT json_agg(product_name) FROM orders WHERE orders.customer_id = customers.id)
    )
) AS customer_data
FROM customers;

This query gathers a JSON array of customers, where each customer object includes an array of ordered products.

Joining Tables

json_agg becomes dramatically more powerful when you start joining tables:

SELECT categories.name, json_agg(products.*)
FROM categories
JOIN products ON products.category_id = categories.id
GROUP BY categories.name;

Every category will have an associated JSON array of products.

Filtering Results

You can also filter which records to include in your JSON array:

SELECT json_agg(
    JSON_BUILD_OBJECT(
        'name', name,
        'price', price
    ))
FROM products
WHERE price > 50.00;

All products costing more than $50 are included in the aggregated JSON.

Adding Order to Aggregated Data

To order the data within the json_agg, use a subquery:

SELECT json_agg(product ORDER BY price DESC) FROM (
    SELECT JSON_BUILD_OBJECT('name', name, 'price', price) AS product FROM products
) sub;

The resulting JSON array is ordered by price in descending order.

Handling NULL values

What if you have NULL values in your dataset? You can choose how to represent these using coalesce:

SELECT json_agg(
    COALESCE(name, 'Unknown name')
) FROM customers;

null names are now represented as ‘Unknown name’ in the JSON array.

Conclusion

The use of json_agg to produce structured JSON directly from SQL queries opens up great flexibility for delivering data. It is indispensable for modern, data-driven applications where PostgreSQL can hand off well-formed JSON to backend services.