PostgreSQL: 4 ways to store comments and nested comments

Updated: January 25, 2024 By: Guest Contributor Post a comment

Introduction

Storing and retrieving nested comments are common requirements for many web applications. With PostgreSQL, there are several methods to model hierarchical or nested data. This guide will explore various solutions, and describe their implementation, along with the trade-offs involved.

Approach #1 – Adjacency List

The adjacency list model represents each comment as a row in a table, containing a reference to its parent comment. It’s simple and widely used.

  1. Create a table with a self-referencing foreign key.
  2. Use recursive queries for retrieval.
  3. Add indexes on the parent comment ID for optimization.

Example:

-- Code to create comments table
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    parent_id INT REFERENCES comments(id),
    content TEXT
);

-- Sample insert
INSERT INTO comments(parent_id, content) VALUES
(1, 'This is a nested comment.');

Notes:

  • Pros: Simplicity, easy inserts, and updates.
  • Cons: Recursive queries can be expensive, not the best read performance for deep hierarchies.
  • Performance: Write queries are fast, read queries may slow down as the hierarchy depth increases. Indexing helps read performance.

Approach #2 – Materialized Path

This approach encodes the path to each comment in a single column, using a predefined delimiter.

  1. Create a table with a ‘path’ column to store the concatenated IDs of parent comments.
  2. Use LIKE queries to find children comments.
  3. Add an index on the path column for faster searches.

Example:

-- Code to create a materialized path table
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    path TEXT,
    content TEXT
);

-- Sample insert with path value
INSERT INTO comments(path, content) VALUES
('1.3.5', 'This is a nested comment at level 3.');

Notes:

  • Pros: Retrieval of full comment trees is efficient.
  • Cons: Path update is costly if hierarchy changes, delimiter must be chosen carefully.
  • Performance: Read performance is good; updating the path to reflect hierarchy changes is complex.

Approach #3 – Nested Set Model

In this model, each comment is given a left and right value that represents its place in a nested set hierarchy.

  1. Create a table with ‘left’ and ‘right’ columns.
  2. Insert or update comments requires re-calculating the left and right values for all affected nodes.
  3. Use left/right values to query the subtree.

Example:

-- Code to create table with nested set columns
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    lft INT,
    rgt INT,
    content TEXT
);

-- Sample nested set calculation and inserts are complex and omitted for brevity.

Notes:

  • Pros: Retrieval is efficient, even for complex trees.
  • Cons: Inserts and deletes are complex, involving updates to multiple rows.
  • Performance: Reading is very fast, writes can be slow and complex.

Approach #4 – Closure Table

A closure table stores every path in the comment tree as a separate record, providing efficient and flexible querying.

  1. Create a separate table to store ancestor and descendant relations among comments.
  2. On insert, add paths to the closure table for the new comment.
  3. Use JOIN queries to retrieve comment trees.

Example:

-- Code to create comments and closure table
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    content TEXT
);

CREATE TABLE comments_closure (
    ancestor_id INT REFERENCES comments(id),
    descendant_id INT REFERENCES comments(id)
);

-- Sample inserts
WITH new_comment AS (
    INSERT INTO comments(content) VALUES('This is a new comment') RETURNING id
)
INSERT INTO comments_closure(ancestor_id, descendant_id)
SELECT ancestor_id, id FROM new_comment, comments_closure
WHERE descendant_id = parent_id;

Notes:

  • Pros: Great read performance, flexible queries.
  • Cons: Extra storage for closure table, more complex writes.
  • Performance: Excellent read performance, writes are more involved but manageable.

Conclusion

Choosing a method for storing nested comments in PostgreSQL depends on the application’s requirements for read and write operations, the expected depth of the comment trees, and the complexity the developers are willing to manage. For frequent reads and infrequent writes, nested set and closure table models offer efficient querying. For applications with simpler nested comment structures, the adjacency list and materialized path are speedy and straightforward to implement. Understanding the trade-offs of each strategy will guide in selecting the most suitable solution.