Sling Academy
Home/PostgreSQL/PostgreSQL: Full text search with tsvector and tsquery

PostgreSQL: Full text search with tsvector and tsquery

Last updated: January 05, 2024

Introduction

Unleash the power of PostgreSQL’s full-text search capabilities by harnessing tsvector and tsquery, providing advanced, efficient, and versatile search functionality within your database.

What is Full Text Search?

Full text search in PostgreSQL allows you to search through a database by looking for specific words or phrases within text data. It is more sophisticated than simple pattern matching like the LIKE might offer.

Understanding tsvector and tsquery Types

In PostgreSQL, tsvector is a data type that represents a document in the form of lexemes, which are distinct lexical units like words or numbers. On the other hand, tsquery represents a text search query.

Creating a tsvector

SELECT 'A fat cat sat on a mat and ate a fat rat.'::tsvector;

This creates a tsvector of lexemes without any weights or document structure.

Formulating a tsquery

SELECT 'fat & rat'::tsquery;

A tsquery looks for the co-occurrence of both words ‘fat’ and ‘rat’ in the document.

Basic Text Search Operations

Let’s start by creating a basic example.

CREATE TABLE documents (
    id serial PRIMARY KEY,
    content text
);

INSERT INTO documents (content) VALUES
    ('A fat cat sat on a mat and ate a fat rat.'),
    ('No cats or rats were injured.');

SELECT * FROM documents WHERE to_tsvector(content) @@ to_tsquery('fat & rat');

This example demonstrates how to find documents containing both ‘fat’ and ‘rat’.

Indexing for Performance

Full text searches can be boosted with indexes.

CREATE INDEX document_idx ON documents USING gin (to_tsvector('english', content));

This index greatly improves search performance on large datasets.

Weighting and Ranking

We can assign weights to different sections of a document for relevance ranking.

SELECT
    id,
    ts_rank_cd(
        setweight(to_tsvector('english', title), 'A') ||
        setweight(to_tsvector('english', body), 'B'),
        to_tsquery('english', 'PostgreSQL & tsvector')
    ) AS rank
FROM articles
ORDER BY rank DESC;

Here, titles carry more weight than bodies for the search term ‘PostgreSQL & tsvector’.

Phrase Search and Proximity Queries

Phrase search finds exact phrases, and proximity queries are used to find words within a certain distance from each other.

SELECT * FROM documents WHERE to_tsvector('english', content) @@ phraseto_tsquery('english', 'fat rat');

This will fetch documents with the phrase ‘fat rat’.

Advanced Text Search Features

PostgreSQL full text search also supports complex boolean queries and subdocument searches.

SELECT * FROM documents WHERE to_tsvector('english', content) @@ to_tsquery('english', '(fat | feline) & !rat');

This means the content should have either ‘fat’ or ‘feline’ and should not have ‘rat’.

Handling Different Languages

Full text search can handle multiple languages and configurations. To search in Spanish, for instance, you’d initialize the tsvector using the Spanish configuration.

SELECT 'El rápido zorro marrón'::tsvector('spanish');

This vectorizes the text considering the nuances of the Spanish language.

Conclusion

Full text search with tsvector and tsquery offers robust search capabilities in PostgreSQL. Mastering these tools allows for high precision and performance when querying across large text datasets. By implementing proper indexing and understanding the versatile functions available, developers can build complex search functionalities suitable for modern applications.

Next Article: PostgreSQL: Ranking rows with RANK, DENSE_RANK, and ROW_NUMBER

Previous Article: Understanding Table Partitioning in PostgreSQL

Series: PostgreSQL Tutorials: From Basic to Advanced

PostgreSQL

You May Also Like

  • Using cursor-based pagination in PostgreSQL: Step-by-Step Guide
  • PostgreSQL: How to reset the auto-increment value of a column
  • PostgreSQL: How to add a calculated column in SELECT query
  • PostgreSQL: How to Drop FOREIGN KEY Constraints
  • Composite Indexes in PostgreSQL: Explained with Examples
  • Exploring GIN (Generalized Inverted Indexes) in PostgreSQL (with Examples)
  • Custom Collations and Types in PostgreSQL: The Complete Guide
  • Understanding Hash Indexes in PostgreSQL
  • PostgreSQL GiST (Generalized Search Tree) Indexes: Explained with Examples
  • PostgreSQL: Making Use of BRIN (Block Range Indexes)
  • PostgreSQL SP-GiST (Space-Partitioned Generalized Search Tree)
  • Bloom Filters in PostgreSQL: A Practical Guide
  • PostgreSQL: Using Partial Indexes to Improve Efficiency
  • PostgreSQL: Deleting orphan rows in one-to-many relationship
  • PostgreSQL: Implementing fixed-size tables with triggers
  • How to Use Loops in PostgreSQL (with Examples)
  • Working with Temporary Tables in PostgreSQL
  • PostgreSQL: 4 ways to store comments and nested comments
  • PostgreSQL: Saving categories and subcategories in one table