
PostgreSQL with TimescaleDB: Using `pg_prometheus` for Metrics Storage

Last updated: December 21, 2024

PostgreSQL is renowned for its robust relational database capabilities. When paired with extensions like TimescaleDB, it becomes a powerhouse for time-series data processing. In this article, we will explore how pg_prometheus, a PostgreSQL extension from the TimescaleDB team, can be used for efficient metrics storage.

Understanding TimescaleDB

TimescaleDB is an open-source time-series database that runs as a PostgreSQL extension. It is optimized for time-series workloads such as high-rate data ingestion and complex analytical queries, while retaining full access to standard PostgreSQL functionality. To get started, make sure you have PostgreSQL and TimescaleDB installed on your server.

sudo apt install postgresql postgresql-contrib
sudo apt install timescaledb-postgresql-12   # match the package to your PostgreSQL major version
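TimescaleDB must also be preloaded by the server before its extension can be created. Add it to `shared_preload_libraries` in `postgresql.conf` (the path below is typical for Debian/Ubuntu; yours may differ):

```
# In postgresql.conf (e.g. /etc/postgresql/12/main/postgresql.conf):
shared_preload_libraries = 'timescaledb'
```

Restart PostgreSQL afterwards (for example, `sudo systemctl restart postgresql`); `CREATE EXTENSION timescaledb` will fail if the library is not preloaded.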

Introducing pg_prometheus

pg_prometheus is a PostgreSQL extension from the TimescaleDB team that provides native storage for Prometheus metrics, with a data model designed for high-fidelity time-series data. By using pg_prometheus, Prometheus can offload long-term metric storage to PostgreSQL, gaining both storage efficiency and SQL queryability.

Install the extension with:

CREATE EXTENSION pg_prometheus;

pg_prometheus stores Prometheus metrics in tables such as prom_data. Notably, the combination of TimescaleDB and pg_prometheus is optimized for a continuous inflow of data, making the two a natural pairing for metrics storage.

Setting Up pg_prometheus

Installing TimescaleDB with PostgreSQL is just the first step. The following example outlines steps to get pg_prometheus up and running:

Step 1: Create a Database

CREATE DATABASE my_metrics;

Step 2: Enable Extensions

\c my_metrics
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
CREATE EXTENSION IF NOT EXISTS pg_prometheus;

Step 3: Create Prometheus Data Table

Create a table for the data provided by Prometheus, then convert it into a hypertable:

CREATE TABLE prom_data (
    time        TIMESTAMPTZ      NOT NULL,
    metric_name TEXT             NOT NULL,
    value       DOUBLE PRECISION
);

SELECT create_hypertable('prom_data', 'time');

Hypertables in TimescaleDB are similar to traditional tables, but they partition data automatically based on time intervals, improving performance in querying and management.
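As a conceptual sketch (not TimescaleDB's actual implementation), chunking can be thought of as mapping each timestamp to the start of a fixed interval; TimescaleDB's default chunk interval is 7 days. The origin date below is an arbitrary assumption for illustration:

```python
from datetime import datetime, timezone, timedelta

CHUNK_INTERVAL = timedelta(days=7)  # TimescaleDB's default chunk_time_interval
EPOCH = datetime(2000, 1, 3, tzinfo=timezone.utc)  # arbitrary aligned origin (assumption)

def chunk_start(ts: datetime) -> datetime:
    """Map a timestamp to the start of the chunk that would hold it."""
    n = (ts - EPOCH) // CHUNK_INTERVAL  # whole chunks elapsed since the origin
    return EPOCH + n * CHUNK_INTERVAL

# Rows close in time land in the same chunk, so a time-range query
# only has to touch the chunks that overlap the range.
a = chunk_start(datetime(2024, 12, 10, 8, 0, tzinfo=timezone.utc))
b = chunk_start(datetime(2024, 12, 11, 23, 0, tzinfo=timezone.utc))
print(a == b)  # -> True: both timestamps fall in the same 7-day chunk
```

Because inserts of recent metrics all hit the newest chunk, its indexes stay small and hot in memory, which is where much of the ingest speedup comes from.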

Ingesting Metrics

Now that your environment is ready, you can populate the prom_data hypertable. A simple Python script can ingest data from the Prometheus HTTP API into TimescaleDB:

import requests
import psycopg2

PROMETHEUS_ENDPOINT = "http://localhost:9090/api/v1/query?query=up"
POSTGRES_CONN = "dbname='my_metrics' user='username' host='localhost' password='password'"

# Fetch the current value of the `up` metric from Prometheus
response = requests.get(PROMETHEUS_ENDPOINT)

if response.status_code == 200:
    data = response.json()

    connection = psycopg2.connect(POSTGRES_CONN)
    cursor = connection.cursor()
    # Iterate over the Prometheus result set and insert each sample
    for result in data['data']['result']:
        labels = result['metric']
        timestamp, value = result['value']  # [unix_timestamp, "value_as_string"]
        query = """
            INSERT INTO prom_data (time, metric_name, value)
            VALUES (to_timestamp(%s), %s, %s);
        """
        cursor.execute(query, (timestamp, labels['__name__'], float(value)))
    connection.commit()
    cursor.close()
    connection.close()

Adjust the connection string and endpoint to match your setup. The script queries the Prometheus HTTP API and stores the returned samples in the PostgreSQL database. Note the use of to_timestamp(): Prometheus reports times as Unix timestamps, which must be converted before insertion into a TIMESTAMPTZ column.
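For reference, the instant-query endpoint returns JSON shaped like the sample below (an illustrative, abbreviated payload; real responses usually carry more labels). The flattening step can be tested without a live Prometheus or database:

```python
import json

# Illustrative sample of a Prometheus /api/v1/query response (assumption:
# trimmed to the fields the ingestion script actually reads).
SAMPLE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "instance": "localhost:9090"},
       "value": [1734700000.0, "1"]},
      {"metric": {"__name__": "up", "instance": "localhost:9100"},
       "value": [1734700000.0, "0"]}
    ]
  }
}
""")

def to_rows(payload):
    """Flatten a Prometheus instant-query response into (ts, name, value) rows."""
    rows = []
    for result in payload["data"]["result"]:
        ts, raw = result["value"]  # [unix_timestamp, value_as_string]
        rows.append((ts, result["metric"]["__name__"], float(raw)))
    return rows

print(to_rows(SAMPLE))
# [(1734700000.0, 'up', 1.0), (1734700000.0, 'up', 0.0)]
```

Separating the parsing from the database write also makes it easy to switch to a batched insert (e.g. psycopg2's executemany) as volume grows.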

Querying and Managing Data

One advantage of pairing PostgreSQL with TimescaleDB and pg_prometheus is that metrics can be queried with standard SQL.

For example, to fetch averaged metrics over the last 24 hours:

SELECT time_bucket('1 hour', time) as hour,
       avg(value)
FROM prom_data
WHERE time > now() - interval '24 hours'
GROUP BY hour
ORDER BY hour;

This query groups the data into hourly buckets and averages each bucket, an efficient way to surface trends in metrics over any specified period.
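Conceptually, time_bucket('1 hour', time) truncates each timestamp to the start of its hour; the bucket-and-average logic of the query above is equivalent to this plain-Python sketch (with made-up sample values):

```python
from collections import defaultdict
from datetime import datetime

# (time, value) samples as they might appear in prom_data (illustrative data)
samples = [
    (datetime(2024, 12, 21, 10, 5), 1.0),
    (datetime(2024, 12, 21, 10, 40), 3.0),
    (datetime(2024, 12, 21, 11, 15), 2.0),
]

def time_bucket_1h(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour, like time_bucket('1 hour', ts)."""
    return ts.replace(minute=0, second=0, microsecond=0)

# GROUP BY hour: collect each sample's value under its bucket
buckets = defaultdict(list)
for ts, value in samples:
    buckets[time_bucket_1h(ts)].append(value)

# avg(value) per bucket, ordered by hour
hourly_avg = {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}
# 10:00 bucket averages 1.0 and 3.0; 11:00 bucket holds only 2.0
```

In the database, of course, TimescaleDB performs this aggregation close to the data and can skip whole chunks outside the WHERE clause's time range.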

Conclusion

Combining PostgreSQL with TimescaleDB and pg_prometheus can be a powerful setup for metrics storage and analysis, scaling efficiently even in high-load scenarios. This integration enables seamless metrics collection and querying, equipping teams with the tools necessary for detailed and real-time insights into their data environments.


Series: PostgreSQL Tutorials: From Basic to Advanced
