5 Ways to Monitor Your PostgreSQL Database

Importing dashboard from grafana

Written by Semab Tariq

Monitoring your PostgreSQL database is crucial for ensuring its optimal performance and reliability. By monitoring key parameters such as query execution times, resource utilization, and server availability, you can identify potential issues early and take proactive measures to maintain efficiency. 

In this article, we'll delve into monitoring PostgreSQL using built-in tools like pg_stat_activity, pg_stat_statements, and pg_stat_all_tables and Timescale’s own query monitoring tool, Insights. Additionally, we'll explore integrating the PostgreSQL exporter, Prometheus, and Grafana to create a dashboard for monitoring server health.

Want to learn more about how to build your application using PostgreSQL/Timescale? Be sure to check these articles—we’re taking you from data migration to monitoring your database.

PostgreSQL Database Monitoring With pg_stat_statements

pg_stat_statements is a built-in PostgreSQL extension that keeps track of SQL statements executed by the database server. It records details about query execution counts, total execution time, and I/O-related details.

What are long-running queries?

Long-running queries are simply queries that take a significant amount of time to execute, potentially causing performance issues or delays in database operations. These queries may consume excessive resources, such as CPU or memory, and can impact the overall responsiveness of the database system. Identifying and optimizing long-running queries is essential for maintaining the efficiency and reliability of the PostgreSQL database.

How to identify the long-running queries

You can use the following query to extract the top three longest-running queries in your PostgreSQL database:

SELECT   userid :: regrole,   dbid,   mean_exec_time / 1000 as mean_exec_time_secs,   max_exec_time / 1000 as max_exec_time_secs,   min_exec_time / 1000 as min_exec_time_secs,   stddev_exec_time,   calls,   query from   pg_stat_statements order by   mean_exec_time DESC limit 3; userid  | dbid |  mean_exec_time_secs  | max_exec_time_secs | min_exec_time_secs |   stddev_exec_time   | calls |                                                            query                                                            ----------------------------------------------------------------------------------------------------  semab    | 5    | 0.0018055465678913738 |        0.006023764 |        0.001152825 |   0.7903876933367537 |  3756 | SELECT name, setting, COALESCE(unit, $1), short_desc, vartype FROM pg_settings WHERE vartype IN ($2, $3, $4) AND name != $5  postgres | 5    |           0.001178044 |        0.001178044 |        0.001178044 |                    0 |     1 | SELECT pg_stat_statements_reset()  semab    | 5    | 0.0007018039854898849 |        0.001922814 |        0.000597347 | 0.034553571651097015 | 15024 | SELECT pg_database_size($1) (3 rows)

userid :: regrole: This syntax converts the userid column to the regrole data type, representing the executing role (user) of the SQL statement. Thus, it displays the actual name of the user instead of their userid, enhancing readability.

dbid: Represents the database ID where the SQL statement was executed.

mean_exec_time / 1000 as mean_exec_time_secs: it calculates the mean execution time(average execution time) of the SQL statement in seconds.

max_exec_time / 1000 as max_exec_time_secs: it calculates the maximum execution time of the SQL statement in seconds.

min_exec_time / 1000 as min_exec_time_secs: it calculates the minimum execution time of the SQL statement in seconds.

stddev_exec_time: It measures the amount of variation or dispersion in the execution times of the query. A higher stddev_exec_time indicates more variability in the query execution times, while a lower value suggests more consistency. This metric helps assess the stability and predictability of query performance.

calls: it indicates the number of times the SQL statement has been executed.

query: it represents the SQL query statement itself.

ORDER BY: it orders the result set based on the mean_exec_time column in descending order, meaning SQL statements with the longest mean execution times will appear first in the result set.

LIMIT 3: it limits the number of rows returned to three, ensuring only the top three SQL statements with the longest mean execution times are included in the result set.

What are I/O-intensive queries?

I/O-intensive queries are database operations that heavily rely on input/output operations, typically involving frequent reads or writes to disk. These queries often require significant disk access, leading to high disk I/O usage and potential performance bottlenecks.

How to identify I/O-intensive queries?

You can use the following query to identify queries that frequently access disk resources, indicating potential disk-intensive operations:

SELECT    mean_exec_time / 1000 as mean_exec_time_secs,    calls,    rows,    shared_blks_hit,    shared_blks_read,    shared_blks_hit /(shared_blks_hit + shared_blks_read):: NUMERIC * 100 as hit_ratio,    (blk_read_time + blk_write_time)/calls as average_io_time_ms,    query  FROM    pg_stat_statements  where    shared_blks_hit > 0  ORDER BY    (blk_read_time + blk_write_time)/calls DESC;

shared_blks_hit: it retrieves the amount of data read from the shared buffer cache.

shared_blks_read: it retrieves the amount of data read from the disk.

shared_blks_hit /(shared_blks_hit + shared_blks_read):: NUMERIC * 100 as hit_ratio: It calculates the hit ratio, which represents the percentage of shared blocks found in the buffer cache compared to the total number of shared blocks accessed (both from cache and disk). You can calculate it as (shared_blks_hit / (shared_blks_hit + shared_blks_read)) * 100.

(blk_read_time + blk_write_time)/calls as average_io_time_ms: it calculates the average I/O time per call in milliseconds, which is the sum of block read and block write time divided by the number of calls.

WHERE: the filter shared_blks_hit > 0 retrieves the rows where the number of shared blocks in the buffer cache is greater than zero (0), focusing only on statements with at least one shared block hit.

ORDER BY: this filter (blk_read_time + blk_write_time)/calls DESC sorts the result in descending order based on the average I/O time per call.

PostgreSQL Database Monitoring With pg_stat_all_tables

pg_stat_all_tables is a system view in PostgreSQL that provides statistical information about all tables in the current database. It includes various metrics related to table access and usage, such as the number of sequential and index scans performed on each table, the number of tuples inserted, updated, and deleted, as well as information about vacuum and analysis operations.

What is a sequential scan?

A sequential scan refers to the process of scanning all the rows in a table sequentially, usually without using an index. It reads each row one by one from start to finish, which can be less efficient for large tables compared to using an index to access specific rows directly.

How to identify tables with the highest frequency of sequential scans

You can use the following query to retrieve the top three tables with the most sequential scans:

SELECT    schemaname,    relname,    Seq_scan,    idx_scan seq_tup_read,    seq_tup_read / seq_scan as avg_seq_read  FROM    pg_stat_all_tables  WHERE    seq_scan > 0 AND schemaname not in (‘pg_catalog’,’information_schema’)  ORDER BY    Avg_seq_read DESC LIMIT 3;

schemaname: the name of the schema containing the table.

relname: the name of the table.

seq_scan: the number of sequential scans initiated on the table.

idx_scan: the number of index scans initiated on the table.

seq_tup_read: the number of live rows fetched by sequential scans.

seq_tup_read / seq_scan as avg_seq_read: it calculates the average number of rows read per sequential scan.

seq_scan > 0: it selects only tables that have been sequentially scanned at least once.

schemaname not in ('pg_catalog', 'information_schema'): This clause in the SQL query filters out tables from the pg_catalog and information_schema schemas. These schemas contain system tables and views that are automatically created by PostgreSQL and are not typically user-created or user-managed.

ORDER BY: it orders the result set based on the calculated avg_seq_read column in descending order, meaning tables with the highest average sequential read rate will appear first in the result set.

LIMIT 10: it limits the number of rows returned to 10, ensuring only the top 10 tables with the highest average sequential read rates are included in the result set.

What are infrequently accessed tables?

Infrequently accessed tables in a database are those that are not frequently queried or manipulated. These tables typically have low activity and are accessed less frequently compared to other tables in the database. They may contain historical data, archival data, or data that is rarely used in day-to-day operations.

How to identify infrequently accessed tables?

The following query retrieves tables with low total scan counts (both sequential and index scans combined):

SELECT   schemaname,   relname,   seq_scan,   idx_scan,   (COALESCE(seq_scan, 0) + COALESCE(idx_scan, 0)) as total_scans_performed FROM   pg_stat_all_tables WHERE   (COALESCE(seq_scan, 0) + COALESCE(idx_scan, 0)) < 10 AND schemaname not in (‘pg_catalog’, ‘information_schema’) ORDER BY   5 DESC;

seq_scan: this column represents the number of sequential scans performed on the table.

idx_scan: this column represents the number of index scans performed on the table.

(COALESCE(seq_scan, 0) + COALESCE(idx_scan, 0)) as total_scans_performed: This expression calculates the total number of scans performed on the table by summing up the sequential scans and index scans. COALESCE function is used to handle NULL values by replacing them with 0.

(COALESCE(seq_scan, 0) + COALESCE(idx_scan, 0)) < 10: This condition filters the tables based on the total number of scans performed on each table. It selects tables where the total number of scans (sequential scans + index scans) is less than 10.

total_scans_performed DESC: This clause orders the result set by the total number of scans performed on each table in descending order. Tables with the highest total number of scans appear first in the result set.

PostgreSQL Database Monitoring With pg_stat_activity

pg_stat_activity is a system view in PostgreSQL that provides information about the current activity of database connections. It includes one row per server process, showing details such as the username of the connected user, the database being accessed, the state of the connection (idle, active, waiting, etc.), the current query being executed, and more.

Finding long-running queries by time

The following SQL query retrieves information about currently running (active) database sessions in PostgreSQL that have been executing for longer than one minute.

SELECT    datname AS database_name,   usename AS user_name,    application_name,    client_addr AS client_address,    client_hostname,    query AS current_query,    state,    query_start,    now() - query_start AS query_duration  FROM    pg_stat_activity  WHERE    state = 'active' AND now() - query_start > INTERVAL '10 sec'  ORDER BY    query_start DESC;

now() - query_start AS query_duration: it calculates the duration of the query execution by subtracting the start time from the current time.

state = 'active': it filters the results to include only active queries currently running.

now() - query_start > INTERVAL '10 sec': it filters the results to include only queries that have been running for more than 10 seconds.

query_start DESC: it orders the results based on the query start time in descending order, showing the most recently started queries first.

Integrating PostgreSQL With Grafana and Prometheus for Database Monitoring

The PostgreSQL exporter (postgres_exporter) is a specialized tool designed to extract metrics and statistics from PostgreSQL database servers. It collects various performance-related data points and makes them available for monitoring and analysis through monitoring systems like Prometheus. 

An open-source application used for event monitoring and alerting, Prometheus records metrics in a time-series database built using an HTTP pull model with flexible queries and real-time alerting.

Grafana is a versatile open-source analytics and interactive visualization tool accessible across multiple platforms. It enables the creation of charts, graphs, and alerts on the web, offering comprehensive insights when linked with compatible data sources. 

For this integration, we are using an EC2 Ubuntu 22.04 instance where we are first setting up PostgreSQL and Postgres exporter and will then set up Prometheus and Grafana.

Ports information

PostgreSQL: 5432

Postgres exporter: 9100

Prometheus: 9090

Grafana: 3000

Set up PostgreSQL and the postgres_exporter

Install PostgreSQL

Install the latest version of PostgreSQL:

sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - sudo apt-get update sudo apt-get -y install postgresql

Configure PostgreSQL

Modify the authentication method in the pg_hba.conf file to use scram-sha-256 instead of peer for local connections. This will allow you to use password-based authentication to the server.

Local    all    all    scram-sha-256

Restart the server:

sudo systemctl restart postgresql

Create a database user for monitoring and set a password for this user:

CREATE USER monitoring_user WITH PASSWORD test@1234 SUPERUSER;

Install the postgres_exporter

Download the latest release of the postgres_exporter binary:

wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.14.0/postgres_exporter-0.14.0.linux-amd64.tar.gz

Unzip the binary:

tar xzf postgres_exporter-0.14.0.linux-amd64.tar.gz

Move postgres_exporter binary to /usr/local/bin:

sudo cp postgres_exporter /usr/local/bin

Configure the postgres_exporter

Create a new directory under /opt to store connection information for the PostgreSQL server:

mkdir /opt/postgres_exporter echo DATA_SOURCE_NAME="postgresql://monitoring_user:test@1234@localhost:5432/?sslmode=disable" > /opt/postgres_exporter/postgres_exporter.env

Create a service file for the postgres_exporter: echo '[Unit] Description=Postgres exporter for Prometheus Wants=network-online.target After=network-online.target

[Service] User=postgres Group=postgres WorkingDirectory=/opt/postgres_exporter EnvironmentFile=/opt/postgres_exporter/postgres_exporter.env ExecStart=/usr/local/bin/postgres_exporter --web.listen-address=localhost:9100 --web.telemetry-path=/metrics Restart=always

[Install]| WantedBy=multi-user.target' >> /etc/systemd/system/postgres_exporter.service

Since we created a new service file, it is better to reload the demon once so it recognizes the new file:

sudo systemctl daemon-reload

Start and enable the postgres_exporter service:

sudo systemctl start postgres_exporter sudo systemctl enable postgres_exporter

Check the service status: 

sudo systemctl status postgres_exporter postgres_exporter.service - Prometheus exporter for Postgresql      Loaded: loaded (/etc/systemd/system/postgres_exporter.service; enabled; vendor preset: enabled)      Active: active (running) since Tue 2024-03-05 13:52:56 UTC; 2h 15min ago    Main PID: 9438 (postgres_export)       Tasks: 6 (limit: 9498)

Verify the postgres_exporter setup from the browser:

image

Set up the Prometheus and Grafana server

Install Prometheus

Create a system group named Prometheus:

sudo groupadd --system prometheus

Create a system user named Prometheus in a Prometheus group without an interactive login: 

sudo useradd -s /sbin/nologin --system -g prometheus prometheus

Creating the required directory structure:

sudo mkdir /etc/prometheus sudo mkdir /var/lib/prometheus

Download the Prometheus source:

wget https://github.com/prometheus/prometheus/releases/download/v2.43.0/prometheus-2.43.0.linux-amd64.tar.gz

Decompress the source code:

tar vxf prometheus*.tar.gz

Set up proper permissions for the installation files:

cd prometheus*/ sudo mv prometheus /usr/local/bin sudo mv promtool /usr/local/bin sudo chown prometheus:prometheus /usr/local/bin/prometheus sudo chown prometheus:prometheus /usr/local/bin/promtool sudo mv consoles /etc/prometheus sudo mv console_libraries /etc/prometheus sudo mv prometheus.yml /etc/prometheus sudo chown prometheus:prometheus /etc/prometheus sudo chown -R prometheus:prometheus /etc/prometheus/consoles sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries sudo chown -R prometheus:prometheus /var/lib/prometheus

Configure Prometheus

Add the PostgreSQL Exporter configurations inside the prometheus.yml file at the following location:

 /etc/prometheus/prometheus.yml scrape_configs:   - job_name: "Postgres exporter"     scrape_interval: 5s     static_configs:       - targets: [localhost:9100]

Create a new service file for Prometheus at the following location: /etc/systemd/system/prometheus.service

[Unit] Description=Prometheus Wants=network-online.target After=network-online.target

[Service] User=prometheus Group=prometheus Type=simple

ExecStart=/usr/local/bin/prometheus \     --config.file /etc/prometheus/prometheus.yml \     --storage.tsdb.path /var/lib/prometheus/ \     --web.console.templates=/etc/prometheus/consoles \     --web.console.libraries=/etc/prometheus/console_libraries

[Install] WantedBy=multi-user.target

Reload systemd manager:

sudo systemctl daemon-reload

Start the Prometheus service:

sudo systemctl enable prometheus sudo systemctl start prometheus

Verify the Prometheus service:

sudo systemctl status prometheus prometheus.service - Prometheus      Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled)

     Active: active (running) since Tue 2024-03-05 13:53:51 UTC; 2h 27min ago    Main PID: 9470 (prometheus)       Tasks: 8 (limit: 9498)

Verify the Prometheus setup from the browser:

image

Install Grafana

Install the pre-required packages:

sudo apt install -y apt-transport-https software-properties-common 

Add the Grafana GPG key:

sudo mkdir -p /etc/apt/keyrings/ wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null

Add Grafana’s APT repository:

echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list sudo apt update

Install the Grafana package:

sudo apt install grafana

Start the Grafana service:

sudo systemctl start grafana-server sudo systemctl enable grafana-server

Verify the Grafana service:

sudo systemctl status grafana-server ● grafana-server.service - Grafana instance     Loaded: loaded (/lib/systemd/system/grafana-server.service; enabled; vendor preset: enabled)      Active: active (running) since Tue 2024-03-12 05:45:23 UTC; 1h 29min ago        Docs: http://docs.grafana.org    Main PID: 719 (grafana)

Verify your Grafana setup from the browser:

image

Note: if you see the login screen, log in by specifying the following credentials.

Username: admin

Password: admin

Integrate Prometheus server with Grafana

Add new data source for Prometheus:

image

Configure the data source, click Save, and test.

image

Create a new dashboard

Find a PostgreSQL Grafana dashboard online and copy the URL of the dashboard that suits your preferences. Paste the URL and select Load.

image

Fill out the required fields. Select Prometheus from the data source and click Import.

image

Your dashboard is ready!

image

You can see the following properties straight away: average CPU usage, average memory usage, and a couple of important postgresql.conf properties:

  • Shared buffers

  • Effective cache 

  • Maintenance work memory

  • Work memory, etc.  

If you scroll down, you can see a more detailed version of database stats, including active/idle sessions, cache hit ratio, buffers, etc.

image

PostgreSQL Database Monitoring With Metrics Dashboard and Insights

If you want to save yourself all of this work when it comes to database monitoring, Timescale (built on PostgreSQL) has a metrics dashboard that significantly improves the monitoring capabilities of your database services by providing detailed, service-level insights such as CPU, memory, and query-level statistics. 

It offers support for various time ranges to view metrics, including the last hour, 24 hours, seven days, and 30 days, each with specific granularity. Furthermore, you can continuously monitor the health and resource consumption of your database services, enabling proactive management and optimization of database performance. 

This feature facilitates the identification of trends, diagnosis of issues, and optimization of configurations to maintain optimal service health. It also includes pg_stat_statements by default.

However, we believe that pg_stat_statements has some limitations. So, to overcome them and provide a more in-depth query monitoring experience, you can utilize Insights, a tool we developed last year. Insights empowers you to closely examine the queries executed in your database within a specific timeframe, providing valuable statistics on timing/latency, memory usage, and more. 

Upon identifying specific queries for further investigation, you can access additional information through a drill-down view to better understand the situation, such as whether your database performance has been deteriorating or improving (e.g., due to adding a new index).

image

Insights offers an unmatched level of granularity in database observability, complementing and enhancing the monitoring and optimization capabilities provided by pg_stat_statements for PostgreSQL. This combination enables a comprehensive approach to query optimization and performance monitoring within the Postgres environment.

Another amazing feature of your cloud service compared to running it yourself is the Insights section [in the Timescale UI], which gives you insights into what queries are impacting performance. This is a super valuable tool for us because we are currently monitoring at the application level, trying to measure some things there. However, seeing it on the database level, with the actual number of executions and total rows scanned, is super valuable for optimizing our queries. Since we have a super customizable application where customers can misconfigure some things, this tool is crucial to spot those changes and optimize where needed.”

Lukas Klein, CTO at Datacake

Summing It Up

Effective database monitoring is crucial for maintaining optimal performance and reliability. By leveraging the right tools and techniques, developers can proactively identify and address issues, ensuring uninterrupted operations and maximizing the value of their database investments. 

We walked you through five different ways of monitoring your PostgreSQL database. For easy query monitoring, try Timescale’s Insights: not only will you quickly figure out which queries are slowing down your performance, but you will also be able to further accelerate them with Timescale’s automatic partitioning, columnar compression, and automatically updated materialized views. Try Timescale today—create a free account.