Top 8 PostgreSQL Extensions
PostgreSQL extensions are one of the best things about PostgreSQL. Extensions add extra functionality to your database by modifying and enhancing the execution of certain processes, helping you bypass limitations—such as maintaining good performance when your tables grow up to billions of rows—without the hassle of switching to a whole new database.
PostgreSQL extensions loaded into the database can function just like built-in features, and a rich ecosystem of extensions for all use cases has sprung up over the years. In this article, we’ll teach you how to install and list available extensions and share eight PostgreSQL extensions you should know about.
How to Install Extensions in PostgreSQL
Before getting into our top extensions, let's first see how to install extensions in PostgreSQL. Extensions provide additional functionalities to PostgreSQL, such as data types, functions, and operators. Here are the general steps to install extensions in PostgreSQL:
1. Connect to the PostgreSQL database
First, you need to connect to the PostgreSQL database where you want to install the extension. You can do this using the psql
command-line tool or any other PostgreSQL client (if you don't have psql installed, here's how to do it).
Using psql
, you can connect with:
psql -U username -d database_name
Replace username
with your PostgreSQL username and database_name
with the name of your database.
2. Check for available extensions
You can list all available extensions by running the following SQL command:
SELECT * FROM pg_available_extensions;
3. Install the extension
To install an extension, use the CREATE EXTENSION
command. For example, to install the hstore
extension, run:
CREATE EXTENSION hstore;
4. Verify the installation
After installing, you can verify that the extension has been installed correctly by listing all installed extensions:
SELECT * FROM pg_extension;
Example: Installing the hstore
extension
Here’s a step-by-step example of installing the hstore
extension:
Verify the installation:
SELECT * FROM pg_extension WHERE extname = 'hstore';
Install the hstore
extension:
CREATE EXTENSION hstore;
List available extensions to confirm that hstore
is available:
SELECT * FROM pg_available_extensions WHERE name = 'hstore';
Connect to the PostgreSQL database:
psql -U postgres -d mydatabase
Notes
- You need superuser privileges to create extensions.
- Some extensions may require additional setup or configuration.
- The extension files must be installed on the server. Most common extensions come bundled with PostgreSQL, but some may need to be downloaded and installed separately.
How to List Extensions in PostgreSQL
Now, to list the installed extensions, use the \dx
command in the psql command-line interface. When you enter this command, PostgreSQL will display a list of all the extensions currently installed in your database. It’ll look something like this:
List of installed extensions
Name | Version | Schema | Description
--------------+---------+------------+---------------------------------------------------------------------
pg_trgm | 1.4 | public | text similarity measurement and index searching based on trigrams
postgis | 3.0.1 | public | support for geographic objects
timescaledb | 2.0.0 | public | enables time-series queries
hstore | 1.6 | public | data type for storing sets of (key, value) pairs
uuid-ossp | 1.1 | public | generate universally unique identifiers (UUIDs)
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
- The
name
column shows the extension name, which you can use to install it. - The
version
column tells you the current version of the extension. - The
schema
column shows the default schema where the extension's objects will be created. - The
description
briefly overviews what the extension does.
The underlying SQL query for the \dx
command is fairly simple. To get the same results you can run this:
SELECT * FROM pg_extension;
This will give you a list of the installed extensions into the current database, along with details such as the extension's name and the version.
Remember that these are only the installed extensions and don’t include the extensions available from your PostgreSQL contrib
directory that have not been installed. To see which extensions are available, you can run:
SELECT * FROM pg_available_extensions;
Finally, for bonus points, some services like Timescale use an extension whitelist system which is an installable subset of the available extensions list. To see which extensions are available to be installed when your service uses the PostgreSQL Extension Whitelist, you can run:
SELECT pae.* FROM current_setting('extwlist.extensions') AS cs(e) CROSS JOIN regexp_split_to_table(e, ',') AS ext(allowed) JOIN pg_available_extensions AS pae ON (allowed=name) ORDER BY 1;
The current user can install this list of extensions. By contrast, some of the available extensions list might not be available for installation.
Top 8 PostgreSQL Extensions You Should Know About
1. PostGIS
postgis
transforms PostgreSQL into a database system that can efficiently handle spatial data. It introduces additional data types such as geometry, geography, raster, and more, along with a suite of functions, operators, and indexing capabilities tailored to these spatial types.
With PostGIS, PostgreSQL gains the ability to perform complex location queries using SQL, effectively turning it into a powerful spatial database management system with high performance, a rich feature set, and robust capabilities.
One of the primary advantages of PostGIS is its ability to process geospatial data at the database layer, which is typically more efficient than processing at the application layer. This efficiency is particularly beneficial for IoT applications that require sophisticated tracking, routing, and location-based services.
To install PostGIS into your PostgreSQL database, run:
CREATE EXTENSION postgis;
PostGIS sample query
To illustrate how to use PostGIS, let’s share a sample query involving geospatial and time-series elements. This query is an excellent example of how you might combine different Postgres extensions to meet your requirements (in this case, combining PostGIS and TimescaleDB—we’ll talk about TimescaleDB later).
We want to know: "How many taxis picked up passengers within 400 meters of Times Square on New Year's Day 2016?"
-- How many taxis picked up rides within 400 m of Times Square on New Years Day, grouped by 30-minute buckets?
-- Note: Times Square is at (lat, long) (40.7589,-73.9851)
SELECT time_bucket('30 minutes', pickup_datetime) AS thirty_min, COUNT(*) AS near_times_sq
FROM rides
WHERE ST_Distance(pickup_geom, ST_Transform(ST_SetSRID(ST_MakePoint(-73.9851,40.7589),4326),2163)) < 400
AND pickup_datetime < '2016-01-01 14:00'
GROUP BY thirty_min ORDER BY thirty_min
LIMIT 5;
The above query produces the following results:
-[ RECORD 1 ]-+--------------------
thirty_min | 2016-01-01 00:00:00
near_times_sq | 74
-[ RECORD 2 ]-+--------------------
thirty_min | 2016-01-01 00:30:00
near_times_sq | 102
-[ RECORD 3 ]-+--------------------
thirty_min | 2016-01-01 01:00:00
near_times_sq | 120
-[ RECORD 4 ]-+--------------------
thirty_min | 2016-01-01 01:30:00
near_times_sq | 98
-[ RECORD 5 ]-+--------------------
thirty_min | 2016-01-01 02:00:00
near_times_sq | 112
2. pg_stat_statements
pg_stat_statements
tracks statistics on the queries executed by a Postgres database. It'll help you debug queries, identify slow queries, and generally give you deeper information about how your queries are running. The statistics gathered by the module are made available via a system view named pg_stat_statements
.
To enable the extension, run:
CREATE EXTENSION pg_stat_statements;
Pg_stat_statements sample query
To demonstrate how to utilize pg_stat_statements, let's look at a sample query that could help analyze database performance by identifying the top I/O-intensive SELECT queries.
We're interested in: "What are the top 5 I/O-intensive SELECT queries?"
SELECT query, calls, total_time, rows, shared_blks_hit, shared_blks_read
FROM pg_stat_statements
WHERE query LIKE 'SELECT%'
ORDER BY shared_blks_read DESC, calls DESC
LIMIT 5;
The output would look like this:
-[ RECORD 1 ]--—+---------------------------------------------------
query | SELECT * FROM customer_data WHERE created_at > $1
calls | 500
total_time | 23000
rows | 500000
shared_blks_hit | 100000
shared_blks_read | 75000
-[ RECORD 2 ]-----+---------------------------------------------------
query | SELECT name, address FROM orders WHERE status = $1
calls | 450
total_time | 15000
rows | 450000
shared_blks_hit | 95000
shared_blks_read | 55000
-[ RECORD 3 ]-----+---------------------------------------------------
query | SELECT COUNT(*) FROM transactions WHERE amount > $1
calls | 300
total_time | 12000
rows | 300000
shared_blks_hit | 85000
shared_blks_read | 50000
-[ RECORD 4 ]-----+---------------------------------------------------
query | SELECT product_id FROM inventory WHERE quantity < $1
calls | 400
total_time | 16000
rows | 400000
shared_blks_hit | 80000
shared_blks_read | 45000
-[ RECORD 5 ]-----+---------------------------------------------------
query | SELECT * FROM user_logs WHERE user_id = $1 AND activity_date > $2
calls | 350
total_time | 17500
rows | 350000
shared_blks_hit | 75000
shared_blks_read | 40000
- The
query
column shows the text of a representative statement. calls
counts how many times the statement was executed.total_time
is the total time spent in the statement in milliseconds.rows
shows the total number of rows retrieved or affected by the statement.shared_blks_hit
indicates the number of shared block cache hits by the statement.shared_blks_read
indicates the number of shared blocks read by the statement, which directly relates to the I/O load.
The shared_blks_hit
and shared_blks_read
fields are crucial for I/O analysis. shared_blks_hit
is the count of times disk blocks were found already cached in memory (hence no I/O was needed), whereas shared_blks_read
is the count of times disk blocks had to be read into memory, which indicates actual I/O operations. High values in shared_blks_read
suggest that these queries are the most I/O intensive, which can be a starting point for performance optimization.
3. pgcrypto
pgcrypto
is an extension that enhances PostgreSQL by providing cryptographic functions right within the database system. This extension offers a variety of functions for hash creation, data encryption, and decryption, allowing you to perform secure cryptographic operations on your data within PostgreSQL.
With pgcrypto, PostgreSQL can create secure encrypted data storage and manage sensitive information like passwords, personal data, or financial details using standard algorithms like DES, 3DES, and AES. It's handy when you want to enforce data encryption at the database level, adding a layer of security to guard against unauthorized data access.
One of the benefits of pgcrypto is that it allows the encryption and hashing of data using SQL queries, which can be more efficient and secure than handling this in the application layer. This efficiency is especially important for systems that require compliance with data security standards and regulations.
To enable pgcrypto in your PostgreSQL database, run:
CREATE EXTENSION pgcrypto;
Once you create the extension, you can use its functions in your SQL queries.
Pgcrypto sample query
Here's how you might use pgcrypto to encrypt and decrypt data. Let's say you want to store encrypted user passwords. First, you would encrypt a password when inserting it into a table:
INSERT INTO users (username, password) VALUES ('john_doe', crypt('my_secure_password', gen_salt('bf')));
In this statement, crypt
is a function provided by pgcrypto that encrypts the password using the Blowfish algorithm, which is indicated by gen_salt('bf')
.
Next, to authenticate a user, you would compare a stored password against one provided during login:
SELECT username FROM users WHERE username = 'john_doe' AND password = crypt('input_password', password);
4. pg_partman
pg_partman
is an extension that simplifies creating and maintaining partitions of your PostgreSQL tables. Partitioning is a key database technique that involves splitting a large table into smaller, more manageable pieces while allowing you to access the data as if it were one table. It's a powerful way to keep your large PostgreSQL tables fast and manageable.
With pg_partman, PostgreSQL can manage partitions based on a variety of criteria such as time, serial IDs, or custom values. It eases the maintenance tasks associated with partitioning, such as creating new partitions in advance and purging old ones. This automation is particularly useful for large, time-series datasets that can grow rapidly.
To install pg_partman in your PostgreSQL database, you run:
CREATE EXTENSION pg_partman;
Pg_partman sample query
Let's consider a scenario where you have a large table of IoT device data that you want to partition by day. Here's how you might set up a daily partition scheme for a table called device_data using pg_partman
:
-- Create a parent table
CREATE TABLE device_data (
time timestamptz NOT NULL,
device_id int NOT NULL,
data jsonb NOT NULL
);
-- Set up pg_partman to manage daily partitions of the device_data table
SELECT partman.create_parent('public.device_data', 'time', 'partman', 'daily');
In this setup, create_parent is a function provided by pg_partman that takes the parent table name and the column to partition on (time), as well as the schema (partman) and the partition interval (daily).
After setting up the partitions, pg_partman will help you manage them—but as mentioned previously, you’d like to check out Timescale’s hypertables for a fully-automated, worry-free partitioning solution.
5. postgres_fdw
The postgres_fdw
module enables you to use a Foreign Data Wrapper to access tables on remote Postgres servers (hence the name "fdw"). A Foreign Data Wrapper lets you create proxies for data stored in other Postgres databases to query them as if they were coming from a table in the current database.
Postgres_fdw allows you to combine data between two Postgres instances.
Here’s a sample use case:
- You have a Postgres instance (A), and you use
postgres_fdw
to access data on the Postgres instance (B), which is a remote instance. - You then run queries combining data from instances A and B at the database rather than the application level.
Get postgres_fdw
by running the below command from your psql command line:
CREATE EXTENSION postgres_fdw IF NOT EXISTS;
Postgres_fdw sample query
Here’s how you create a connection to your foreign server:
CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '123.45.67.8', dbname ‘postgres’, port '5432');
This query connects to a database hosted on IP address 123.45.67.8, with the name postgres
at port 5432
. Now, create a user mapping so that users on your database can access the foreign server:
CREATE USER MAPPING FOR postgres
SERVER myserver
OPTIONS (user 'postgres', password 'password');
Once that’s done, you can import a schema from your foreign database and access any table.
CREATE SCHEMA schema1;
IMPORT FOREIGN SCHEMA public
FROM SERVER myserver
INTO schema1;
Once imported, you can now access tables of the foreign database on your “local” database, like the example below, where we access the metrics table:
SELECT * FROM schema1.metrics
WHERE time < now() - ‘2 days’ :: interval;
Remember that when the schema of the tables on myserver
change, your FDW may stop working, and you’ll need to refresh the FDW by dropping and recreating the schema or reimporting the tables that have changed.
6. pgvector
pgvector
adds support for vector operations in PostgreSQL. It enables PostgreSQL to execute similarity searches, nearest-neighbor searches, and other complex operations on vector data. This functionality can be particularly advantageous for applications like recommendation systems, image retrieval, and natural language processing tasks that rely on vector similarity calculations.
To integrate pgvector
into your PostgreSQL setup, you would run the following SQL command:
CREATE EXTENSION vector;
-- Note the extension name is actually vector, not pgvector
Pgvector sample query
Suppose you have a database of image features extracted using a machine learning model, which are stored as vectors. You want to find the most similar images to a given feature vector. Here's how you might use pgvector
to perform a nearest-neighbor search:
-- Assuming we have a table with image features stored as vectors
-- Table: image_features
-- Columns: id (integer), features (vector)
-- Given a query vector, find the 5 most similar images
SELECT id, features
FROM image_features
ORDER BY features <-> 'query_vector'::vector
LIMIT 5;
In this query, the <-> operator is provided by pgvector and represents the distance operator, which calculates the distance between two vectors. The query_vector is the vector representation of the image features you are searching for. This query orders the results by the distance between the query_vector and the features column, effectively returning the closest matches
7. hstore
hstore
is a key-value store within PostgreSQL, functioning as an extension that allows you to store sets of key/value pairs within a single PostgreSQL data type. It's designed to efficiently handle data that is not rigidly structured and can flexibly accommodate a varying number of attributes without altering the schema of your tables.
One key benefit of using hstore
is the ability to index key/value pairs for faster search and retrieval, making it a good choice for semi-structured data or data with sparse attributes. It supports GIN (Generalized Inverted Index) indexing, which can speed up queries on keys and values within the hstore data.
To use hstore
in your PostgreSQL database, you need to enable the extension by running:
CREATE EXTENSION hstore;
Hstore sample query
Here's an example of how you might use hstore
to store and query product data with varying attributes:
-- Create a table with an hstore column for storing product attributes
CREATE TABLE products (
id serial PRIMARY KEY,
name text NOT NULL,
attributes hstore
);
-- Insert a product with attributes into the table
INSERT INTO products (name, attributes)
VALUES ('Smartphone', 'color => "black", storage => "64GB", battery => "3000mAh"');
-- Query to find products with a specific attribute
SELECT name
FROM products
WHERE attributes @> 'storage => "64GB"';
In this example, the @>
operator is used to query the hstore
column for products with a storage key and a value of "64GB." This kind of query is useful when searching for items based on a subset of their attributes.
8. pgpcre
pgpcre
is a PostgreSQL extension that integrates Perl Compatible Regular Expressions (PCRE) into PostgreSQL. It provides advanced string-matching functionality and is particularly useful when PostgreSQL’s built-in regular expression capabilities are insufficient for complex pattern-matching requirements.
This extension is especially beneficial for applications that require sophisticated text analysis or processing, such as parsing logs, searching text, or validating string formats. One of the advantages of pgpcre over standard PostgreSQL text functions is the additional regular expression features it supports, which are not available in PostgreSQL's native regular expression functions. These include but are not limited to advanced look-ahead and look-behind assertions, backtracking control verbs, and complex character class definitions.
To add pgpcre
to your PostgreSQL database, you would execute:
CREATE EXTENSION pgpcre
pgpcre sample query
For example, if you want to search for email addresses in a column of unstructured text, you might use a PCRE pattern for matching emails as follows:
-- Assuming we have a table named messages with a column named content
-- Table: messages
-- Column: content (text)
-- Use pgpcre to match email addresses within the content
SELECT content, pcre_match('^\S+@\S+$', content) AS email
FROM messages
WHERE pcre_match('^\S+@\S+$', content) IS NOT NULL;
In this query, the pcre_match
function is part of the pgpcre
extension and is used to match and extract email addresses looking things from the text content of each row in the messages table. The regular expression pattern provided as the first argument to pcre_match
represents a simple email address format.
Is TimescaleDB a PostgreSQL Extension?
Yes! TimescaleDB is another top ⭐️ extension to know. It adds a query and ingest performance boost to PostgreSQL via automatic partitioning, query planner enhancements, improved materialized views, columnar compression, and much more, making it suitable for data-intensive applications like time-series data. It also includes libraries to simplify analytics and advanced functionality, such as a job scheduler.
If you're running your PostgreSQL database on your own hardware, you can simply add the TimescaleDB extension. If you prefer to try Timescale in AWS, you can create a free account on the Timescale platform. It only takes a couple of seconds, no credit card required.
TimescaleDB example query
Suppose we have a sensor data table that records temperature readings from various devices every minute. We want to analyze the average temperature per hour for a specific device. Here’s how you might construct such a query using TimescaleDB:
-- Assuming we have a hypertable named sensor_readings with columns time, device_id, and temperature
-- Table: sensor_readings
-- Columns: time (TIMESTAMP WITH TIME ZONE), device_id (INT), temperature (DOUBLE PRECISION)
-- Calculate the average temperature per hour for device with ID 1
SELECT time_bucket('1 hour', time) AS one_hour_bucket,
AVG(temperature) AS avg_temperature
FROM sensor_readings
WHERE device_id = 1
GROUP BY one_hour_bucket
ORDER BY one_hour_bucket;
In this query, you use the time_bucket
function provided by TimescaleDB to map individual readings into one-hour intervals or "buckets." Then, the AVG function calculates the average temperature for each interval. This query would be particularly efficient in TimescaleDB due to its optimized handling of time-series data.
Adding PostgreSQL Extensions to Your Databases in Timescale
We’re huge fans of PostgreSQL’s rich extension ecosystem here at Timescale, so, unsurprisingly, we tried to make access to it as simple as possible from our cloud platform user interface. The Timescale console lists all extensions available on your database service and how to enable them.
Installation instructions
And by simple, we mean really simple. You only need to expand any section for the exact instructions on how to enable it.
Search
Looking for a specific extension? Our extension search searches the extension's name and description to help you find what you’re looking for.
Requests
Can’t find the extension you’re looking for? You can quickly request a new extension from within the console itself. Hit the Request an extension button and fill out the name and description of the extension you’re looking for. Help us expand our very own PostgreSQL extension ecosystem!
Conclusion
PostgreSQL extensions are a great way to bypass PostgreSQL’s limitations and add new functionality to your database. In this post, we listed eight helpful PostgreSQL extensions, sample queries, and installation instructions. To learn more about these and other extensions, head here to see what you can do beyond the Postgres basics.
To make life easier for Timescale users, you can choose and enable several PostgreSQL extensions directly in the Timescale console—read our Docs for a full list of the extensions we support. If you want to learn the many other ways we make developers’ lives easier, check out Timescale, our mature cloud database for a fiercer and faster PostgreSQL. It’s free for 30 days, no credit card required.