pg_trgm
is a PostgreSQL extension that provides functions and operators for determining the similarity of alphanumeric text based on trigram matching. Trigrams are a set of three consecutive characters taken from a string.
The pg_trgm
extension can be used to measure the number of trigrams shared by two strings, which can be used to calculate their level of similarity. This is particularly useful for developers implementing features such as search functionality or data matching in their applications.
To install the pg_trgm
extension, you need to have access to a PostgreSQL database. Follow the steps below:
1. Connect to your PostgreSQL database.
2. Load it into your shared_preloaded_libraries in postgresql.conf, as it will require additional shared memory. If you’re using Timescale, you can simply skip this step, as the extension is already preloaded. You can find available extensions by going to Operations > Extensions
from your service overview, which will also give you installation instructions.
3. Run the following SQL command:
CREATE EXTENSION pg_trgm;
This command will install the pg_trgm
extension in your current database. If you want to install it in a different database, you need to connect to that database first.
Once the pg_trgm
extension is installed, you can use it to compare strings for similarity. Here are some examples of how to use the pg_trgm
functions:
To calculate the similarity between two strings, use the similarity(text, text)
function. This function returns a number between 0 and 1, where 1 means the strings are identical and 0 means they have no trigrams in common.
To find records in a table that are similar to a given string, use the %
operator. For example, the following query returns all records in the products
table where the name
is similar to 'apple':
SELECT * FROM products WHERE name % 'apple';
The pg_trgm
extension can be used in a variety of time-series use cases. For example, it can be used to find similar time-series data based on their labels or metadata. This can be useful in scenarios where you want to find patterns or anomalies in your data.
If you're using Timescale, a time-series cloud database built on top of PostgreSQL, you can use the pg_trgm
extension to enhance your time-series data analysis.
For example, you can use the pg_trgm
functions to find similar time series based on their metadata or to search for specific time series based on their labels. To do this, you would use the pg_trgm
functions in the same way as described in section three of this article, but on your time-series data instead.
Learn how to simplify your time-series analysis with only a few functions.