Nov 28, 2024
PostgreSQL is an amazing database, but it can struggle with certain types of queries, especially as tables approach tens and hundreds of millions of rows (or more). DISTINCT
queries are an example of this.
Why are DISTINCT
queries slow on PostgreSQL when they seem to ask an "easy" question? It turns out that PostgreSQL currently lacks the ability to efficiently pull a list of unique values from an ordered index.
Even when you have an index that matches the exact order and columns for these "last-point" queries, PostgreSQL is still forced to scan the entire index to find all unique values. As a table grows (and they grow quickly with time-series data), this operation keeps getting slower.
Other databases, such as MySQL, Oracle, and DB2, implement a feature called "Loose index scan," "Index Skip Scan," or “Skip Scan,” to speed up the performance of queries like this.
When a database has a feature like "Skip Scan," it can incrementally jump from one ordered value to the next without reading all of the rows in between. Without support for this feature, the database engine has to scan the entire ordered index and then deduplicate it at the end—which is a much slower process.
Since 2018, there have been plans to support something similar in PostgreSQL. (Note: We couldn’t use this implementation directly due to some limitations of what is possible within the Postgres extension framework.)
Unfortunately, this patch wasn't included in the CommitFest for PostgreSQL 14, so it won't be included until PostgreSQL 15 at the earliest (i.e., no sooner than Fall 2022, at least 1.5 years from now).
We don’t want our users to have to wait that long.
Today, via TimescaleDB 2.2.1, we are releasing TimescaleDB SkipScan, a custom query planner node that makes ordered DISTINCT
queries blazing fast in PostgreSQL 🔥.
As you'll see in the benchmarks below, some queries performed more than 8,000x better than before—and many of the SQL queries your applications and analytics tools use could also see dramatic improvements with this new feature.
This feature works in both Timescale hypertables and normal PostgreSQL tables.
This means that with Timescale, not only will your time-series DISTINCT
queries be faster, but any other related queries you may have on normal PostgreSQL tables will also be faster.
This is because Timescale is not just a time-series database. It’s a relational database, specifically, a relational database for time series. Developers who use Timescale benefit from a purpose-built time-series database plus a classic relational (Postgres) database, all in one, with full SQL support.
And to be clear, we love PostgreSQL. We employ engineers who contribute to PostgreSQL. We contribute to the ecosystem around PostgreSQL. PostgreSQL is the world’s fastest-growing database, and we are excited to support it alongside thousands of other users and contributors.
We constantly seek to advance the state of the art with databases, and features like SkipScan are only our latest contribution to the industry. SkipScan makes Timescale and PostgreSQL better, more competitive databases overall, especially compared to MySQL, Oracle, DB2, and others.
If you're new to PostgreSQL and are wondering how to check your query performance in the first place (and optimize it!), we're going to leave two helpful resources here:
EXPLAIN ANALYZE
by Michael Christofides in one of our Timescale Community Days. And here's a blog post on Explaining EXPLAIN in case you're more of a reader.However, if you're an experienced PostgreSQL user, you might point out that it is already possible to get reasonably fast DISTINCT
queries via RECURSIVE CTEs
.
From the PostgreSQL Wiki, using a RECURSIVE CTE
can get you good results, but writing these kinds of queries can often feel cumbersome and unintuitive, especially for developers new to PostgreSQL:
WITH RECURSIVE cte AS (
(SELECT tags_id FROM cpu ORDER BY tags_id, time DESC LIMIT 1)
UNION ALL
SELECT (
SELECT tags_id FROM cpu
WHERE tags_id > t.tags_id
ORDER BY tags_id, time DESC LIMIT 1
)
FROM cte t
WHERE t.tags_id IS NOT NULL
)
SELECT * FROM cte LIMIT 50;
But even if writing a RECURSIVE CTE
like this in day-to-day querying felt natural to you, there's a bigger problem. Most application developers, ORMs, and charting tools like Grafana or Tableau will still use the simpler, straight-forward form:
SELECT DISTINCT ON (tags_id) * FROM cpu
WHERE tags_id >=1
ORDER BY tags_id, time DESC
LIMIT 50;
In PostgreSQL, without a ", such as MySQL, Oracle, and DB2, implement a feature called "Loose index scan," "Index Skip Scan," or “Skip Scan" node, this query will perform the much slower Index Only Scan, causing your applications and graphing tools to feel clunky and slow.
Surely there's a better way, right?
SkipScan is an optimization for queries in the form of SELECT DISTINCT ON
(column). Conceptually, a SkipScan is a regular IndexScan that “skips” across an index looking for the next value that is greater than the current value:
With SkipScan in Timescale/PostgreSQL, query planning and execution can now utilize a new node (displayed as (SkipScan)
in the EXPLAIN
output) to quickly return distinct items from a properly ordered index.
Rather than scanning the entire index with an Index Only Scan, SkipScan incrementally searches for each successive item in the ordered index. As it locates one item, the (SkipScan)
node quickly restarts the search for the next item. This is a much more efficient way of finding distinct items in an ordered index. (See GitHub for more details.)
In every example query, Timescale with SkipScan improved query response times by at least 26x.
But the real surprise is how much of a difference it makes at lower cardinalities with lots of data—it is almost 8,500x faster to retrieve all columns for the most recent reading of each device. That's fast!
In our tests, SkipScan is also consistently faster—by 80x or more—in our 4,000 device benchmarks. (This level of cardinality is typical for many users of Timescale.)
Before we share the full results, here is how our benchmark was set up.
To perform our benchmarks, we installed Timescale on a DigitalOcean Droplet using the following specifications. PostgreSQL and Timescale were installed from packages, and we applied the recommended tuning from timescaledb-tune
.
To demonstrate the performance impact of SkipScan on varying degrees of cardinality, we benchmarked three separate datasets of varying sizes. To generate our datasets, we used the 'cpu-only' use case in the Time Series Benchmark Suite (TSBS), which creates 10 metrics every 10 seconds for each device (identified by the tag_id
in our benchmark queries).
Dataset 1 | Dataset 2 | Dataset 3 |
---|---|---|
100 devices | 4000 devices | 10,000 devices |
4 months of data | 4 days of data | 36 hours of data |
~103,000,000 rows | ~103,000,000 rows | ~144,000,000 rows |
Not all device data is up-to-date in real life because devices go offline and internet connections get interrupted. Therefore, to simulate a more realistic scenario (i.e., that some devices had stopped reporting for a period of time), we deleted rows for random devices over each of the following periods.
Dataset 1 | Dataset 2 | Dataset 3 |
---|---|---|
5 random devices over: | 100 random devices over: | 250 random devices over: |
30 minutes | 1 hour | 10 minutes |
36 hours | 12 hours | 1 hour |
7 days | 36 hours | 12 hours |
1 month | 3 days | 24 hours |
To delete the data, we utilized the tablesample
function of Postgres. This SELECT
feature allows you to return a random sample of rows from a table based on a percentage of the total rows. In the example below, we randomly sample 10% of the rows ( bernoulli(10)
) and then take the first 10 ( limit 10
).
DELETE FROM cpu
WHERE tags_id IN
(SELECT id FROM tags tablesample bernoulli(10) LIMIT 10)
AND time >= now() - INTERVAL '30 minutes';
From there, we ran each benchmarking query multiple times to accommodate for caching, with and without SkipScan enabled.
As mentioned earlier, the following two indexes were present on the hypertable for all queries.
"cpu_tags_id_time_idx" btree (tags_id, "time" DESC)
"cpu_time_idx" btree ("time" DESC)
Here are the results:
For this test, we benchmarked five types of common queries:
SELECT DISTINCT ON (tags_id) tags_id, time FROM cpu
ORDER BY tags_id, time DESC
LIMIT 10 OFFSET 50;
SELECT DISTINCT ON (tags_id) * FROM cpu
ORDER BY tags_id, time DESC
LIMIT 10 OFFSET 50;
SELECT DISTINCT ON (tags_id) * FROM cpu
WHERE time >= now() - INTERVAL '5 minutes'
ORDER BY tags_id, time DESC;
WITH older AS (
SELECT DISTINCT ON (tags_id) tags_id FROM cpu
WHERE time > now() - INTERVAL '24 hours'
)
SELECT * FROM older o
WHERE NOT EXISTS (
SELECT 1 FROM cpu
WHERE cpu.tags_id = o.tags_id
AND time > now() - INTERVAL '1 hour'
);
WITH older AS (
SELECT DISTINCT ON (tags_id) tags_id FROM cpu
WHERE time > now() - INTERVAL '48 hours'
AND time < now() - INTERVAL '24 hours'
)
SELECT * FROM older o
WHERE NOT EXISTS (
SELECT 1 FROM cpu
WHERE cpu.tags_id = o.tags_id
AND time > now() - INTERVAL '24 hour'
);
But SkipScan isn’t a theoretical improvement reserved for benchmarking blog posts 😉—it has real-world implications, and many applications we use rely on getting this data as fast as possible.
Think about the applications you use (or develop) every day. Do they retrieve paged lists of unique items from database tables to fill dropdown options (or grids of data)?
At a few thousand items, the query latency might not be very noticeable. But, as your data grows and you have millions of rows of data and tens of thousands of distinct items, that dropdown menu might take seconds—or minutes—to populate.
SkipScan can reduce that to tens of milliseconds!
Even better, SkipScan also provides a fast, efficient way of answering the question that so many people with time-series data ask every day:
"What was the last time and value recorded for each of my [devices / users / services / crypto and stock investments / etc]?"
As long as there is an index on "device_id" and "time" descending, SkipScan will retrieve the data using a query like this much more efficiently.
SELECT DISTINCT ON (device_id) * FROM cpu
ORDER BY device_id, time DESC;
With SkipScan, your application and dashboards that rely on these types of queries will now load a whole lot faster 🚀 (see below).
How do you get started? Upgrade to TimescaleDB 2.2.1 and set up your schema and indexing as described below. You should start to see immediate speed improvements in many of your DISTINCT
queries.
To ensure that a (SkipScan) node can be chosen for your query plan:
First, the query must use the DISTINCT
keyword on a single column. The benchmarking queries above will give you some examples to draw from.
Second, there must be an index that contains the DISTINCT
column first, and any other ORDER BY
columns. Specifically:
BTREE
index.ORDER BY
in your query.DISTINCT
column must either be the first column of the index, or any leading column(s) must be used as constraints in your query.In practice, this means that if we use the questions from the beginning of this blog post ("retrieve a list of unique IDs in order" and "retrieve the last reading of each ID"), we would need at least one index like this (but if you're using a TimescaleDB hypertable, this likely already exists):
"cpu_tags_id_time_idx" btree (tags_id, "time" DESC)
With that index in place, you should start to see immediate benefit if your queries look similar to the benchmarking examples below. When SkipScan is chosen for your query, the EXPLAIN ANALYZE
output will show one or more Custom Scan (SkipScan)
nodes similar to this:
-> Unique
-> Merge Append
Sort Key: _hyper_8_79_chunk.tags_id, _hyper_8_79_chunk."time" DESC
-> Custom Scan (SkipScan) on _hyper_8_79_chunk
-> Index Only Scan using _hyper_8_79_chunk_cpu_tags_id_time_idx on _hyper_8_79_chunk
Index Cond: (tags_id > NULL::integer)
-> Custom Scan (SkipScan) on _hyper_8_80_chunk
-> Index Only Scan using _hyper_8_80_chunk_cpu_tags_id_time_idx on _hyper_8_80_chunk
Index Cond: (tags_id > NULL::integer)
...
If you’re new to Timescale, create a free account to get started with a fully managed TimescaleDB instance (100 % free for 30 days).
If you are an existing user:
Join our Slack Community to share your results, ask questions, get advice, and connect with other developers (I, as well as our co-founders, engineers, and passionate community members, are active on all channels).
You can also visit our GitHub to learn more (and, as always, ⭐️ are appreciated!)