Sep 17, 2024
PostgreSQL partitioning is a database optimization technique that divides large tables into smaller, more manageable pieces called partitions, enhancing PostgreSQL performance and efficiency by allowing the planner to access only the necessary partitions to satisfy a query, known as partition pruning. This approach is shared by PostgreSQL’s older inheritance-based partitioning, the newer declarative partitioning, and TimescaleDB’s hypertables, which automatically partition your data into smaller partitions or chunks.
However, not all queries are ideally suited for partitioning (we have examples coming up to prove it), and if you can’t filter by the partitioning column(s), you will have to wave goodbye 👋 to any speedups. This will lead to slow queries since PostgreSQL can’t exclude any partitions without the metadata of the non-partitioned columns.
To address this, as of TimescaleDB 2.16.0, we are enabling developers to use metadata to dynamically prune and exclude partitions (we call them chunks) during planning or execution. You can define chunk-skipping indexes on compressed hypertables for columns of smallint
, int
, bigint
, serial
, bigserial
, date
, timestamp
, and timestamptz
data types. Setting chunk-skipping indexes results in 7x faster performance with an 87 % smaller storage footprint, thanks to compression.
Chunk-skipping indexes are only available with hypertables. If you’re running vanilla PostgreSQL, you’ll be stuck scanning all your partitions and wondering what life could have been like (luckily, TimescaleDB can be installed into any self-managed Postgres, and getting started for free with Timescale Cloud is only a click away).
Keep reading to learn more about what led us to solve this limitation and how you can use this feature in TimescaleDB 2.16.0. To check this week’s previous launches and keep track of upcoming ones, head to this blog post or our launch page. For now, let’s start with some PostgreSQL partitioning basics.
If your PostgreSQL performance is slowing down as tables keep growing and growing, you have probably considered PostgreSQL partitioning. This data organization provides several benefits:
To enhance query performance and simplify data partitioning in PostgreSQL, Timescale developed hypertables, which are PostgreSQL tables that automatically partition data by time using inheritance-based partitioning. Users can specify ranges (e.g., month, week, or any arbitrary range), and TimescaleDB will create partitions (chunks) accordingly.
Large datasets typically involve numerous chunks in the hierarchy. For efficiency, the PostgreSQL planner should optimally select the necessary chunks for each query, making partition pruning (chunk exclusion) crucial for performance.
For example, a one-terabyte table split into 1,000 partitions, each with approximately one gigabyte, would require a query to access all 1,000 partitions without pruning, even if only one partition is needed. Accessing a few partitions instead of 1,000 significantly improves query performance, highlighting the importance of partition pruning.
PostgreSQL uses partition pruning and check constraint exclusion. It assesses partition boundaries and checks constraints to exclude partitions that cannot satisfy a query's WHERE
clause. This method works effectively when queries use the columns designated for partitioning (a.k.a. partition keys, for hypertables almost always including a timestamp) in WHERE
clauses.
Let’s illustrate how partition pruning and chunk exclusion work using a hypertable example (although PostgreSQL partitioning works in a similar way).
Let’s say you have a hypertable with daily partitions, each day of the year being a separate partition.
SELECT create_hypertable('conditions', by_range('time', INTERVAL '1 day'));
Queries filtering on time are easily optimized to access only the relevant partitions or chunks. For example, a query for August 7th will access only that chunk, a query for the week of August 12th will access the chunks for those seven days, and a query for the last 24 hours will access the latest two daily chunks.
SELECT COUNT(*) FROM conditions
WHERE time > NOW() - INTERVAL '24 hours';
However, many scenarios involve queries using secondary columns in WHERE
clauses, not the partitioning column(s) that define(s) the range for each partition. For example, a satellite company might have two time
columns: one for when the observation was made (on the satellite) and one for when the data was added to the database (on the base station).
These columns are correlated, with the base station lagging observation by 30 minutes to 24 hours (rotation around the Earth). If you partition by observation, querying by “when was data added” requires accessing all chunks.
Other examples include event workloads with "start" and "stop" in each row to record a period when a condition was true, or a jobs table partitioned by a serial jobs_id
column with additional job_created
and job_ended
timestamps. This time-stamped information can answer queries seeking the number of jobs created in a specific week or month. Once more, querying by those additional non-partitioning columns results in much slower queries as the PostgreSQL planner can’t prune any partitions.
A final example that may resonate with many engineers involves MVC frameworks and building API endpoints over large datasets. Imagine a table with a sequential ID also used on your API to access or modify entities. That table then grows beyond 100 GB, requiring partitioning.
But now you have this massive challenge: most queries and dashboards use date ranges, while the API accesses data by ID. Partitioning by either ID or time means only half of the queries benefit from partition pruning, while the others require scanning all partitions (the 100 GB+ of data, soon to become TBs as your dataset grows). This dilemma is the reason why I’ve said so many times throughout my career (Hi, Yannis here 👋) that “partitioning is not a silver bullet.”
Not all queries are created to fit a partitioning design choice perfectly. Partitioning introduces a trade-off, as many real-world use cases don’t allow rewriting all existing queries to filter by the partitioning column. As explained earlier, when you can’t filter by the partitioning column(s), you lose the benefits of partition pruning, such as improved PostgreSQL query performance.
When queries include secondary columns, PostgreSQL cannot exclude partitions since it lacks metadata on these columns for chunk pruning. This results in slower, inefficient queries that scan all hypertable chunks.
Often, secondary columns are related to the partitioning column in some way; they are correlated. To exemplify this correlation, let’s think of our earlier example of ending a job. A job’s end time follows its creation time, typically by seconds or minutes. Similarly, a satellite observation’s recording time on the database follows the actual observation time. Sequential IDs and the creation timestamp for the same entities also correlate (both increasing synchronously).
Fun side note: due to this correlation, we actually thought of calling the feature “correlated constraints,” but nobody understood what it was. 😿 Naming in programming remains an elusive art form.
And this is how we began thinking about chunk-skipping indexes.
To fix the problem of “correlated constraints,” as of TimescaleDB 2.16.0, users can set chunk-skipping indexes. These indexes can be defined on columnar sections of hypertables for smallint
, int
, bigint
, serial
, bigserial
, date
, timestamp
, and timestamptz
columns.
SELECT enable_chunk_skipping('conditions', 'device_id');
After enabling chunk skipping on a column, TimescaleDB tracks the minimum and maximum values for that column in each chunk, excluding chunks where queries would find no relevant data.
You can add as many chunk-skipping indexes on additional columns as you need for your use case and workload with minimal storage overhead. The best practice is to enable these indexes on columns correlated to the partitioning column and referenced in WHERE
clauses.
Minimum/maximum ranges are calculated when compressing a chunk in this hypertable with the compress_chunk
function, assuming the chunk is ideally closed for changes. This will allow us to calculate and store min/max limits for the secondary columns in the metadata.
These ranges are stored in the start (inclusive) and end (exclusive) format in the chunk_column_stats
catalog table and used for dynamic chunk pruning when the WHERE
clause of an SQL query specifies ranges on the column. This means the column ranges in the catalog table are stored at the per-chunk level, with no active role in the data partitioning.
Let’s use an example to illustrate how these indexes work. Consider the following schema:
CREATE TABLE orders (
order_id serial,
time timestamptz,
customer_id int,
order_total float
);
SELECT create_hypertable('orders', 'time', chunk_time_interval=>'1 day'::interval);
[ insert 31,536,001 orders, a year's worth of orders ]
# To fetch a single order from the hypertable, you would run a query like this:
SELECT * FROM orders WHERE order_id = 3942785;
QUERY PLAN
---------------------------------------------------------------------------------------------
Gather (cost=1000.00..509743.32 rows=148962 width=24)
Workers Planned: 2
-> Parallel Append (cost=0.00..493847.12 rows=62132 width=24)
-> Parallel Seq Scan on _hyper_4_280_chunk (cost=0.00..1370.21 rows=294 width=24)
Filter: (order_id = 3942785)
-> Parallel Seq Scan on _hyper_4_281_chunk (cost=0.00..1370.21 rows=294 width=24)
Filter: (order_id = 3942785)
… …
# Scanning 365 chunks in total
Time: 2176.563 ms (00:02,177)
Naturally, you wouldn't approach it that way! This example illustrates how long PostgreSQL would take to retrieve all the data if it had to scan every partition. To avoid combing through all the records in each chunk and to speed up your query, you'd create an index on the order_id
column (or better yet, if order_id
was part of the primary key, it would be made automatically).
CREATE INDEX orders_order_id_idx ON orders (order_id)
However, creating this index would require more storage, slowing down ingestion rates. In this case, as order_id
is unique and the example table is narrow, the index increases the total size of orders by 37 %. But it takes the overall execution down to something we can work with:
SELECT * FROM orders WHERE order_id = 3942785;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Append (cost=0.29..3043.28 rows=366 width=24)
-> Index Scan using _hyper_4_213_chunk_orders_order_id_idx on _hyper_4_213_chunk (cost=0.29..8.31 rows=1 width=24)
Index Cond: (order_id = 3942785)
-> Index Scan using _hyper_4_214_chunk_orders_order_id_idx on _hyper_4_214_chunk (cost=0.29..8.31 rows=1 width=24)
Index Cond: (order_id = 3942785)
… … ...
Time: 34.838 ms
Of course, this execution time will only increase as the table grows from a modest 31 million records to billions of records, with many more partition indexes that need to be scanned.
To fix it, let’s add a chunk-skipping index on order_id
and enable compression:
SELECT enable_chunk_skipping('orders', 'order_id');
ALTER TABLE orders SET (timescaledb.compress);
SELECT compress_chunk(show_chunks('orders'));
SELECT * FROM orders where order_id = 3942785;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Custom Scan (DecompressChunk) on _hyper_4_254_chunk (cost=0.15..3.30 rows=22000 width=24)
Vectorized Filter: (order_id = 3942785)
-> Seq Scan on compress_hyper_5_1352_chunk (cost=0.00..3.30 rows=22 width=148)
Filter: ((_ts_meta_v2_min_order_id <= 3942785) AND (_ts_meta_v2_max_order_id >= 3942785))
(4 rows)
Time: 5.064 ms
Defining a chunk-skipping index takes query performance to a whole new level—faster than anything we could achieve before with PostgreSQL (even with that index). Plus, it comes with considerable storage savings! 💰In this case, 7x better performance while using 87 % less storage, thanks to compression. The more chunks your hypertable has, the greater the performance boost you’ll experience. We could have run the same example with more chunks to bloat our figures, but we felt that having 365 partitions was quite representative of most use cases we see in Timescale Cloud.
Chunk-skipping indexes allow us to use sparse indexes at the partition level to supercharge partition pruning, significantly speeding up queries that need to filter by non-partitioning columns.
But we also use sparse indexes behind the scenes at a much more granular level: inside compressed chunks, we segment and group data in small batches of up to 1,000 records, and automatically create similar sparse indexes for each batch. This adds an extra layer of pruning, this time at the batch level, making queries inside chunks (partitions) even faster.
Stay tuned for the follow-up blog post, where we’ll discuss those sparse indexes and the query optimizations you can achieve by leveraging them!
PostgreSQL partitioning can be a powerful optimization tool, but it may only work for some of your queries. With the introduction of chunk-skipping indexes in TimescaleDB 2.16.0, we’ve found a way to overcome the common limitations of partition pruning in PostgreSQL, especially for queries involving non-partitioning columns.
This new feature is part of hyperstore and opens up new possibilities for high-performance query execution in PostgreSQL without compromising storage efficiency. But there’s more coming up; stay tuned for upcoming optimizations! And if you want to start speeding up your PostgreSQL queries today, try Timescale. Sign up for a free account.
Thanks to Nikhil Sontakke for helping us out with this blog post!