Hi everybody!
Recently I compressed some hypertables, and that was the end of the lightning-fast selects for two or more id values matching the PK. The end of the story )
Question: how should I write queries with a condition like id IN (123, 456[, ...])
so that they run quickly?
Example:
-- DROP TABLE hypertable;
CREATE TABLE hypertable(id int, type_id int, ts timestamp, PRIMARY KEY(id, ts));
SELECT create_hypertable(relation => 'hypertable', dimension => by_range('ts', INTERVAL '1 day'));
ALTER TABLE hypertable SET (
    timescaledb.compress,
    timescaledb.compress_orderby = 'id',
    timescaledb.compress_segmentby = 'type_id'
);
DO $$
BEGIN
    FOR i IN 0..4 LOOP
        INSERT INTO hypertable(id, type_id, ts)
        SELECT gs.id, floor(random() * 3 + 1), TIMESTAMP '2023-12-31' + make_interval(days => (i + 1))
        FROM generate_series(i * 110000 + 1, 110000 * (i + 1), 1) AS gs(id);
    END LOOP;
END;
$$;
SELECT compress_chunk(c.chunk_schema || '.' || c.chunk_name)
FROM timescaledb_information.chunks AS c
WHERE hypertable_name = 'hypertable';
EXPLAIN ANALYZE VERBOSE
SELECT * FROM hypertable WHERE id IN (3, 330050);
EXPLAIN ANALYZE VERBOSE
SELECT * FROM hypertable WHERE id = 330050;
The corresponding query plans:
1.
QUERY PLAN |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Append (cost=0.02..1790.96 rows=3465000 width=16) (actual time=0.026..3.636 rows=2 loops=1) |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40701_chunk (cost=0.02..11.69 rows=693000 width=16) (actual time=0.026..0.716 rows=1 loops=1) |
Output: _hyper_535_40701_chunk.id, _hyper_535_40701_chunk.type_id, _hyper_535_40701_chunk.ts |
Vectorized Filter: (_hyper_535_40701_chunk.id = ANY ('{3,330050}'::integer[])) |
Rows Removed by Filter: 109999 |
Batches Removed by Filter: 110 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40711_chunk (cost=0.00..11.69 rows=693 width=96) (actual time=0.006..0.048 rows=111 loops=1) |
Output: compress_hyper_536_40711_chunk.id, compress_hyper_536_40711_chunk.type_id, compress_hyper_536_40711_chunk.ts, compress_hyper_536_40711_chunk._ts_meta_count, compress_hyper_536_40711_chunk._ts_meta_sequence_num, compress_hyper_536_407|
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40702_chunk (cost=0.02..11.69 rows=693000 width=16) (actual time=0.761..0.761 rows=0 loops=1) |
Output: _hyper_535_40702_chunk.id, _hyper_535_40702_chunk.type_id, _hyper_535_40702_chunk.ts |
Vectorized Filter: (_hyper_535_40702_chunk.id = ANY ('{3,330050}'::integer[])) |
Rows Removed by Filter: 110000 |
Batches Removed by Filter: 111 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40712_chunk (cost=0.00..11.69 rows=693 width=96) (actual time=0.003..0.026 rows=111 loops=1) |
Output: compress_hyper_536_40712_chunk.id, compress_hyper_536_40712_chunk.type_id, compress_hyper_536_40712_chunk.ts, compress_hyper_536_40712_chunk._ts_meta_count, compress_hyper_536_40712_chunk._ts_meta_sequence_num, compress_hyper_536_407|
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40703_chunk (cost=0.02..11.69 rows=693000 width=16) (actual time=0.761..0.761 rows=0 loops=1) |
Output: _hyper_535_40703_chunk.id, _hyper_535_40703_chunk.type_id, _hyper_535_40703_chunk.ts |
Vectorized Filter: (_hyper_535_40703_chunk.id = ANY ('{3,330050}'::integer[])) |
Rows Removed by Filter: 110000 |
Batches Removed by Filter: 111 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40713_chunk (cost=0.00..11.69 rows=693 width=96) (actual time=0.003..0.027 rows=111 loops=1) |
Output: compress_hyper_536_40713_chunk.id, compress_hyper_536_40713_chunk.type_id, compress_hyper_536_40713_chunk.ts, compress_hyper_536_40713_chunk._ts_meta_count, compress_hyper_536_40713_chunk._ts_meta_sequence_num, compress_hyper_536_407|
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40704_chunk (cost=0.02..11.69 rows=693000 width=16) (actual time=0.016..0.744 rows=1 loops=1) |
Output: _hyper_535_40704_chunk.id, _hyper_535_40704_chunk.type_id, _hyper_535_40704_chunk.ts |
Vectorized Filter: (_hyper_535_40704_chunk.id = ANY ('{3,330050}'::integer[])) |
Rows Removed by Filter: 109999 |
Batches Removed by Filter: 110 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40714_chunk (cost=0.00..11.69 rows=693 width=96) (actual time=0.004..0.031 rows=111 loops=1) |
Output: compress_hyper_536_40714_chunk.id, compress_hyper_536_40714_chunk.type_id, compress_hyper_536_40714_chunk.ts, compress_hyper_536_40714_chunk._ts_meta_count, compress_hyper_536_40714_chunk._ts_meta_sequence_num, compress_hyper_536_407|
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40705_chunk (cost=0.02..11.69 rows=693000 width=16) (actual time=0.650..0.650 rows=0 loops=1) |
Output: _hyper_535_40705_chunk.id, _hyper_535_40705_chunk.type_id, _hyper_535_40705_chunk.ts |
Vectorized Filter: (_hyper_535_40705_chunk.id = ANY ('{3,330050}'::integer[])) |
Rows Removed by Filter: 110000 |
Batches Removed by Filter: 111 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40718_chunk (cost=0.00..11.69 rows=693 width=96) (actual time=0.007..0.028 rows=111 loops=1) |
Output: compress_hyper_536_40718_chunk.id, compress_hyper_536_40718_chunk.type_id, compress_hyper_536_40718_chunk.ts, compress_hyper_536_40718_chunk._ts_meta_count, compress_hyper_536_40718_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Query Identifier: -7444174897097278007 |
Planning Time: 0.874 ms |
Execution Time: 3.682 ms |
2.
QUERY PLAN                                                                                                                                                                                                                                                     |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Append (cost=0.16..254.43 rows=385000 width=16) (actual time=0.174..0.274 rows=1 loops=1) |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40701_chunk (cost=0.16..12.39 rows=77000 width=16) (actual time=0.057..0.057 rows=0 loops=1) |
Output: _hyper_535_40701_chunk.id, _hyper_535_40701_chunk.type_id, _hyper_535_40701_chunk.ts |
Vectorized Filter: (_hyper_535_40701_chunk.id = 330050) |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40711_chunk (cost=0.00..12.39 rows=77 width=96) (actual time=0.056..0.056 rows=0 loops=1) |
Output: compress_hyper_536_40711_chunk.id, compress_hyper_536_40711_chunk.type_id, compress_hyper_536_40711_chunk.ts, compress_hyper_536_40711_chunk._ts_meta_count, compress_hyper_536_40711_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Filter: ((compress_hyper_536_40711_chunk._ts_meta_min_1 <= 330050) AND (compress_hyper_536_40711_chunk._ts_meta_max_1 >= 330050)) |
Rows Removed by Filter: 111 |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40702_chunk (cost=0.16..12.39 rows=77000 width=16) (actual time=0.048..0.048 rows=0 loops=1) |
Output: _hyper_535_40702_chunk.id, _hyper_535_40702_chunk.type_id, _hyper_535_40702_chunk.ts |
Vectorized Filter: (_hyper_535_40702_chunk.id = 330050) |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40712_chunk (cost=0.00..12.39 rows=77 width=96) (actual time=0.048..0.048 rows=0 loops=1) |
Output: compress_hyper_536_40712_chunk.id, compress_hyper_536_40712_chunk.type_id, compress_hyper_536_40712_chunk.ts, compress_hyper_536_40712_chunk._ts_meta_count, compress_hyper_536_40712_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Filter: ((compress_hyper_536_40712_chunk._ts_meta_min_1 <= 330050) AND (compress_hyper_536_40712_chunk._ts_meta_max_1 >= 330050)) |
Rows Removed by Filter: 111 |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40703_chunk (cost=0.16..12.39 rows=77000 width=16) (actual time=0.049..0.049 rows=0 loops=1) |
Output: _hyper_535_40703_chunk.id, _hyper_535_40703_chunk.type_id, _hyper_535_40703_chunk.ts |
Vectorized Filter: (_hyper_535_40703_chunk.id = 330050) |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40713_chunk (cost=0.00..12.39 rows=77 width=96) (actual time=0.049..0.049 rows=0 loops=1) |
Output: compress_hyper_536_40713_chunk.id, compress_hyper_536_40713_chunk.type_id, compress_hyper_536_40713_chunk.ts, compress_hyper_536_40713_chunk._ts_meta_count, compress_hyper_536_40713_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Filter: ((compress_hyper_536_40713_chunk._ts_meta_min_1 <= 330050) AND (compress_hyper_536_40713_chunk._ts_meta_max_1 >= 330050)) |
Rows Removed by Filter: 111 |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40704_chunk (cost=0.16..12.39 rows=77000 width=16) (actual time=0.020..0.071 rows=1 loops=1) |
Output: _hyper_535_40704_chunk.id, _hyper_535_40704_chunk.type_id, _hyper_535_40704_chunk.ts |
Vectorized Filter: (_hyper_535_40704_chunk.id = 330050) |
Rows Removed by Filter: 2999 |
Batches Removed by Filter: 2 |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40714_chunk (cost=0.00..12.39 rows=77 width=96) (actual time=0.004..0.043 rows=3 loops=1) |
Output: compress_hyper_536_40714_chunk.id, compress_hyper_536_40714_chunk.type_id, compress_hyper_536_40714_chunk.ts, compress_hyper_536_40714_chunk._ts_meta_count, compress_hyper_536_40714_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Filter: ((compress_hyper_536_40714_chunk._ts_meta_min_1 <= 330050) AND (compress_hyper_536_40714_chunk._ts_meta_max_1 >= 330050)) |
Rows Removed by Filter: 108 |
-> Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_535_40705_chunk (cost=0.16..12.39 rows=77000 width=16) (actual time=0.047..0.047 rows=0 loops=1) |
Output: _hyper_535_40705_chunk.id, _hyper_535_40705_chunk.type_id, _hyper_535_40705_chunk.ts |
Vectorized Filter: (_hyper_535_40705_chunk.id = 330050) |
Bulk Decompression: true |
-> Seq Scan on _timescaledb_internal.compress_hyper_536_40718_chunk (cost=0.00..12.39 rows=77 width=96) (actual time=0.047..0.047 rows=0 loops=1) |
Output: compress_hyper_536_40718_chunk.id, compress_hyper_536_40718_chunk.type_id, compress_hyper_536_40718_chunk.ts, compress_hyper_536_40718_chunk._ts_meta_count, compress_hyper_536_40718_chunk._ts_meta_sequence_num, compress_hyper_536_407|
Filter: ((compress_hyper_536_40718_chunk._ts_meta_min_1 <= 330050) AND (compress_hyper_536_40718_chunk._ts_meta_max_1 >= 330050)) |
Rows Removed by Filter: 111 |
Query Identifier: 6492221695508221030 |
Planning Time: 0.672 ms |
Execution Time: 0.328 ms |
Both plans do a sequential scan over the compressed chunks. Is there a way to optimize such plans?
This problem keeps me from compressing a few hypertables.
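One rewrite I am considering, based on the fact that the single-value plan above does get a `_ts_meta_min_1`/`_ts_meta_max_1` filter on the compressed chunks while the `IN` plan does not, is to split the `IN` list into separate equality lookups. This is just a sketch under that assumption; I have not verified it helps on large tables:

```sql
-- Sketch: replace the IN list with a union of single-value lookups,
-- hoping each branch gets the min/max metadata filter that the
-- "id = 330050" plan shows (assumption, not verified at scale).
SELECT * FROM hypertable WHERE id = 3
UNION ALL
SELECT * FROM hypertable WHERE id = 330050;
```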
The example above is simple, but imagine some other query that joins our compressed hypertable by id:
-- This query will run for a long time...
SELECT some_id, created_at,
(SELECT some_col FROM hypertable AS h WHERE h.id = t1.id) AS some_col
FROM some_table_or_hypertable AS t1
WHERE <condition for t1>;
-- or
SELECT some_id, created_at, t1.some_col
FROM some_table_or_hypertable AS t1
LEFT/FULL/... JOIN hypertable AS h ON h.id = t1.id
WHERE <condition for t1>;
On big hypertables such queries take a very long time, and after I decompressed the chunks the queries ran fast again.
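For the join case, the same idea might be expressed with a LATERAL subquery, so the compressed hypertable is probed one id at a time instead of through a join condition. Again only a sketch: it assumes single-value predicates are pushed down to the compressed chunk metadata, reuses `some_table_or_hypertable` and its columns from the example above, and the `WHERE` clause is a hypothetical stand-in for the real condition:

```sql
-- Hypothetical rewrite of the join example: probe the compressed
-- hypertable once per outer row, so each probe is a single-value
-- id lookup (which the plans above show gets batch pruning).
SELECT t1.some_id, t1.created_at, h.*
FROM some_table_or_hypertable AS t1
LEFT JOIN LATERAL (
    SELECT * FROM hypertable AS ht WHERE ht.id = t1.id
) AS h ON true
WHERE t1.some_id > 0;  -- stand-in for <condition for t1>
```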