Dear community,
We are running self-hosted TimescaleDB inside a Docker container. Generally we are very happy with the performance and features, but we recently ran into a problem that I’m unable to track down.
The container is running on an AWS EC2 instance with 2 vCPUs (3.1 GHz) and 4 GB of RAM.
So definitely not the strongest machine out there … the container is not resource-restricted, so it can use all 4 GB of RAM and both vCPUs.
We have a table, let’s call it device_state, containing 200 million rows. We write approximately 2,000 rows per minute.
It was created using these statements:
CREATE TABLE device_state (
    "time" timestamptz NOT NULL,
    id_part_1 uuid NOT NULL,
    id_part_2 varchar NOT NULL,
    id_part_3 uuid NOT NULL,
    -- ... 16 more columns ...
);
SELECT create_hypertable('device_state', 'time', chunk_time_interval => INTERVAL '1 day');
CREATE UNIQUE INDEX ON device_state(id_part_1, id_part_2, id_part_3, time DESC);
CREATE INDEX ON device_state(id_part_1, id_part_2, time DESC);
ALTER TABLE device_state SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'id_part_1, id_part_2'
);
SELECT add_compression_policy('device_state', compress_after => INTERVAL '90d');
So, we have two indexes on the hypertable, and compression kicks in after 90 days. A new chunk is created every day, and chunks are around 900 MB in size.
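In case the exact numbers matter: I estimated the chunk sizes with a query along these lines (using TimescaleDB’s chunks_detailed_size function):

```sql
-- Per-chunk size breakdown for the hypertable,
-- largest chunks first
SELECT chunk_name, pg_size_pretty(total_bytes) AS total
FROM chunks_detailed_size('device_state')
ORDER BY total_bytes DESC;
```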
Now, queries that go narrow and deep, for example:
SELECT avg(some_column) AS avg
FROM device_state
WHERE time < (now() - INTERVAL '5 days')
  AND id_part_1 = 'abc'
  AND id_part_2 = 'def'
  AND id_part_3 = 'ghi'
GROUP BY id_part_1, id_part_2, id_part_3;
This takes very long to complete: 2 minutes 30 seconds for the roughly 120,000 rows that need to be found and averaged.
My question is: why? How can I check why it takes so long, and how can I fix it?
I would have expected the index to be used and the query to be fast.
Is my hardware (massively) under-sized for a table of this size, or is something else at work here?
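I assume the first diagnostic step would be to capture a plan with timing and buffer statistics, roughly like this (the id literals are placeholders, as in the example above):

```sql
-- Show the actual plan, per-node timings, and buffer hits/reads
EXPLAIN (ANALYZE, BUFFERS)
SELECT avg(some_column) AS avg
FROM device_state
WHERE time < (now() - INTERVAL '5 days')
  AND id_part_1 = 'abc'
  AND id_part_2 = 'def'
  AND id_part_3 = 'ghi'
GROUP BY id_part_1, id_part_2, id_part_3;
```

If someone can tell me what to look for in that output (index scan vs. sequential scan per chunk, buffer reads, etc.), that would already help a lot.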
Thank you all in advance!