In brief:
SELECT queries on different chunks are slow
TimescaleDB version: TimescaleDB 2.12.2 on 2023-10-20
PostgreSQL version: postgresql-15
Other software:
OS: linux
Platform: aws ec2 i4i.4xlarge, disk gp3, iops=16000, throuput=1000 MB/s
Install method:Deb
Environment: Test
Hello,
I’m testing performance on a self-managed instance of timescaleDB on a pretty powerfull machine (aws ec2 i4i.4xlarge, disk gp3, iops=16000, throuput=1000 MB/s).
I had network problem and connection reset by timescale cloud, so now I’m running my EC2 with client and server on the same machine.
My test consists of loading a big dataset manualy and then SELECT some datas to measure the performance.
There is 300 devices (devices are boats, their identifier is imo number), having 300 metrics each, produced each minute. Retention 6 months.
I choose a narrow layout (ts, imo, key, value) there are ~23.7E9 in this single table. (partition by ts, chunk of 7days)
CREATE TABLE raw (
ts TIMESTAMPTZ NOT NULL,
imo INT NOT NULL,
key TEXT NOT NULL,
value REAL NOT NULL --NULL?
);
SELECT create_hypertable('raw', 'ts', chunk_time_interval => INTERVAL '7 days');
-- partitioning disabled: only with low cardinality and multi-node architecture
Read access with an expected response time P99.9 =< 2sec:
- Typical: 1 week, 14 metrics, 1 minute time-base => ~140K points
- Worst case: time periode: 2 months, 21 metrics, 1 minute time-base => ~1.85M points
I’ve tried to use compressed chunk or not, but the performance are slow! (select, on cold start, on different chunks are slow):
select ts, key, value from raw
where
imo ='57' and
key = '22' and
ts >= '2023-02-01T00:00:00' and
ts < '2023-03-01T00:00:00'
order by ts;
--40320 rows: 29s (if compressed: 25s). response time linear by time range
Do you think this is an expected performance?
Can you suggest any improvement in term of modeling, query, chunking, access pattern?
Thank you