Hi All,
I am using TimescaleDB at my company to store market tick data. The frequency of incoming ticks is quite high: since I started the service in production on 15th May 2023, around 50 million tick rows have been stored in the database.
From this data, I run a query to build OHLC (open/high/low/close) buckets.
Here is the query:
SELECT extract(epoch from time_bucket('1 min', to_timestamp(time/1000.0) AT TIME ZONE 'UTC')) * 1000 AS bucket,
first(ltp, time) as open,
max(ltp) as high,
min(ltp) as low,
last(ltp, time) as close,
last(volume, time) as volume
FROM ltpTable
WHERE time >= 1684124540000 AND time < 1685008841397 AND symbol = 'HDFC'
GROUP BY bucket
ORDER BY bucket ASC;
The time interval is only about 10 days, yet the response time is on the order of 2-3 seconds, which is far from ideal. The expected response time is 50-70 ms.
Please note that there are major gaps in the data, meaning there is no data at all for some days. Could these gaps be causing the high response time? I don't see how they could be, but I want to rule it out.
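To help with diagnosis, this is the same query wrapped in EXPLAIN (ANALYZE, BUFFERS), which I can run and share the plan output from if that would help:

EXPLAIN (ANALYZE, BUFFERS)
SELECT extract(epoch from time_bucket('1 min', to_timestamp(time/1000.0) AT TIME ZONE 'UTC')) * 1000 AS bucket,
       first(ltp, time) AS open,
       max(ltp) AS high,
       min(ltp) AS low,
       last(ltp, time) AS close,
       last(volume, time) AS volume
FROM ltpTable
WHERE time >= 1684124540000 AND time < 1685008841397 AND symbol = 'HDFC'
GROUP BY bucket
ORDER BY bucket ASC;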
My table has the following columns (rough DDL is below): a surrogate primary key (indexed), time, symbol, ltp and volume.
I have created two indexes:
- A composite index on (time, symbol)
- A covering index on time, symbol, ltp and volume.
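For reference, the table and indexes were created roughly as sketched below. The column types, the surrogate key name "id", the index names, and the use of INCLUDE for the covering index are approximate (reproduced from memory) and may not match the production DDL exactly:

CREATE TABLE ltpTable (
    id     BIGSERIAL PRIMARY KEY,   -- surrogate primary key (name/type approximate)
    time   BIGINT NOT NULL,         -- epoch milliseconds, as used in the query above
    symbol TEXT   NOT NULL,         -- instrument identifier, e.g. 'HDFC'
    ltp    DOUBLE PRECISION,        -- last traded price
    volume BIGINT
);

-- Composite index on time and symbol
CREATE INDEX ltp_time_symbol_idx ON ltpTable (time, symbol);

-- Covering index on time and symbol, including ltp and volume
-- (written here with INCLUDE; it may actually be a plain four-column index)
CREATE INDEX ltp_time_symbol_covering_idx ON ltpTable (time, symbol) INCLUDE (ltp, volume);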
And here is my AWS configuration for TimescaleDB:
Currently I am using a t3.medium instance.
By default it has a CPU request of 0.5 CPU (500 millicores) and a memory request of 64Mi.
I have set it to scale when CPU utilisation reaches 70% and memory reaches 80%, but it hasn't crossed those thresholds yet.
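Since the memory request is quite small, I can also share the Postgres memory and parallelism settings. These are the standard settings I would check (plus the TimescaleDB 2.x hypertable view, in case that is relevant):

SHOW shared_buffers;                    -- shared buffer cache size
SHOW work_mem;                          -- per-sort/per-hash working memory
SHOW effective_cache_size;              -- planner's estimate of available OS cache
SHOW max_parallel_workers_per_gather;   -- parallel workers per query
SELECT * FROM timescaledb_information.hypertables;  -- hypertable/chunk overview (TimescaleDB 2.x)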
Is the poor performance due to my DB schema, or is some change needed in my machine configuration?
Kindly let me know. Also, if anyone has worked with Druid: is Druid's performance better than TimescaleDB's for this use case?
Thanks.