Hello!
We have approximately 400,000 records per day. The following SELECT takes about 1.5 seconds for one day of data:
SELECT
    time_bucket('1 day', t."date") AS bucket,
    t.col2,
    t.col3,
    t.col4,
    t.col5,
    t.col6,
    SUM(t.col7) AS col7,
    SUM(t.col8) AS col8,
    SUM(t.col9) AS col9,
    COUNT(1) AS col10,
    SUM(t.col11) AS col11,
    SUM(t.col11 * t.col7) AS col11_7,
    SUM(t.col12) AS col12,
    SUM(t.col13) AS col13,
    SUM(t.col14) AS col14,
    SUM(t.col15) AS col15,
    SUM(t.col16) AS col16,
    MAX(t.col17) AS col17,
    SUM(t.col18) AS col18,
    SUM(t.col19) AS col19,
    SUM(t.col20) AS col20,
    SUM(t.col21) AS col21,
    SUM(t.col22) AS col22,
    SUM(CASE WHEN t.col9 > 0 THEN 1 ELSE 0 END) AS col23
FROM
    hypertable_7days t
WHERE
    t."date" >= '2022-10-22' AND t."date" < '2022-10-23'
GROUP BY
    1,
    t.col2,
    t.col3,
    t.col4,
    t.col5,
    t.col6;
Here is its query plan:
Finalize GroupAggregate  (cost=22468.79..49364.92 rows=47940 width=585) (actual time=536.616..2491.484 rows=85429 loops=1)
  Group Key: (time_bucket('1 day'::interval, "date")), col2, col3, col4, col5, col6
  ->  Gather Merge  (cost=22468.79..45194.14 rows=191760 width=585) (actual time=536.584..1273.953 rows=176014 loops=1)
        Workers Planned: 4
        Workers Launched: 3
        ->  Partial GroupAggregate  (cost=21468.77..23518.20 rows=47940 width=585) (actual time=487.139..1022.525 rows=44004 loops=4)
              Group Key: (api_bet.time_bucket('1 day'::interval, "date")), col2, col3, col4, col5, col6
              ->  Sort  (cost=21468.77..21528.70 rows=119849 width=121) (actual time=487.080..552.494 rows=122139 loops=4)
                    Sort Key: (api_bet.time_bucket('1 day'::interval, "date")), col2, col3, col4, col5, col6
                    Sort Method: external merge  Disk: 20600kB
                    Worker 0:  Sort Method: external merge  Disk: 15160kB
                    Worker 1:  Sort Method: external merge  Disk: 15152kB
                    Worker 2:  Sort Method: external merge  Disk: 15176kB
                    ->  Result  (cost=0.09..14887.62 rows=119849 width=121) (actual time=0.063..140.289 rows=122139 loops=4)
                          ->  Parallel Index Scan using _hyper_313_7688_chunk_table_7days_date_idx on _hyper_313_7688_chunk t  (cost=0.09..14707.84 rows=119849 width=121) (actual time=0.057..115.494 rows=122139 loops=4)
                                Index Cond: (("date" >= '2022-10-22 00:00:00'::timestamp without time zone) AND ("date" < '2022-10-23 00:00:00'::timestamp without time zone))
Planning Time: 2.420 ms
Execution Time: 2503.031 ms
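One thing I notice in the plan is that the Sort node spills to disk (external merge, roughly 15-20 MB per worker). Would raising work_mem be expected to help here? For example (the 64MB value is just my guess at something large enough to keep the per-worker sorts in memory):

SET work_mem = '64MB';  -- hypothetical value, sized above the ~20MB per-worker spill seen in the plan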
A continuous aggregate (CAgg) based on the SELECT above takes about 1 hour 45 minutes to recalculate 600-700 days of data, i.e. roughly 2 to 5 seconds per day. Is it possible to improve the CAgg so that it recalculates one day in about 1 second?
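For context, here is a sketch of how the CAgg is defined on top of that SELECT (the view name is a placeholder, the aggregate list is abbreviated, and the WHERE clause is dropped because the refresh window selects the time range):

CREATE MATERIALIZED VIEW daily_cagg
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', t."date") AS bucket,
    t.col2, t.col3, t.col4, t.col5, t.col6,
    SUM(t.col7) AS col7,
    -- ... the remaining aggregates from the SELECT above ...
    SUM(CASE WHEN t.col9 > 0 THEN 1 ELSE 0 END) AS col23
FROM hypertable_7days t
GROUP BY 1, t.col2, t.col3, t.col4, t.col5, t.col6;

-- Recalculation is done with refresh_continuous_aggregate; the date range here is illustrative:
CALL refresh_continuous_aggregate('daily_cagg', '2021-01-01', '2022-10-23');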
What additional information can I provide?