Hi, I am creating a continuous aggregate on a pretty big table, about 2 billion rows. The continuous aggregate seems to be taking up almost all of the remaining free space on the disk, and it has been running for half a day now. Besides aggregation operations over many columns, it also parses JSON, so I'd call it a complex aggregate both in what it does and in the number of columns involved.

Will this eat my entire disk until it collapses? Is the disk space it is using freed afterwards? Is there any general recommendation for running aggregates over large tables? Also, what is a common strategy for modelling complex aggregates? Is there a way to test without running the aggregate over the entire table, for example by building the aggregate over a one-year period only?

Thanks
Hi @MasterKraft, that’s a very common issue with large tables like yours.
Create the materialized view WITH NO DATA
and just paginate the refreshes.
Make your calls to refresh_continuous_aggregate
in a time window that your hardware can manage. Think something like 1 week or 1 month and keep advancing sequentially. Every call will make better use of memory and CPU resources, with shorter and more efficient reads and locks.
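Roughly what I mean, as a minimal sketch: it assumes a hypertable called `conditions` with a `time` column, a `device_id` column and a JSON `payload` column, so adjust the names and the aggregated expressions to your real schema.

```sql
-- Create the continuous aggregate empty: the CREATE itself is then cheap
-- and does not try to materialize 2 billion rows in one go.
CREATE MATERIALIZED VIEW conditions_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket,
       device_id,
       avg((payload->>'temperature')::numeric) AS avg_temp,
       max((payload->>'temperature')::numeric) AS max_temp
FROM conditions
GROUP BY bucket, device_id
WITH NO DATA;
```

Then backfill it window by window with `refresh_continuous_aggregate`, sized to what your hardware can handle.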
Hey Jonatas. Thanks. The way I am doing it is to initially create the view over a shorter time frame to compute test views, and that seems to work OK (a simplified sketch is below).
I will keep your recommendation about the refreshes in mind; I'm not at that stage yet, but thank you for your input, as always.
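Something like this is what I mean by a test view (a simplified sketch, not my real schema: a plain, non-continuous materialized view restricted to one year, just to validate the aggregation and JSON parsing before committing to the full continuous aggregate):

```sql
-- Throwaway test view over a single year of data; table and column
-- names (conditions, time, payload, device_id) are placeholders.
CREATE MATERIALIZED VIEW conditions_daily_test AS
SELECT time_bucket('1 day', time) AS bucket,
       device_id,
       avg((payload->>'temperature')::numeric) AS avg_temp
FROM conditions
WHERE time >= '2020-01-01' AND time < '2021-01-01'
GROUP BY bucket, device_id;
```

Once the query looks right and the cost is acceptable, the test view can be dropped and the real continuous aggregate created WITH NO DATA as suggested above.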
Hi Jonatas, I’m back
After some testing, it’s time to do this. Just to make sure:
What you meant is to first create the continuous aggregate WITH NO DATA, and then:
CALL refresh_continuous_aggregate('conditions', '2020-01-01', '2020-02-01');
Then call refresh like this sequentially, for example month by month, until the last timestamp on the hypertable. Correct?
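In other words, something like this, run one after another (I believe each CALL needs to be its own top-level statement, since refresh_continuous_aggregate can't run inside an explicit transaction block):

```sql
CALL refresh_continuous_aggregate('conditions', '2020-01-01', '2020-02-01');
CALL refresh_continuous_aggregate('conditions', '2020-02-01', '2020-03-01');
CALL refresh_continuous_aggregate('conditions', '2020-03-01', '2020-04-01');
-- ...and so on, one month at a time, until the window reaches the most
-- recent timestamp in the hypertable.
```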
EDIT: @jonatasdp got it, it’s working nicely now. Thanks for your help!