I am new to TimeScale ; sorry if this is something simple, but I can’t find the solution on threads and google.
Let’s say I have a table with columns:
Time
Key1
Value
I configure it as an hypertable and set the retention policy to 7 days.
Next I create a Continuous Aggregates with a time bucket to 1 hour making the group on Key1, and AVG on Value.
Now, I want now to add a Key2 column on the original table, and update the Continuous Aggregate to group on Key1 and Key2.
I have seen I can ALTER the original table to add the new column with a default value successfully.
My question is : How can I update the materialized view / Continuous Aggregate to add the column without loosing the data (because I only have 7 days on the original table).
Unfortunately, you can’t alter a continuous aggregate, you’d have to build it again. To keep the data, then you’d maybe have to create a new one with the new column and retain the old one?
There are a couple of links here that explain this subject:
I should stop the Continuous Aggregate (but keep old data)
ALTER my source hypertable
Create a new Continuous Aggregate using the newly added column
Unify previous continuous aggregate with new one created using a view or something
In this case, is it a best practices to always version the continuous aggreagtes in case of a change, and always create a view to ‘hide’ the possible multiple aggreagtes on behind ?
Like this ?
SourceTable (hypetable)
SourceTable_Daily_v1 (continous aggregate)
SourceTable_Daily (View over SourceTable_Daily_v1)
That process, I think, would work, it would probably be a little bit weird and you’d have to be sure you don’t have excess data in the first agg, also that any invalidations are picked up correctly or done the way you want.
In terms of what you want to be able to do, it sounds like you’d like to be able to add a column to a continuous aggregate with a default value for the old stuff that doesn’t have it, is that correct? We are considering ways of making continuous aggregates simpler and understanding better what you’d like might help us as we think about some further options. Ideally, that would make your question about best practices obsolete in the future, but we do have a bit of implementation work to do on that, so I can’t say when all this will happen, but we’d love your input.
Sorry for my late answer, I’ve been under some work issues (not related to Timescale, don’t worry ).
It mostly performance counters ; it should insert data only in the last hour ; but indeed stopping aggregate and creating a new one may requires some attention.
Exactly, this is new column(s) with a default value for previous entries. The current use case I have is the following :
Until now, I was measuring the performance of methods in the code. I was referencing it with 4 columns :
Assembly,
Namespace,
Class,
Method.
Now I also monitor the methods of API Controllers, so I would like to add some new columns hostname, HTTP return code, …
So for previous method’s performance counter, it does not matter, but from now on, I will have for some counters, additional values.
I’m in a similar situation. When building our hypertables, we opted to go with a wide table schema to hold a number of stats, and built continuous aggregates to consolidate historical data to lose granularity over time. We didn’t realize at the time that it would not be possible to add columns for other stats later without losing the historical data in the materialized views. Perhaps a narrow table strategy would have been wiser. Looking forward to any possible solutions the tsdb team can come up with. Happy to provide input on use cases if it’s useful.
If I have multiple stocks with identical timestamps but different IDs, I need the primary key to be a combinatory key (datetime plus asset ID) this can be done yes?
For this kind of requirement in a single node we’d suggest multiple tables. If you need more let me know and I’ll see if I can get one of the team to check in on your use case. Hope this helps.
Just to chime in that this is something that we would also appreciate a lot:
We have a large table (well, several actually) where we want to add a new column with a default value of 0, and need to migrate the materialized view (created using continuous aggregate) as well in order to preserve access to historical data. Setting a single default value for the new column for all current data in the materialized view would be perfectly acceptable.
Are there any plans for such functionality, such that it may be worth waiting for it, or should we try the work-around?
Glad I read this article because it got me thinking - as a startup, knowing that we can save time not building our own materialized view mechanism (non trivial) but at the cost of having to define that view in terms of an immutable schema is a quite scary.
But what really got me thinking was our intention to apply a data retention policy. If we do that then no matter how we build the view, we’d have a real process to go through to rebuild it - taking our raw data, rehydrating it (in a production system) and rebuilding the view.
I’m think I’m going to focus on the tiered storage approach initially? At least while our solution domain stabilizes, the thought that if we have to recreate a materialized view, we can bypass having to re-load our data seems pretty compelling.
Thank you @ldubrois for this post.
I’m in the same situation, and I know I will have to add more columns to the view in the future.
I need a comfortable process to manage it and preserve my aggregations because I have a retention policy on raw data of course, like @kelly.cliffe I would like to avoid recalculate everything and lose some of them.
Is there something new here ? What is the best approach ?