I’m in the process of migrating from Timestream to Timescale Cloud and want to get off on the right foot for ingestion and real-time aggregates.
I get meter data with an accumulating “consumption” value, meaning inbound data looks like (meter_id, total_consumption, measured_at)
. When we use the data, I need to also know the delta consumption between two adjacent measurements, such that I can easily tell net consumption for a given meter in a given time window.
So given the measurements
(meter_id, total_consumption, measured_at)
(4, 10, 2024-07-11 20:40:00)
(4, 14, 2024-07-11 20:50:00)
(4, 18, 2024-07-11 21:10:00)
(4, 20, 2024-07-11 21:20:00)
The the hourly aggregates would look like
(meter_id, delta_consumption, measured_hour)
(4, 4, 2024-07-11 20:00:00) -- window 1 (14 - 10)
(4, 6, 2024-07-11 21:00:00) -- window 2 (18 - 14 + 20 - 18)
I can solve this in two ways:
-
A trigger on INSERT finds most recent earlier record and calculates the delta. This will cause challenges on late data, solvable but not great. The materialized view would be easy to make.
-
Use a LAG function in a view that’s used to calculate the deltas for the materialized view. This method would be resilient to late arrivals.
However, I’m unsure why the continuous aggregate will recalculate the subsequent window 2 in case of a late arrival, such as (4, 16, 2024-07-11 20:55:00)
, into window 1. In a small test I can confirm that window 2 does get recalculated, I’d appreciate understanding why that happens, what triggers a recalculation for other windows than the one to which a new record is added?
What would a recommended approach be?