Querying latest few events on a continuous aggregate is very slow

ianf · January 10, 2023, 2:05pm

Hi, I am seeing some surprisingly slow query times when trying to select the latest events on my continuous aggregate.

These timeseries are about 0.1Hz, so 1 event every 10 sec, and new data is coming in all the time. The db in question is running timescale 2.6.0, self hosted in AKS.

This query to get the latest 2 events takes 88 seconds:
SELECT * FROM vector_events_agg_5m WHERE vector_stream_id = 12194 ORDER BY agg_interval DESC limit 2;
https://explain.depesz.com/s/VRBi7
So the query seems to cause the planner to go and get hundreds of thousands of aggregates, before returning the 2 that are needed.

By comparison, this similar query, just filtering that the events must be in the last 10 days takes 2.5 seconds (still not great, but a lot better)
SELECT * FROM vector_events_agg_5m WHERE vector_stream_id = 10615 AND agg_interval > '20230101' ORDER BY agg_interval DESC limit 2;
https://explain.depesz.com/s/m4ci

I have also tested on another server running 2.9.0, where querying without the extra filter is clearly better than 2.6.0
https://explain.depesz.com/s/pAdj
While I don’t think 4.5 sec is all that fast, it is certainly a huge improvement over timescale 2.6.0

The db servers are generally healthy, policies are running as they should, and the continuous agg high watermark is always about 10-15 min in the past.
The continuous aggregate policy has a 10 minute end_offset, 30 day start_offset, 10 minute schedule_interval.
There is an ongoing ingest of data, so in most cases the latest 2 events will be within the 10 minutes that have not yet been materialized. Some ingest will be late-arriving but never older than 30 days.

The definition is as follows:

CREATE MATERIALIZED VIEW vector_events_agg_5m
WITH (timescaledb.continuous) AS
SELECT ev.vector_stream_id, time_bucket('5 minute', ev.event_time) AS agg_interval, SUM(ev.event_data0) as sum0, MIN(ev.event_data0) as min0, MAX(ev.event_data0) as max0, COUNT(ev.event_data0) as count
FROM vector_events ev
WHERE ev.event_data0 != 'NaN'
GROUP BY vector_stream_id, agg_interval;

While in most cases that latest events will be very recent, will be a few where the latest aggregate may be some weeks or months in the past - so it’s not possible to always include the ‘after now - 1 week’ filter.

If I test on another db running 2.6.0, where there is no live ingest of data then the query response when getting latest events is quite good, even without the lower bound.
https://explain.depesz.com/s/63lp
I suppose that in this case, there are stats to guide the planner to look in the latest materialized chunk.

So my question is, how can I best improve my query, or continuous agg definition, or even my planner settings to improve my results? Taking 90 seconds for a query like this is worse than useless, as the query puts a big load on the server, while the dashboards relying on it will timeout or cause users to spam refresh.
I hope our servers will all be running 2.9.0 in the next couple of weeks but even then, 4.5 sec is quite a lot worse than I’d have hoped for. Intuitively I feel like if the query planner made better choices it should usually be possible to return in under 1 sec.

I don’t know how to formulate this, but can I write a select query to say

SELECT * FROM vector_events_agg_5m WHERE stream_id=1234 AND agg_interval > *cagg_highwatermark* LIMIT 2
*if this does not yield 2 rows then union with*
SELECT * FROM vector_events_agg_5m WHERE stream_id=1234 AND agg_interval < *cagg_highwatermark* LIMIT 2

jonatasdp · January 10, 2023, 2:35pm

Hi @ianf, I know how frustrating it feels when you don’t reach the performance you want.

Let’s dive into some details here. As you’re already aware, you need to use a filter by time to minimize the chunk selection. You can also be reducing chunk size to use less memory and fast lookup into data. If you’re compressing data, a good strategy is also be using segment_by and order entries.

Note that IndexCond can be very expensive if the indices are in large datasets. I’d recommend you combine the stream_id and time to have more efficient index strategy instead of lookup all the data.

Also, think about the materialized views (caggs) as indices that can help you to get the latest information. If you create a daily aggregation for your vector_events, you can fastly find the latest day that contains data for a specific stream_id and just filter by that day.

ianf · January 11, 2023, 1:48pm

I had a look at the chunk sizes, the underlying raw data is divided into 1 day chunks, each about 1.5GB in size, server RAM is 8GB, so that should be fine I think.
This 5minute continuous agg has the default timescale chunk size of 10 days. However now that I look at the chunk size on the cagg
SELECT chunk_name, pg_size_pretty(total_bytes) AS size FROM chunks_detailed_size('_timescaledb_internal._materialized_hypertable_11') ORDER BY total_bytes desc;
I’m a bit surprised to see that 35 of the 90 chunks are > 3GB in size, and 15 are > 4GB. So that could be a problem I hadn’t considered…
The raw event row schema was designed to be as lean as possible: int, timestamp, double, and one btree index on stream_id,timestamp
Whereas the continuous agg has int, timestamp, double x3, bigint and 2 btree indexes - one on timestamp (required by timescale) and one on streamid, timestamp which I use. So the table size is quite a lot larger than I had expected. I had made an initial assumption that a 5 min agg would be 1/300th the size of the raw table given ingest of 1Hz, so that was some way off

I don’t think I can change the row size on my cagg since the requirement is to get min, max, sum and count. Is it possible to alter the timerange of the continuous aggregate chunks? If necessary I can drop the existing aggregate and recreate.

If you create a daily aggregation for your vector_events, you can fastly find the latest day that contains data for a specific stream_id and just filter by that day.

I’m not sure I understood this - we already have a 1 day continuous aggregate, so should I use the faster result from this agg to guide the lower bound of the query on the 5min aggregate.
Then I can try to express the query with a subselect like this:

SELECT * FROM vector_events_agg_5m WHERE vector_stream_id=1450 AND agg_interval >= 
(SELECT agg_interval FROM vector_events_agg_1day WHERE vector_stream_id=1450 ORDER BY agg_interval DESC OFFSET 1 LIMIT 1) 
ORDER BY agg_interval desc LIMIT 2;

This takes about 8 seconds on timescale 2.6.0, here’s the analyze result. YVJi : Latest 2 events on 5min continuous agg, with lower bound set from result of latest 2 events from 1 day continuous agg. | explain.depesz.com
I really hadn’t considered using coarser grained aggregates like this, it’s 10x faster than my first query, but still quite slow. Perhaps reducing the chunk size on the continuous agg will help more

jonatasdp · January 11, 2023, 7:05pm

Hi @ianf, nice discoveries!

I’m not sure I understood this - we already have a 1 day continuous aggregate, so should I use the faster result from this agg to guide the lower bound of the query on the 5min aggregate.

Yes!

I think if you use a materialized CTE it will be faster because it will be processed once:

Something like:

WITH MATERIALIZED last AS (
  SELECT agg_interval FROM vector_events_agg_1day WHERE vector_stream_id=1450 ORDER BY agg_interval DESC OFFSET 1 LIMIT 1)
SELECT * FROM  last, vector_events_agg_5m WHERE vector_stream_id=1450 AND agg_interval >= last.agg_interval

I haven’t tested the syntax here, but it’s something along these lines.

Is it possible to alter the time range of the continuous aggregate chunks? If necessary I can drop the existing aggregate and recreate it.

I’m not sure if you’ll be able to make it without recreating it.

ianf · January 13, 2023, 3:06pm

To update, now I have upgraded the db from 2.6.0 to 2.9.1 (ran into another problem running ALTER EXTENSION here, but I’ll leave that to another topic)

My query performance is still pretty lousy, (30 sec+) regardless of whether I query using the simple approach or using the 1 day aggregate as a guide.

WITH last AS MATERIALIZED (SELECT agg_interval FROM vector_events_agg_1day WHERE vector_stream_id=12194 ORDER BY agg_interval DESC OFFSET 1 LIMIT 1)
SELECT * FROM  last, vector_events_agg_5m AS agg WHERE vector_stream_id=12194 AND agg.agg_interval >= last.agg_interval LIMIT 2;

I really don’t understand this right now, but I will just keep going and try reducing chunk size on my continuous aggregate.

About resizing the continuous aggregate the only way I have found to do this is using set_chunk_time_interval eg:
SELECT set_chunk_time_interval('_timescaledb_internal._materialized_hypertable_55', INTERVAL '4 days');
This works, but is it only possible to alter chunk range interval on an existing continuous aggregate? (ie not possible in the create script itself). So I am just wondering how I can do it in my db create scripts - since the function call requires that I pass the hypertable name of the cagg.

Anyway, thank you, I will let you know how I get on with resizing.

davidkohn · January 13, 2023, 8:50pm

Hello!

This looks like a bug in TimescaleDB to me! There’s a couple of things going on here I think, but likely one of them is this: [Bug]: Very slow performance of cagg_watermark function, when running multiple queries in parallel on cagg with real time aggregation on · Issue #4699 · timescale/timescaledb · GitHub

There’s also the fact that you may be on the old continuous aggregates, which definitely is not going to help things.

The final one might be a new bug, the union you’re talking about is, at least according to my understanding, exactly what we’re supposed to be doing in general with continuous aggregates. The fact that it works better for a weird case might be something odd going on. We also shoudl have a better way of having a continuous aggregate that * only* stores the last n most recent things. That’s a big one that I want to work on and which is that it looks like you might have found a case where something is making the optimization that does exactly what you’re saying around the cagg watermark stuff.

So what I’m going to do is ask someone from the DB team to take a bit more of a look at this and then maybe ask you to file a bug on Github.

Thanks for this!

ianf · January 17, 2023, 10:17am

Thanks, I will do that, I’m not sure if I will have much for repro steps though, but I have a couple of thoughts about how our pattern of ingest may impact the result.

Now I have upgraded to 2.9.1, dropped, recreated and refreshed my continuous aggregates, and am really puzzled that the query time hasn’t improved as expected.

I am in an absurd situation where I can enter the following query:

WITH latest_day AS MATERIALIZED (SELECT agg_interval day_hint FROM vector_events_agg_1day WHERE vector_stream_id=13390 ORDER BY agg_interval DESC OFFSET 1 LIMIT 1)
SELECT * from latest_day;

Which takes < 1 second to return the date ‘2023-01-16 00:00:00+00’

Then I can read the result from the above and type in the following query:
SELECT agg5m.* FROM vector_events_agg_5m agg5m WHERE agg5m.vector_stream_id=13390 AND agg5m.agg_interval >= '2023-01-16 00:00:00+00' ORDER BY agg5m.agg_interval DESC LIMIT 2;

and again get the in under a second.

But, if I combine the two into a single query:

WITH latest_day AS MATERIALIZED (SELECT agg_interval day_hint FROM vector_events_agg_1day WHERE vector_stream_id=13390 ORDER BY agg_interval DESC OFFSET 1 LIMIT 1)
SELECT agg5m.* FROM latest_day,vector_events_agg_5m agg5m WHERE agg5m.vector_stream_id=13390 AND agg5m.agg_interval >= latest_day.day_hint ORDER BY agg5m.agg_interval DESC LIMIT 2;

It takes 40 seconds, so it’s actually faster to type the 2 queries by hand!

For now I have altered the caggs to use MATERIALIZED ONLY, so that our responses for these latest aggregates are super fast, but not entirely accurate, since our aggregate refresh policy runs every 10 mins.