How to determine whether a continuous aggregation has completed computation and storage?

NieYuan · July 6, 2024, 2:53am

Create a continuous aggregation:

CREATE MATERIALIZED VIEW WATER_DOWNSAMPLING_HOUR
WITH (TIMESCALEDB. CONTINUOUS,
TIMESCALEDB.MATERIALIZED_ONLY = FALSE)
AS
SELECT TIME_BUCKET('1 hour'::INTERVAL, "time") AS RECODE_TIME,
METER_CODE,
MAX(DQLJLCC)                            AS MAX_READING
FROM WATER_METER1
GROUP BY TIME_BUCKET('1 hour'::INTERVAL, "time"), METER_CODE
WITH NO DATA;

“With NO DATA” is used here.
This view can query results, but it is very time-consuming.
Later, I realized that it might not have completed the calculation and storage, so I performed a manual refresh:

call refresh_continuous_aggregate('water_downsampling_hour', '2023-01-01', '2024-01-01');

Its execution time has become very short.
So how to determine whether a continuous aggregation has completed computation and implementation?

jonatasdp · July 8, 2024, 6:28pm

Hi @NieYuan , if you create a refresh policy, you can use the timescaledb_information.job_stats to check the jobs running and you can see when it’s completed. You can also check the timescaledb_information.invalidation_log to check pending fragments that needs to be reprocessed.

andre · November 14, 2024, 1:08pm

Hi @NieYuan , if you create a refresh policy, you can use the timescaledb_information.job_stats to check the jobs running and you can see when it’s completed. You can also check the timescaledb_information.invalidation_log to check pending fragments that needs to be reprocessed.

So the only way to check if a refresh job is still ongoing is if it’s ran through a refresh policy? If I run it manually (through refresh_continuous_aggregate) the only way to know that it is finished is waiting for the function to return, is that it?

Also, what happens if I call refresh_continuous_aggregate before a previous refresh is finished?

jonatasdp · November 14, 2024, 1:40pm

There’s a lock to guarantee only one process at a time per hypertable.

andre · November 14, 2024, 2:31pm

There’s a lock to guarantee only one process at a time per hypertable.

Is there a way to check if it is locked?

jonatasdp · November 14, 2024, 8:16pm

You can check locks in general on Postgres and filter by your use-case:

Here are a few queries that may help you:

-- Current locks, including details about blocking/waiting processes
SELECT blocked_locks.pid AS blocked_pid,
       blocked_activity.usename AS blocked_user,
       blocking_locks.pid AS blocking_pid,
       blocking_activity.usename AS blocking_user,
       blocked_activity.query AS blocked_statement,
       blocking_activity.query AS blocking_statement,
       blocked_locks.mode AS blocked_mode,
       blocking_locks.mode AS blocking_mode,
       now() - blocked_activity.xact_start AS blocked_transaction_duration
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid
JOIN pg_catalog.pg_locks blocking_locks 
    ON blocked_locks.transactionid = blocking_locks.transactionid 
    AND blocked_locks.pid != blocking_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid
WHERE NOT blocked_locks.granted;

-- Simple view of locks with wait status 
SELECT pg_locks.pid,
       pg_stat_activity.usename,
       pg_class.relname,
       pg_locks.mode,
       pg_locks.granted,
       pg_stat_activity.query_start,
       pg_stat_activity.query
FROM pg_locks
LEFT JOIN pg_class ON pg_locks.relation = pg_class.oid
JOIN pg_stat_activity ON pg_locks.pid = pg_stat_activity.pid
WHERE pg_class.relname IS NOT NULL
ORDER BY pg_stat_activity.query_start;

-- Check for lock conflicts
SELECT DISTINCT ON (blocked_locks.pid)
       blocked_locks.pid AS blocked_pid,
       blocked_activity.usename AS blocked_user,
       blocked_activity.application_name AS blocked_application,
       blocked_activity.client_addr AS blocked_client_addr,
       blocked_activity.wait_event_type AS blocked_wait_event_type,
       blocked_activity.state AS blocked_state,
       blocked_locks.mode AS blocked_mode,
       blocked_activity.query AS blocked_query,
       blocking_locks.pid AS blocking_pid,
       blocking_activity.usename AS blocking_user,
       blocking_activity.application_name AS blocking_application,
       blocking_activity.client_addr AS blocking_client_addr,
       blocking_activity.state AS blocking_state,
       blocking_locks.mode AS blocking_mode,
       blocking_activity.query AS blocking_query,
       extract(epoch from now() - blocked_activity.xact_start)::bigint AS blocked_duration_seconds
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_locks.pid = blocked_activity.pid
JOIN pg_catalog.pg_locks blocking_locks 
    ON blocked_locks.locktype = blocking_locks.locktype
    AND blocked_locks.database IS NOT DISTINCT FROM blocking_locks.database
    AND blocked_locks.relation IS NOT DISTINCT FROM blocking_locks.relation
    AND blocked_locks.page IS NOT DISTINCT FROM blocking_locks.page
    AND blocked_locks.tuple IS NOT DISTINCT FROM blocking_locks.tuple
    AND blocked_locks.virtualxid IS NOT DISTINCT FROM blocking_locks.virtualxid
    AND blocked_locks.transactionid IS NOT DISTINCT FROM blocking_locks.transactionid
    AND blocked_locks.classid IS NOT DISTINCT FROM blocking_locks.classid
    AND blocked_locks.objid IS NOT DISTINCT FROM blocking_locks.objid
    AND blocked_locks.objsubid IS NOT DISTINCT FROM blocking_locks.objsubid
    AND blocked_locks.pid != blocking_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_locks.pid = blocking_activity.pid
WHERE NOT blocked_locks.granted
ORDER BY blocked_locks.pid;

-- Quick count of locks by type
SELECT pg_locks.locktype, count(*) 
FROM pg_locks 
GROUP BY pg_locks.locktype 
ORDER BY count DESC;