JOIN ON time_bucket_gapfill doesn't return data from both tables

I have two tables with ‘sparse’ sensor readings. Example:

CREATE TABLE IF NOT EXISTS public.t1
(
    "timestamp" timestamp with time zone NOT NULL DEFAULT now(),
    value numeric(3,1) NOT NULL,
    CONSTRAINT t1_pkey PRIMARY KEY ("timestamp")
)

CREATE TABLE IF NOT EXISTS public.t2
(
    "timestamp" timestamp with time zone NOT NULL DEFAULT now(),
    value numeric(2,0) NOT NULL,
    CONSTRAINT t2_pkey PRIMARY KEY ("timestamp")
)

INSERT INTO t1 (timestamp, value) VALUES ('2022-02-22 14:00', 17.5);
INSERT INTO t2 (timestamp, value) VALUES ('2022-02-22 14:10', 3);

And I want to join the two tables, with gapfilled time. Why does the following not worK:

SELECT time_bucket_gapfill('5 minute',t1.timestamp) as time, 
	avg(t1.value) as v1, avg(t2.value) as v2
FROM t1
LEFT JOIN t2 ON  time_bucket_gapfill('5 minute',t1.timestamp) = time_bucket_gapfill('5 minute',t2.timestamp)
WHERE t1.timestamp >= '2022-02-22 14:00' AND t1.timestamp <='2022-02-22 14:15'
GROUP BY time_bucket_gapfill('5 minute',t1.timestamp)

I get:
time | v1 | v2
------------------------±--------------------±—
2022-02-22 14:00:00+00 | 17.5000000000000000 |
2022-02-22 14:05:00+00 | |
2022-02-22 14:10:00+00 | |
2022-02-22 14:15:00+00 | |

I would have expected to get the value 3 for v2 in the time bucket at 14:10.

Why is this query not returning what I expect, and what is the recommended way of getting data from both tables?

2 Likes

I’m also have problems joining tables on time buckets where there will be gaps in the result if there is data missing in one of them. Did you ever solve this?

Hi folks, thanks for reporting this.

It seems the problem is related to having multiple gapfills in the same select. Using CTE it works:

WITH tbg1 AS (
  SELECT time_bucket_gapfill('5 minute',
         t1.timestamp,
         '2022-02-22 14:00:00-03'::timestamptz,
        '2022-02-22 14:15:00-03'::timestamptz) as time,
        AVG(t1.value) as v1
  FROM t1
  GROUP BY 1),
tbg2 as (
  SELECT time_bucket_gapfill('5 minute',
         t2.timestamp,
         '2022-02-22 14:00:00-03'::timestamptz,
        '2022-02-22 14:15:00-03'::timestamptz) as time,
        AVG(t2.value) as v2
FROM t2
GROUP BY 1
)
SELECT * FROM tbg1 JOIN tbg2 on tbg1.time = tbg2.time

result:


┌────────────────────────┬─────────────────────┬────────────────────────┬────────────────────┐
│          time          │         v1          │          time          │         v2         │
├────────────────────────┼─────────────────────┼────────────────────────┼────────────────────┤
│ 2022-02-22 14:00:00-03 │ 17.5000000000000000 │ 2022-02-22 14:00:00-03 │                    │
│ 2022-02-22 14:05:00-03 │                     │ 2022-02-22 14:05:00-03 │                    │
│ 2022-02-22 14:10:00-03 │                     │ 2022-02-22 14:10:00-03 │ 3.0000000000000000 │
└────────────────────────┴─────────────────────┴────────────────────────┴────────────────────┘
(3 rows)