How time_bucket works with daylight savings

jenilynsanchez · August 15, 2024, 1:53am

Hi, I’m trying to get a clearer understanding of how daylight savings time impacts the time_bucket function, as mentioned in the Timescale documentation [here]. The document states:

daylight savings time boundaries means that the amount of data aggregated into a bucket after such a cast can be irregular

I have a time series with 30-minute intervals, and I’ve tested three scenarios around the daylight savings time change on October 3, 2021, when clocks were set forward by 1 hour at 02:00:00.

Scenario 1: time_bucket('30 minutes', interval_datetime, 'Australia/Melbourne')

Result are REGULAR

2021-10-02 15:00:00.000000 +00:00
2021-10-02 15:30:00.000000 +00:00
2021-10-02 16:00:00.000000 +00:00
2021-10-02 16:30:00.000000 +00:00
2021-10-02 17:00:00.000000 +00:00
2021-10-02 17:30:00.000000 +00:00
2021-10-02 18:00:00.000000 +00:00
2021-10-02 18:30:00.000000 +00:00

Scenario 2: time_bucket('1 hour', interval_datetime, 'Australia/Melbourne')

Result are REGULAR

2021-10-02 14:00:00.000000 +00:00
2021-10-02 15:00:00.000000 +00:00
2021-10-02 16:00:00.000000 +00:00
2021-10-02 17:00:00.000000 +00:00
2021-10-02 18:00:00.000000 +00:00

Scenario 3: time_bucket('2 hours', interval_datetime, 'Australia/Melbourne')

Result are IRREGULAR

2021-10-02 14:00:00.000000 +00:00
2021-10-02 16:00:00.000000 +00:00 <-- irregularity (aggregated data is only 16:00 - 17:00)
2021-10-02 17:00:00.000000 +00:00 <-- irregularity (aggregated date spans 17:00 - 19:00)
2021-10-02 19:00:00.000000 +00:00

Are these scenarios expected behavior? I was expecting some irregularity with the 1-hour interval, but it seems to appear only with the 2-hour interval. I’d like to understand how the time_bucket function handles these situations.