Toolkit 1.8.0 now available!

Brian_Rowe · July 20, 2022, 9:29pm

The Timescale Toolkit team is proud to announce the 1.8.0 version of the TimescaleDB Toolkit extension! This version is now available on Timescale Cloud, as part of our HA docker image, as a DEB or RPM package, or as source code at our github repository.

New experimental `count_min_sketch` for approximate count

One of the big features we’re particularly proud to introduce in the new toolkit version is the count_min_sketch aggregate and associated approx_count function. This aggregate, available under the toolkit_experimental schema for now, uses the Count-min sketch algorithm to apply a series of hash functions to an input column. It can then use the resulting counts to estimate the count for a particular value with much lower risk of being defeated by a hash collision. Here’s an example:

WITH t AS (
  SELECT toolkit_experimental.count_min_sketch(symbol, 0.01, 0.01) AS symbol_sketch
  FROM stocks_real_time
)
SELECT toolkit_experimental.approx_count('AAPL', symbol_sketch)
FROM t;

New interpolating accessor functions to work with PostgreSQL windows functions

We’ve also added several interpolating functions that can be used to correctly compute results with PostgreSQL window functions. This problem arises when trying to compute a value over an interval where PostgreSQL has partitioned some of the data we need in another group. Consider a data stream grouped by day and then fed into a time weighted average to get a daily weighted average. Without the last point of the previous day or the first point of the next day, each average is going to fail to correctly account for the interval before its first data point and after its last data point of the day. With the interpolating accessor, we can work around this:

SELECT
  time,
  toolkit_experimental.interpolated_average(
    time_weight,
    time,
    '1 day',
    LAG(time_weight) OVER (ORDER BY time),
    LEAD(time_weight) OVER (ORDER BY time)
  )
FROM (
  SELECT 
    time_bucket('1 day', time) as time,
    time_weight(
      'LOCF',
      time,
      value
    )
  FROM tbl
  GROUP BY time_bucket('1 day', time)
) s

Interpolating accessors have been added for rate and delta in the counter_agg and gauge_agg aggregates, average in the time_weight aggregate, and duration_in in the state_agg aggregate.

Experimental saturating math functions

We’ve also introduced a new set of saturating math functions. These allow you to perform integer math but will snap to boundary values rather than overflow.

More convenient approximate count distinct

Further, we’ve also made improvements to our HyperLogLog functionality, improving the documentation and cleaning up some error messages. As a big improvement to usability, we’ve also introduced a new approx_count_distinct aggregate which will create a HyperLogLog with a default sizing that should work well for most use cases.

1.8.0 also includes numerous fixes and quality of life adjustments for toolkit functionality. The most noteworthy among these are support for PostgreSQL environments with a missing default collation, preliminary support for the aarch64 platform, ability to have explicit NULL values in timevector objects, and upgrading our pgx and rust versions for the extension.

Conclusion

We encourage everyone to try out this new version of the toolkit and let us know what you think! We’d love to hear from you any improvements or adjustments you’d like to make to the experimental functions, or even if you like them the way they are and would like to see them moving out of our experimental schema. You can leave us a response here to this forum post, send us a note in our slack channel, or add an issue to the toolkit repository. Thank you!

aglover · July 25, 2022, 7:43am

When will 1.8.0 become available in TimescaleDB Managed Service?

Can you please advise the minimum requirements with regard to

Postgres Version 13 or 14?
Minimum TimescaleDB version 2.x.x ???

Andrew_Shegrud · July 25, 2022, 10:23pm

Hi @aglover, We don’t have a solid timeline right now, but it is safe to say that it will be available within Q3. I’ve made a note to ping you when 1.8 is available, and provide more details about minimum requirements.

Please continue to stay tuned to into this forum for more updates!

Brian_Rowe · July 29, 2022, 5:33pm

As to the rest of your question @aglover, TimescaleDB Toolkit is tested to work on Postgres 12, 13, and 14. There is no dependency between toolkit and the main TimescaleDB extension.