
More Time-Series Data Analysis, Fewer Lines of Code: Meet Hyperfunctions

Do you have time-series data? If you do and have tried TimescaleDB, you know that we have already solved the problem of storing ever-growing volumes of time-series and other time-related data on top of PostgreSQL. But storage is only one side of the coin. Once you have your data, what really matters is analyzing it and using it to start predicting the future.

TimescaleDB hyperfunctions are a set of functions, procedures, and data types optimized for querying, aggregating, and analyzing time series. They are tightly integrated with Timescale’s cloud offering and with TimescaleDB, the PostgreSQL extension. Some are included by default, and others are part of the Timescale Toolkit, including some high-performance in-database functions written in Rust.

Hyperfunctions Wherever You Look

Hyperfunctions are divided into multiple categories, with the most basic being directly integrated into the TimescaleDB extension. 

This includes time bucketing using time_bucket or time_bucket_gapfill, and some standard functions, such as first and last.
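To make the idea of time bucketing concrete, here is a minimal Python sketch of what grouping rows into fixed-width buckets and then taking the first and last value per bucket amounts to. It illustrates the concept only and is not how TimescaleDB implements time_bucket, first, or last. The helper names (bucket_start, first_last_per_bucket), the sample readings, and the choice of the Unix epoch as the bucket origin are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: buckets are aligned to the Unix epoch.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, width, origin=UNIX_EPOCH):
    """Return the start of the fixed-width bucket that contains ts."""
    whole_buckets = (ts - origin) // width  # timedelta // timedelta -> int
    return origin + whole_buckets * width


def first_last_per_bucket(rows, width):
    """Map each bucket start to (first value, last value), ordered by time.

    rows is an iterable of (timestamp, value) pairs in any order.
    """
    buckets = {}
    for ts, value in sorted(rows, key=lambda row: row[0]):
        key = bucket_start(ts, width)
        if key not in buckets:
            buckets[key] = [value, value]  # earliest row seen for this bucket
        else:
            buckets[key][1] = value  # later rows overwrite "last"
    return {key: tuple(pair) for key, pair in buckets.items()}


# Hypothetical sensor readings, bucketed into five-minute windows.
readings = [
    (datetime(2024, 1, 1, 0, 1, tzinfo=timezone.utc), 20.0),
    (datetime(2024, 1, 1, 0, 4, tzinfo=timezone.utc), 21.5),
    (datetime(2024, 1, 1, 0, 7, tzinfo=timezone.utc), 19.0),
]
result = first_last_per_bucket(readings, timedelta(minutes=5))
```

With these sample readings, the first two rows fall into the bucket starting at 00:00 and the third into the bucket starting at 00:05. In the database, the same grouping is expressed in a single query instead of application code.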

Postgres Aggregate Functions and More

These functions build the basis for time-based aggregations of billions of records in real time. On top of those, as mentioned earlier, Timescale provides a second extension, the Timescale Toolkit, offering advanced, use case-specific categories of functions. 

·  Approximate count distinct functions provide estimators for the number of distinct objects, using algorithms such as HyperLogLog++, and enable rolling up smaller time ranges into larger ones.

·  Counter and gauge functions provide aggregates for counters (counter_agg, for monotonically increasing values) and gauges (gauge_agg, for values that can increase and decrease over time), as well as their corresponding functions to retrieve analytical values, such as the correlation coefficient, a delta, interpolated values, and slopes. They also provide functionality to roll up smaller time windows into larger ones.

·  Financial analysis functions implement a candlestick aggregation with direct access to OHLC (Open, High, Low, Close) values, a quick route to the kind of analytics typically found in stock and crypto trading use cases. Again, additional functionality is provided to roll up small time windows into larger ones.

·  Statistical and regression analysis functions enable quick access to statistical measures such as average, kurtosis, skewness, standard deviation, and variance. In contrast to some of their vanilla PostgreSQL counterparts (like stddev and avg), their results can be rolled up from smaller windows into larger ones correctly, without you having to store intermediate values yourself to avoid incorrect re-aggregated results.

·  Percentile approximation functions provide a UDDSketch implementation that approximates percentiles over large sets of data points, as well as the functions needed to retrieve one or more percentiles at once. As with the previous category, rolling-up functionality is also provided.

·  State tracking functions offer aggregations and tracking functions for discrete state changes over time. That includes systems, like state machines, that switch between different states (such as starting, running, stopping, and stopped). They also enable access to the time spent in each state or the lack of certain states for an extended period. Rolling up is also available.

·  Downsampling functions allow you to reduce a large set of data points to fewer ones while preserving the overall shape of the original result set. This can be used to quickly visualize trends. Included algorithms are LTTB (Largest Triangle Three Buckets) and ASAP (Automatic Smoothing for Attention Prioritization).

Trust us, folks. The list goes on.
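Rolling up comes up in almost every category above, so it is worth a small illustration of why it needs dedicated support. The Python sketch below shows the general idea: keep a small partial state per window (count, mean, and sum of squared deviations) and combine those states with the standard pairwise merge formula, rather than averaging already-finished averages. This is an illustration under our own assumptions, not the Toolkit's actual implementation. The names StatsState, summarize, rollup, and sample_stddev are hypothetical, and the sketch assumes every window is non-empty.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class StatsState:
    """Partial aggregate for one window (hypothetical type for this sketch)."""
    n: int        # number of values
    mean: float   # running mean
    m2: float     # sum of squared deviations from the mean


def summarize(values):
    """Build the partial state for one non-empty window of values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return StatsState(n, mean, m2)


def rollup(a, b):
    """Combine two partial states into the state of the merged window."""
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return StatsState(n, mean, m2)


def sample_stddev(state):
    """Sample standard deviation from a partial state (needs n >= 2)."""
    return (state.m2 / (state.n - 1)) ** 0.5


# Two hypothetical hourly windows of different sizes.
hour_1 = [1.0, 2.0, 3.0]
hour_2 = [10.0, 20.0, 30.0, 40.0, 50.0]

combined = rollup(summarize(hour_1), summarize(hour_2))

# Naive re-aggregation: averaging the two hourly averages ignores window size.
naive_mean = (statistics.mean(hour_1) + statistics.mean(hour_2)) / 2
```

Here the naive average of averages comes out at 16.0, while the rolled-up state gives 19.5, which matches the mean of all eight values. The standard deviation cannot be recovered from two finished stddev results at all, but it falls out of the combined state directly.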

Get Started With Hyperfunctions for Time-Series Data Analysis

The best way to understand how the Timescale hyperfunctions can empower your data analysis is to get your hands dirty. Check out the following resources, and start crunching that time-series data: