Hi there,
we want to store IIoT data in TimescaleDB. We have used (plain) PostgreSQL the last years and want to switch to TimescaleDB. Therefore I’m completely new at TimescaleDB and do not have any experience with it. I have read a lot the last days and also watched some of the YT Videos. I would be gladeful if you could help me to do some good decisions regarding our hypertable.
One of our data table would look like below. I would expect about 20(-50) Tenants and about 1000 (-5000) devices. Each device has about 0 - 200 different “device_parameters” (e.g. energy data, operational data, …). The number in brackets is the estimated outlook for the next 5 years.
Right now the queries to the table are filtering by “device_id”, “param_name” and time range (last_change between a and b). Typical time range is between one day and one week.
Currently the backend service is providing the data “as it is” (raw). In the future a aggregation for some of the parameters maybe comes into places.
Regarding the estimates number of data: We have about 50 “device -parameter” per second for each tenant. Or in average about 0.5 - 1 “device -parameter” per second on device-level.
That means per day:
2-10 million data points per tenant
about 20k - 100k data points per device
→ about ~50 million data points per day overall (for all tenants / devices)
Our idea is to create a hyper-table like this:
SELECT create_hypertable('device_parameters', by_range('last_change', INTERVAL '1 week'));
SELECT add_dimension('device_parameters', by_hash('device_id', ??));
I have following questions regarding this setup:
- Is it a good idea to add the device_id as partition dimension?
- What could be a good “number of partition” value for this device_id dimension?
- Would it even make sense to add “param_name” as additional partition? (or would this cause to many small chunks)
- Regarding the typical access pattern (see above), would it make sense to disable creating of default index and use a combined index for (device_id, param_name, last_change DESC) instead ?
- What else tipps / hints do you have for me as newbie?
Thanks a lot in advance!
Best regards,
Tom
CREATE TABLE IF NOT EXISTS device_parameters
(
id BIGINT NOT NULL,
last_change TIMESTAMP WITH TIME ZONE NOT NULL,
buffered boolean,
created_at TIMESTAMP WITH TIME ZONE NOT NULL,
tenant_id TEXT NOT NULL,
device_id TEXT NOT NULL,
parameter_name TEXT NOT NULL,
parameter_value TEXT NOT NULL
);
edit: The “id” column was inserted, because Hibernate wants to have a “id” column.
→ Is it an issue to have this additional column with the auto-incrementing sequence?
→ The alternative would be to have some big combined primary key like PRIMARY KEY (last_change, tenant_id, device_id, parameter_name). Not sure if this makes it any better?