Hey,
We are using a narrow DB schema to store arbitrary double values (change-based logging).
CREATE TABLE IF NOT EXISTS historian.Ms (
    timestamp timestamptz NOT NULL,  -- time column must be NOT NULL for a hypertable
    tag_id smallint NOT NULL,
    value double precision NULL
) WITH (OIDS = FALSE)
TABLESPACE pg_default;

ALTER TABLE historian.Ms OWNER TO dbadmin;

DROP INDEX IF EXISTS historian.tag_id_timestamp_idx;
CREATE INDEX IF NOT EXISTS Ms_tag_id_idx ON historian.Ms (tag_id);

SELECT create_hypertable('historian.Ms', 'timestamp', if_not_exists => TRUE);
SELECT set_chunk_time_interval('historian.Ms', INTERVAL '{LogDataTableType.Process_Data_MS.GetChunkIntervalInHours()} hours');
A new requirement has come up: we want each tag to have a configurable retention period. E.g. Tag1 should be kept for 3 months and Tag2 should be kept for 1 year. The typical way the TimescaleDB documentation recommends deleting data is by dropping chunks, but this is all or nothing.
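For reference, the documented retention call we mean looks like this (a sketch, assuming the TimescaleDB 2.x drop_chunks signature; the 3-month cutoff is just the Tag1 example):

-- Drops every chunk whose entire time range is older than the cutoff,
-- across ALL tags in the hypertable - hence "all or nothing".
SELECT drop_chunks('historian.Ms', older_than => INTERVAL '3 months');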
We have tried using space dimensions on tag_id, but we could not get the chunks to contain only one tag each.
You can absolutely do this with space partitioning:
a=# select add_dimension('historian.Ms', 'tag_id', chunk_time_interval=>1);
add_dimension
---------------------------
(2,historian,ms,tag_id,t)
(1 row)
a=# insert into historian.ms values(now(), 1, 1);
INSERT 0 1
a=# insert into historian.ms values(now(), 1, 2);
INSERT 0 1
a=# insert into historian.ms values(now() - interval '1 day', 1, 1);
INSERT 0 1
a=# insert into historian.ms values(now(), 2, 1);
INSERT 0 1
a=# insert into historian.ms values(now(), 5, 1);
INSERT 0 1
a=# \d+ historian.ms
Table "historian.ms"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+--------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
timestamp | timestamp with time zone | | not null | | plain | | |
tag_id | smallint | | not null | | plain | | |
value | double precision | | | | plain | | |
Indexes:
"ms_tag_id_idx" btree (tag_id)
"ms_timestamp_idx" btree ("timestamp" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON historian.ms FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_1_1_chunk,
_timescaledb_internal._hyper_1_2_chunk,
_timescaledb_internal._hyper_1_3_chunk
_timescaledb_internal._hyper_1_4_chunk
Access method: heap
Now, should you do this? That's a harder question. How many chunks are you likely to end up with if you enable space partitioning (worst case: total tags * total time chunks)?
We recommend extreme caution when choosing to use space partitioning.
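One quick way to keep an eye on that number (assuming TimescaleDB 2.x, where the timescaledb_information.chunks view is available):

-- Counts chunks for the hypertable; with a per-tag dimension this grows as
-- (distinct tag_ids per interval) * (number of time intervals).
SELECT count(*) AS chunk_count
FROM timescaledb_information.chunks
WHERE hypertable_schema = 'historian'
  AND hypertable_name = 'ms';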
Hi James.
Thank you very much for your response - very helpful.
I work with Jason, and have some follow-up questions I hope you can resolve:
We have circa 500 IDs (in our case tag_id) that we would need to space-partition, meaning we would get 500 chunks per time interval. Maybe that is too much to handle efficiently (with a one-day chunk interval, for example, that would be roughly 500 * 365 = 182,500 chunks per year), but I have the following questions:
Is it possible to set the chunk_time_interval individually for each ID? The data refresh rate varies greatly, so we would end up with some chunks containing only a couple of data points and others containing millions, which is not very query-efficient.
The chunk_time_interval parameter is set to 1 in your example. What does that mean? I assumed it means that the space partitioning uses the same chunk interval as the hypertable (which is set by the set_chunk_time_interval function). Is that correct?
Since we are using space partitioning, is there an efficient way to find out which tag_ids belong to which chunks? I can see there is a chunks table in the timescaledb_information schema, and I have also noticed the tableoid::regclass trick to get the chunk name from the hypertable, but that will run very slowly once the database fills up.
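For that last question, the data-scanning approach I mean is the first query below. I also wonder whether the internal catalog could answer this from metadata alone; the second query is only my guess at the catalog layout (table and column names may differ between TimescaleDB versions):

-- Scans the actual data, so it gets slow as the table grows.
SELECT DISTINCT tableoid::regclass AS chunk, tag_id
FROM historian.ms;

-- Metadata-only guess: each dimension slice on tag_id covers
-- [range_start, range_end), and with an interval of 1 that is one tag per slice.
SELECT c.schema_name || '.' || c.table_name AS chunk,
       ds.range_start AS tag_id
FROM _timescaledb_catalog.chunk c
JOIN _timescaledb_catalog.hypertable h ON h.id = c.hypertable_id
JOIN _timescaledb_catalog.chunk_constraint cc ON cc.chunk_id = c.id
JOIN _timescaledb_catalog.dimension_slice ds ON ds.id = cc.dimension_slice_id
JOIN _timescaledb_catalog.dimension d ON d.id = ds.dimension_id
WHERE h.schema_name = 'historian'
  AND h.table_name = 'ms'
  AND d.column_name = 'tag_id';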
I hope these questions make sense; if not, I would be happy to elaborate.
-Filip