I ran some tests: I created a hypertable and added a dimension partitioned with by_hash:
Commands to try out the sample:
DROP TABLE tab1;
CREATE TABLE tab1 (
time timestamp with time zone not null,
company_id smallint not null,
measurement numeric(15,4)
);
SELECT create_hypertable('tab1', by_range('time'));
SELECT add_dimension('tab1', by_hash('company_id', 7));
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP, 1, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '1 HOUR', 1, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '2 HOUR', 1, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '3 HOUR', 1, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '4 HOUR', 1, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP, 2, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '1 HOUR', 2, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '2 HOUR', 2, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '3 HOUR', 2, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '4 HOUR', 2, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP, 3, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '1 HOUR', 3, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '2 HOUR', 3, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '3 HOUR', 3, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '4 HOUR', 3, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP, 4, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '1 HOUR', 4, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '2 HOUR', 4, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '3 HOUR', 4, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '4 HOUR', 4, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP, 5, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '1 HOUR', 5, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '2 HOUR', 5, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '3 HOUR', 5, 100);
INSERT INTO tab1 VALUES (CURRENT_TIMESTAMP - INTERVAL '4 HOUR', 5, 100);
Then I checked the chunks that were created:
\d+ tab1
and to my surprise only 4 chunks were created:
List of chunks
Table "public.tab1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-------------+--------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
time | timestamp with time zone | | not null | | plain | | |
company_id | smallint | | not null | | plain | | |
measurement | numeric(15,4) | | | | main | | |
Indexes:
"tab1_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON tab1 FOR EACH ROW EXECUTE FUNCTION _timescaledb_functions.insert_blocker()
Child tables: _timescaledb_internal._hyper_8_15674_chunk,
_timescaledb_internal._hyper_8_15675_chunk,
_timescaledb_internal._hyper_8_15676_chunk,
_timescaledb_internal._hyper_8_15677_chunk
Access method: heap
Then I selected the data from each chunk:
Selects on each chunk:
select * from _timescaledb_internal._hyper_8_15674_chunk;
select * from _timescaledb_internal._hyper_8_15675_chunk;
select * from _timescaledb_internal._hyper_8_15676_chunk;
select * from _timescaledb_internal._hyper_8_15677_chunk;
and the result is:
Result of the select statements from each chunk:
time | company_id | measurement
-------------------------------+------------+-------------
2025-03-31 13:38:29.371534+02 | 1 | 100.0000
2025-03-31 12:38:29.384496+02 | 1 | 100.0000
2025-03-31 11:38:29.388986+02 | 1 | 100.0000
2025-03-31 10:38:29.392051+02 | 1 | 100.0000
2025-03-31 09:38:29.394134+02 | 1 | 100.0000
time | company_id | measurement
-------------------------------+------------+-------------
2025-03-31 13:38:29.396605+02 | 2 | 100.0000
2025-03-31 12:38:29.405172+02 | 2 | 100.0000
2025-03-31 11:38:29.409638+02 | 2 | 100.0000
2025-03-31 10:38:29.41243+02 | 2 | 100.0000
2025-03-31 09:38:29.41703+02 | 2 | 100.0000
2025-03-31 13:38:29.446693+02 | 4 | 100.0000
2025-03-31 12:38:29.44905+02 | 4 | 100.0000
2025-03-31 11:38:29.451623+02 | 4 | 100.0000
2025-03-31 10:38:29.454177+02 | 4 | 100.0000
2025-03-31 09:38:29.45675+02 | 4 | 100.0000
(10 rows)
time | company_id | measurement
-------------------------------+------------+-------------
2025-03-31 13:38:29.421306+02 | 3 | 100.0000
2025-03-31 12:38:29.429895+02 | 3 | 100.0000
2025-03-31 11:38:29.436674+02 | 3 | 100.0000
2025-03-31 10:38:29.441035+02 | 3 | 100.0000
2025-03-31 09:38:29.44416+02 | 3 | 100.0000
(5 rows)
time | company_id | measurement
-------------------------------+------------+-------------
2025-03-31 13:38:29.459023+02 | 5 | 100.0000
2025-03-31 12:38:29.468107+02 | 5 | 100.0000
2025-03-31 11:38:29.470963+02 | 5 | 100.0000
2025-03-31 10:38:29.475036+02 | 5 | 100.0000
2025-03-31 09:38:29.477599+02 | 5 | 100.0000
(5 rows)
In the second chunk, rows for company_id 2 and 4 are stored together. @jonatasdp, so it is not really true that each company_id is going to be stored in a separate table.
How does the by_hash function really work? It looks like a hash is computed, returning an integer value, which is then reduced with some modulus. In this case companies 2 and 4 produce the same value and are saved into the same partition.
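The mechanism described above can be sketched in a few lines of Python. The hash below is a simple multiplicative stand-in (Knuth's constant), NOT TimescaleDB's internal partitioning hash, so the concrete bucket assignments differ from the ones observed in the test; it only illustrates why two distinct keys can land in the same partition.

```python
# Sketch of hash partitioning: hash the key to an integer, then take it
# modulo the number of partitions. The hash is a stand-in, not the real
# TimescaleDB partitioning hash.
def bucket(key: int, num_partitions: int) -> int:
    h = (key * 2654435761) % 2**32  # stand-in 32-bit multiplicative hash
    return h % num_partitions

for company_id in range(1, 6):
    print(company_id, bucket(company_id, 7))
# Even with only 5 keys and 7 buckets, collisions happen:
# with this stand-in hash, keys 1 and 4 share a bucket, as do 2 and 5.
```

The same effect occurs with the real hash, just for different key pairs (here, companies 2 and 4).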
QUESTION: We have 6 companies. Is there a way, using additional dimensions, to force that each individual chunk stores data from only a single company?
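To illustrate why the number of hash partitions alone cannot guarantee this: whether all companies land in distinct buckets depends on the particular hash/partition-count combination. The sketch below reuses the same stand-in hash (again, not TimescaleDB's real one) and scans partition counts to see which would happen to separate all 6 companies; the point is that separation is coincidental, not guaranteed, when hashing.

```python
# With a stand-in multiplicative hash (not TimescaleDB's real hash),
# check which partition counts happen to place all 6 companies in
# distinct buckets.
def bucket(key: int, num_partitions: int) -> int:
    h = (key * 2654435761) % 2**32
    return h % num_partitions

companies = list(range(1, 7))
for n in range(6, 17):
    if len({bucket(c, n) for c in companies}) == len(companies):
        print(f"{n} partitions would separate all 6 companies (with this hash)")
```

With this particular stand-in hash, 7 partitions do not separate the 6 keys, while some other counts do; the real hash behaves analogously but with different numbers.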