Cannot use column for both ordering and segmenting

Hi! It was updated TimescaleDB to 2.16.0 on PG 15.2. I planned to check FK to hypertable but focused with situation that I cannot make hypertable to be compressed.
Here the script to reproduce:

DROP TABLE IF EXISTS ref_hyper;

CREATE TABLE ref_hyper(
	id NUMERIC,
	col1 TEXT,
	modify_date TIMESTAMP,
	PRIMARY KEY (id, modify_date)
);

SELECT * FROM create_hypertable(
	relation => 'ref_hyper',
	dimension => by_range('modify_date')
);

ALTER TABLE ref_hyper
SET
(
	timescaledb.compress,
	timescaledb.compress_orderby = 'id'
);

Output:

SQL Error [42601]: ERROR: cannot use column "id" for both ordering and segmenting
  HINT: Use separate columns for the timescaledb.compress_orderby and timescaledb.compress_segmentby options.

Does this mean that it is now mandatory to specify the compress_segmentby?

If to switch on only compression then “id” used by default for segmentby_column although it is not required:

ALTER TABLE ref_hyper SET(timescaledb.compress);

Table timescaledb_information.compression_settings:

hypertable_name|attname    |segmentby_column_index|orderby_column_index|orderby_asc|orderby_nullsfirst|
---------------+-----------+----------------------+--------------------+-----------+------------------+
ref_hyper      |id         |                     1|                    |           |                  |
ref_hyper      |modify_date|                      |                   1|false      |true              |

Then I tried to add empty compress_segmentby and then compression was applied with needed params:

ALTER TABLE ref_hyper
SET
(
	timescaledb.compress,
	timescaledb.compress_orderby = 'id',
	timescaledb.compress_segmentby = ''
);

Table timescaledb_information.compression_settings:

hypertable_name|attname    |segmentby_column_index|orderby_column_index|orderby_asc|orderby_nullsfirst|
---------------+-----------+----------------------+--------------------+-----------+------------------+
ref_hyper      |id         |                      |                   1|true       |false             |
ref_hyper      |modify_date|                      |                   2|false      |true              |

Not sure if it’s mandatory but it’s highly recommended for performance reasons.

The segments will always help to filter out the right batches of rows to decompress.

You’re querying, generally your segmentby will be composed of the most used columns in your query.

Example:

segmentby: id
orderby: time desc

Segments are used to define what will be batched and compressed together, so they’ll group the series in the right way and the orderby will allow the internal batch to come already in the right order.

So, if you put time in the orderby it will help to filter out the latest record of some specific point in time.

Hi, @jonatasdp !

Thanks for your reply. I remember about segment columns that they speed up queries but there are tables that have not columns for segmentation (id in my case is PK and it fits only to orderby).

Before update I was able to set compression without segmentation. Therefore I wrote this topic to understand is this new behavior that will be permanent or temporary?

Hi Denis, sorry for the late reply.

I think it’s a permanent improvement but let’s double check with @sven.

Yes, a column can either be orderby or segmentby but not both. This has always been true but it might not have been enforced in certain situations in previous versions.

The main question: why does the next setting throw error?

ALTER TABLE ref_hyper
SET
(
	timescaledb.compress,
	timescaledb.compress_orderby = 'id'
);

if you do not specify compress_segmentby a default value will be choosen, seems like the default chosen for compress_segmentby overlaps with your compress_orderby configuration. This is a bug in the default logic, but you can work around it by setting explicit compress_segmentby

1 Like

I hope that this bug will be fixed, because this is not an intuitive configuration behavior.
Thank you @sven for reply.