We’ve had a case where a customer encountered an issue where some of their dates have different millisecond precision, resulting in them inserting duplicate records into their database.Example
Example:
id event_date
12345678910abcdefghij 2022-04-12 00:36:03.96
12345678910abcdefghij 2022-04-12 00:36:03.959
In this, the duplicate should have been rejected based on the id, but because the timestamp had a rounding issue, a duplicate record was inserted. Given that on a hypertable, unique constraints must be compounded with the event_date, how can we get around an issue like this where event_date might be slightly different?
Note that UNIQUE and PRIMARY constraints must include the partitioning key.
For testing:
tsdb=> CREATE TABLE "public"."events" (
tsdb(> "id" text NOT NULL,
tsdb(> "timestamp" timestamp NOT NULL,
tsdb(> "event_date" timestamp NOT NULL,
tsdb(> "event_id" numeric,
tsdb(> "created_at" timestamptz(3) DEFAULT CURRENT_TIMESTAMP(3)
tsdb(> );
CREATE TABLE
tsdb=> -- Turn into hypertable
tsdb=> SELECT create_hypertable('events','event_date');
create_hypertable
----------------------
(21,public,events,t)
(1 row)
If you try to insert duplicate data, you still can:
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.96', '2022-04-12 00:36:03.96', '95');
INSERT 0 1
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.959', '2022-04-12 00:36:03.959', '96');
INSERT 0 1
tsdb=>
A way to get around this is to create a composite unique index over (id, event_date). With regards to the milliseconds issue, we can change the data type of the event_date to store seconds only (for example : timestamp(0) ) or may be trim the milliseconds using a trigger.
tsdb=> CREATE TABLE "public"."events" (
tsdb(> "id" text NOT NULL,
tsdb(> "timestamp" timestamp NOT NULL,
tsdb(> "event_date" timestamp(0) NOT NULL,
tsdb(> "event_id" numeric,
tsdb(> "created_at" timestamptz(3) DEFAULT CURRENT_TIMESTAMP(3)
tsdb(> );
CREATE TABLE
tsdb=> -- Turn into hypertable
tsdb=> SELECT create_hypertable('events','event_date');
create_hypertable
----------------------
(24,public,events,t)
(1 row)
Now if you try to insert duplicate records, an ERROR will be thrown:
tsdb=> CREATE UNIQUE INDEX idxtest ON events ("id", "event_date");
CREATE INDEX
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.96', '2022-04-12 00:36:03.96', '95');
INSERT 0 1
tsdb=> INSERT INTO events VALUES
tsdb-> ('12345678910abcdefghij', '2022-04-12 00:36:03.959', '2022-04-12 00:36:03.959', '96');
ERROR: duplicate key value violates unique constraint "_hyper_24_13_chunk_idxtest"
DETAIL: Key (id, event_date)=(12345678910abcdefghij, 2022-04-12 00:36:04) already exists.
tsdb=>