Adding a new node or reattaching a node after creating a distributed hypertable

Mohammed_Iliyas_pate · April 6, 2022, 6:53am

I have configured a multinode setup with 1 access node and 2 data nodes [dn1, dn2].

SELECT create_distributed_hypertable('locate', 'time', 'location', replication_factor => 2);

and data node replication is working properly.

Problem 1: Detach data node [dn2] and re-attach data node [dn2]

SELECT detach_data_node('dn2', hypertable => 'locate');
SELECT attach_data_node('dn2', 'locate', if_not_exists => true);

ERROR: [dn2]: relation "locate" already exists

Query 1: Is this to say that once a node has been detached, it cannot be reattached? Is there a reason?

Problem 2: Delete data node [dn2] and add new node [dn2]

Query 2: Is it always recommended to add a new data node to a cluster?
[I received an error because data already existed while adding the same node, so I dropped the database and re-added the node.]

Query 3: Is it possible to attach a new or previously deleted data node [dn2] to a distributed hypertable so that it takes part in replication?
[Tried, but replication failed because the view timescaledb_information.chunks does not contain an entry for dn2]

jfj · April 7, 2022, 7:03am

Hi @Mohammed_Iliyas_pate, Thanks for posting your question here!
Can you provide more details on why you are trying to detach and reattach the same node?

Mohammed_Iliyas_pate · April 8, 2022, 5:36am

Hi jfj,

We are working on projects in which we are actively trying to figure out all of the possibilities with TimescaleDB.

For example, if a node fails at some point, we may need to remove or detach the node from the cluster.

In addition, we may need to add a new node after some time. In that case, we are simply investigating the possibility of adding a new node after the distributed hypertable has been created. I have tried adding a new data node, but it does not receive any new data from the access node (possibly because timescaledb_information.chunks has no entry for the newly added data node dn2).

SELECT chunk_name, data_nodes
FROM timescaledb_information.chunks
WHERE hypertable_name = 'locate';

chunk_name | data_nodes
-------------------------+------------
_dist_hyper_10_12_chunk | {dn1}

Mohammed_Iliyas_pate · April 11, 2022, 11:48am

Hi jfj,

Any suggestion?

Regards,
Mohammed Iliyas

ryanbooz · April 12, 2022, 3:05pm

@Mohammed_Iliyas_pate

Problem 1/2 - (re)attaching data nodes:
With the current implementation of attach_data_node, there is an assumption that the node does not already have the hypertable created on that node. When you detach a node, we do not remove data. Therefore, when you attach the node, TimescaleDB tries to create the table, but it already exists. That’s why you’re getting the error. At the moment, there is not an option to attach a previous data node when the table exists primarily because TimescaleDB can’t determine if the existing table/chunks on the node are consistent (schema, etc.) with the rest of the distributed nodes - among other things.

So, yes, for now you can only (re)attach a data node that does not already contain the hypertable.

Problem 3 - New data on added nodes:
By default, a distributed hypertable uses all attached data nodes unless you configure it otherwise with the data_nodes parameter of create_distributed_hypertable. Likewise, when you attach a new data node, TimescaleDB by default modifies the configuration to include the new node for future chunks (although this behavior can be changed by setting the repartition parameter to FALSE).

Once attached, however, no new data will show up on the new data node until new chunks are created. For instance, if your distributed hypertable has a chunk_time_interval=>'7 days', and a chunk was recently created, you will probably not see new data on the additional data node for a few days.
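To see when the next chunk is due, one option is to look at the time range of the existing chunks. The following is a minimal sketch, assuming the hypertable is named locate as earlier in this thread and that its time column is timestamp-typed (for integer time columns the view reports range_start_integer and range_end_integer instead):

```sql
-- List each chunk's time range and the data nodes that hold it.
-- A newly attached node only shows up for chunks created after it was attached.
SELECT chunk_name, range_start, range_end, data_nodes
FROM timescaledb_information.chunks
WHERE hypertable_name = 'locate'
ORDER BY range_start;
```

If the latest range_end is still in the future, incoming data keeps landing in that existing chunk, so the new node stays empty until a later chunk is created.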

TimescaleDB does not (yet) automatically redistribute chunks to newly attached data nodes. You can do it manually using move_chunk, but it does not happen automatically yet.
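As a rough illustration of the manual route, the sketch below moves a single chunk and then checks where it ended up. It assumes a newly attached node called dn3 (a hypothetical name at this point in the thread) and reuses the chunk name from the query output posted above. move_chunk lives in the timescaledb_experimental schema, so treat this as something to try on a test cluster first.

```sql
-- Move one existing chunk from dn1 to the newly attached node.
-- 'dn3' is an assumed node name; the chunk name is taken from the earlier output.
CALL timescaledb_experimental.move_chunk(
    '_timescaledb_internal._dist_hyper_10_12_chunk',  -- chunk to move
    'dn1',                                            -- source data node
    'dn3'                                             -- destination data node
);

-- Check which data nodes now hold the chunk.
SELECT chunk_name, data_nodes
FROM timescaledb_information.chunks
WHERE chunk_name = '_dist_hyper_10_12_chunk';
```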

Hopefully that answers your current questions!

Mohammed_Iliyas_pate · April 13, 2022, 3:36am

Thank you very much @ryanbooz and yes I agree with what you replied.

According to the documentation, move_chunk is not yet production-ready. Is there a tentative timeline for that?

As per my understanding:

Adding a new data node (dn3) and setting the chunk time interval:

SELECT add_data_node('dn3', '10.140.132.59');
SELECT attach_data_node('dn3', 'locate');
SELECT set_chunk_time_interval('locate', 120000); -- after 120000, new chunks will include dn3.

Workaround for detaching data node dn3 (for maintenance) and reattaching it:

Move the existing data in the hypertable ("locate") to the new node or any existing node.

CALL timescaledb_experimental.move_chunk('_timescaledb_internal._dist_hyper_10_20_chunk', 'dn3', 'dn2');
CALL timescaledb_experimental.move_chunk('_timescaledb_internal._dist_hyper_10_21_chunk', 'dn3', 'dn2');

Clean up after a failed move using this query. In this example, the operation ID of the failed move is ts_copy_1_31:

CALL timescaledb_experimental.cleanup_copy_chunk_operation('ts_copy_1_31');

Set timescaledb.enable_client_ddl_on_data_nodes = TRUE on dn3 and reload the configuration.
Drop the table on data node dn3:
DROP TABLE locate;
Now data node dn3 is as good as a new data node (it can be attached again).

ryanbooz · April 13, 2022, 3:30pm

Adding new data node:
Again, the only comment is that modifying the chunk_time_interval is not an immediate change. TimescaleDB will still finish up the previous chunk (however long the interval is) before creating new chunks for (future) incoming data. So, while running the set_chunk_time_interval command might appear to create a chunk soon after on your table, my guess is that it’s a timing thing. Again, chunks are created as needed and their range_start and range_end timestamps are set at creation time. We will only create a new chunk if incoming data is after an existing chunk range_end timestamp, and at that time TimescaleDB would use your new chunk_time_interval setting.
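A small sketch of changing the interval and confirming the stored setting, assuming locate is partitioned on a timestamp-typed time column. The one-day value is only illustrative. For timestamp columns a plain integer is interpreted as microseconds, so writing the value as an INTERVAL avoids ambiguity.

```sql
-- Change the interval used for chunks created from now on.
-- Existing chunks keep the range they were created with.
SELECT set_chunk_time_interval('locate', INTERVAL '1 day');

-- Confirm the setting stored for the time dimension.
SELECT hypertable_name, column_name, time_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'locate';
```

Comparing this setting with the range_start and range_end values in timescaledb_information.chunks shows which chunks were created before the change and which after.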

Detach/Reattach data node:
Your process looks generally correct.