What is the suggested backup strategy and tool to use?

We have a 12 TB TimescaleDB 2.14.2 database. We currently have so many updates on our biggest hypertable, in order to fix incorrect data that was sent by sensors, that the database backup fails after three days of running with an error that too much data was changed (the snapshot is too big for the Veeam backup tool). Veeam backup is great (when it works) because it stores the backup to a tape device, where there is plenty of capacity.

Because the backup is currently not working, as a temporary solution we have decided to clone the whole disks (a storage-level operation) instead of creating a PostgreSQL backup. But this is not a perfect solution. We can create a disk clone once per day, but we don’t have enough disk storage to keep more than one disk clone (the clone is stored on disk storage, not on the tape device). If a drastic problem appears during the day, we can “revert back” to the nightly disk clone, but if we find out that the drastic problem appeared more than one day ago, then we are in deep trouble, because we can’t go back further than the nightly disk clone.

What I am interested in is: what kind of backup tool do you use for backups of TimescaleDB data, and what kind of backup approaches do you use?

Regards


I am not sure what backup tool is best.

At least, I would warn against relying on pg_dump if you plan to be able to recover to a different version. The reason is that I reported this issue: "ERROR: chunk not found" and other errors when loading pg_dump from old timescale, in which a pg_dump from an old pg/timescale cannot be loaded without errors into a new pg/timescale.
I have not received any feedback on it, which I fear means the backup cannot be relied upon unless you restore to exactly the same versions :frowning:
I would of course be more than happy to revise my position in the event that I receive new information on my post…

For your situation, have you explored PITR?
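If it helps, here is a minimal sketch of the WAL-archiving side of PITR; the archive path and timestamps are placeholders, not a recommendation for your environment:

```
# postgresql.conf -- enable continuous WAL archiving (example paths)
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'

# To recover to a point in time: restore a base backup, then set
#   restore_command = 'cp /mnt/wal_archive/%f %p'
#   recovery_target_time = '2024-06-01 03:00:00'
# and create an empty recovery.signal file in the data directory (PostgreSQL 12+).
```

With WAL archiving in place you can take a periodic base backup and replay WAL to any moment in between, which would avoid the "only back to last night's clone" limitation you described.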

I think internally, in Timescale Cloud, we use EBS snapshots as the base and partial dumps for a fast kick-off.

Looking forward to seeing what other folks have to say too.

@Yanis, pg_dump is a utility that dumps DDL and data into a file, and then you can restore from that file. This looks like a good idea for small databases. But we currently have a 12 TB database and can’t just dump it to files, because the restore would also take forever.

Taking a quick look at the issue: if you are really using some very old version of TimescaleDB, then it is probably best to upgrade to the same version before using pg_dump. The closer the source and target database versions are, the fewer problems can be expected.
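For what it's worth, the dump/restore sequence that the TimescaleDB documentation describes uses the custom format together with a pre/post restore step, and it expects the same extension version on both sides; database and file names here are placeholders:

```
# on the source
pg_dump -Fc -f tsdb.bak tsdb

# on the target (same TimescaleDB version installed)
createdb tsdb
psql -d tsdb -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
psql -d tsdb -c "SELECT timescaledb_pre_restore();"
pg_restore -Fc -d tsdb tsdb.bak
psql -d tsdb -c "SELECT timescaledb_post_restore();"
```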

Maybe you can report the bug to the official bug tracker: Issues · timescale/timescaledb · GitHub, but I am afraid you will get the answer to upgrade to the latest version before using pg_dump.

Maybe this is not purely a backup problem, but a software incompatibility. TimescaleDB has made huge improvements over the years and may not be compatible with a very old version. You haven’t specified what the source TimescaleDB version is. You can connect to the database and execute \dx to get the extension version.
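For example (the database name is a placeholder):

```
# from the shell, using psql
psql -d yourdb -c '\dx timescaledb'

# or the equivalent catalog query
psql -d yourdb -c "SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';"
```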


@jonatasdp, EBS snapshots look like an Amazon cloud solution. Nice to know there is a solution there. We are currently running self-hosted on-prem…

I used to test pgBackRest, which is a third-party solution. On a test system I verified that the tool was copying WAL files to the archive for a few days. Then, when I was forced to do a PostgreSQL restart, the shutdown did not finish for one hour, so I was forced to kill the database, which is a bad idea. I assume it is a good tool, but I lacked the time to really go deep into it. Then it was decided to use the Veeam solution to back up the database.
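For anyone evaluating pgBackRest, this is roughly the minimal single-host setup I was testing; the stanza name, paths and retention are placeholders:

```
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

[main]
pg1-path=/var/lib/postgresql/16/main

# postgresql.conf on the database server:
#   archive_mode = on
#   archive_command = 'pgbackrest --stanza=main archive-push %p'

# one-time setup and a first full backup:
#   pgbackrest --stanza=main stanza-create
#   pgbackrest --stanza=main check
#   pgbackrest --stanza=main --type=full backup
```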

A backup solution is probably more at the level of PostgreSQL and not so much of TimescaleDB. But the big data lives in the latter, so TimescaleDB users probably have more experience with this.


We have now decided to stop all activity for a weekend to give the backup a chance to finish properly. I have also temporarily disabled autovacuum, which has a huge impact on disk I/O because of the intense updates. Let’s hope the backup finishes successfully.
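In case it helps anyone, this is roughly how autovacuum can be switched off temporarily; the hypertable name is hypothetical:

```
-- cluster-wide, takes effect on configuration reload (no restart needed)
ALTER SYSTEM SET autovacuum = off;
SELECT pg_reload_conf();

-- or only for the busiest hypertable (hypothetical name); note that chunks
-- are separate tables, so this setting does not cover existing chunks
ALTER TABLE sensor_readings SET (autovacuum_enabled = false);

-- remember to turn it back on after the backup finishes
ALTER SYSTEM RESET autovacuum;
SELECT pg_reload_conf();
```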


In the autumn of this year PostgreSQL 17 will be released. So far it looks like incremental backup will be supported out of the box. This is for sure one of the options that must not be overlooked. You know, a built-in solution is probably well thought out and tested (maybe worth waiting for a few minor releases, to be extra sure).
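Based on what has been published about PostgreSQL 17 so far, the incremental backup workflow is expected to look roughly like this (paths are placeholders, and the server needs summarize_wal enabled):

```
# take a normal full base backup first (it includes a backup_manifest)
pg_basebackup -D /backups/full --checkpoint=fast

# later, take an incremental backup relative to that manifest
# (requires summarize_wal = on in postgresql.conf)
pg_basebackup -D /backups/incr1 --incremental=/backups/full/backup_manifest

# at restore time, combine the chain into a synthetic full data directory
pg_combinebackup /backups/full /backups/incr1 -o /restore/combined
```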

Yes, I understand that. I am just highlighting the fact that pg_dump may not actually work as expected with Timescale, which is quite a surprise.

With plain PostgreSQL, pg_dump allows you to do a logical backup to an SQL file and restore it on another PostgreSQL server with a different version. Granted, it will take a long time on a multi-TB database.
However, this has been known as a safe way to restore data from a different PostgreSQL version for a long time. (All of this assumes using pg_dump to a plain SQL file and not the custom format.)
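That is, roughly this kind of workflow (database and file names are placeholders):

```
# plain-format logical dump to an SQL file on the old server
pg_dump -d olddb -f olddb.sql

# replay the SQL on the new server, possibly a newer PostgreSQL version
psql -d newdb -f olddb.sql
```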

With Timescale it seems this is not possible. At least I get lots of errors when trying. I was expecting this was due to me forgetting some minor detail or having some error in my configuration. But as nobody seems to respond saying so in my post, I will assume that this is indeed expected behavior, which is quite concerning to be honest…


Yes, I will upgrade. In fact I have already upgraded. This is why I am trying to import an existing database from an old server. The old server is very old… Going through an in-place upgrade of that old server would likely not be pretty. Also, the hardware is supposed to be decommissioned, so it would be a lot of wasted effort anyway…

The Timescale version is 1.3.