Jun 28, 2024
Posted by Oleksii Kliukin
Your database needs reliable backups. Data loss can occur at any time: a developer may drop a table by mistake, even replicated storage drives may fail and start producing read errors, software bugs may cause silent data corruption, applications may perform incorrect modifications, and so on.
You may hope this will not occur to you, but hope is not a strategy. The key to reliably preventing data loss or enabling data recovery is to perform regular backups. In this post, we’ll cover how we deploy PostgreSQL on Kubernetes (for our TimescaleDB instances) and how this deployment choice impacts our backup and restore process.
For added interest and to spark the debate on how to build fault tolerance into your PostgreSQL database management system, we also share how we did backup and restore testing before implementing a more streamlined solution.
As the ancient proverb goes: “There are two kinds of people. Those who do backups and those who will do backups.”
Relational databases like PostgreSQL support continuous archiving, where, in addition to the image of your data directory, the database continuously pushes changes to backup storage.
Some of the most popular open-source tools for PostgreSQL, including pgBackRest, Barman, and WAL-G, center around performing backups and restores, which underscores the importance of doing so, or at the very least shows that backup/restore is top of mind for many developers. And, because TimescaleDB is built on PostgreSQL, all your favorite PostgreSQL tools work perfectly well with TimescaleDB.
Most of the PostgreSQL tools mentioned above are not described as backup tools but as disaster recovery tools. Because when disaster strikes, you are not really interested in your backup but rather in the outcome of restoring it.
And sometimes you need to work really hard to recover from what otherwise would be a disaster: one of my favorite talks, by long-time PostgreSQL contributor Dimitri Fontaine, describes the process of recovering data from a PostgreSQL instance whose backup couldn’t be restored when needed. It’s a fascinating story, and even with the help of world-class experts, such a situation almost certainly ends in data loss.
We thought about how to apply this lesson to Timescale Cloud, our cloud-native platform for TimescaleDB, which is deployed on Kubernetes. A core tenet of Timescale Cloud is to provide a worry-free experience, especially around keeping your data safe and secure.
Behind the scenes, among other technologies, Timescale Cloud relies on encrypted Amazon Elastic Block Store (EBS) volumes and PostgreSQL continuous archiving. You can read how we made PostgreSQL backups 100x faster via EBS snapshots and pgBackRest in this post.
Let's start by briefly describing how we run PostgreSQL databases in containers on Kubernetes at Timescale.
We refer to a TimescaleDB instance available to our customers as a TimescaleDB service. (A fun terminology fact: in PostgreSQL, this database instance is referred to as a PostgreSQL “cluster,” since one can traditionally run multiple logical databases within the same PostgreSQL process; that, in turn, should not be confused with a “cluster” of a primary database and its replicas. So let’s just refer to these things as “databases” or “instances” for now.)
A TimescaleDB service is constructed from several Kubernetes components, such as pods and containers running the database software, persistent volumes holding the data, Kubernetes services, and endpoints that direct clients to the pod.
We run TimescaleDB instances in containers orchestrated by Kubernetes. We have implemented a custom TimescaleDB operator to manage a large fleet of TimescaleDB services, configuring and provisioning them automatically.
Similar to other Kubernetes operators, the TimescaleDB operator provides a Kubernetes custom resource definition (CRD) that describes a TimescaleDB deployment. The operator converts the YAML manifests defined by the TimescaleDB CRD into TimescaleDB services and manages their lifecycle.
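To make this more concrete, here is a minimal sketch of what such a manifest might look like. The actual CRD is internal to Timescale, so the API group, kind, and field names below are illustrative assumptions rather than the real schema.

```yaml
# Hypothetical TimescaleDB custom resource; the API group, kind, and field
# names are illustrative assumptions, not the real CRD schema.
apiVersion: timescaledb.example.com/v1
kind: TimescaleDB
metadata:
  name: service-abc123
  namespace: customer-services
spec:
  replicas: 1
  resources:
    cpu: "2"
    memory: 8Gi
  volume:
    size: 100Gi
  backup:
    schedule: "0 2 * * *"   # periodic base backups
    s3Bucket: example-backup-bucket
```

The operator watches for resources of this kind and reconciles the corresponding pods, volumes, services, and endpoints to match the spec.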
TimescaleDB pods take advantage of Kubernetes sidecars, running several containers alongside the database. One of the sidecars runs pgBackRest, a popular PostgreSQL backup software, and provides an API to launch backups, both on-demand and periodic, triggered by Kubernetes cron jobs. In addition, the database container continuously archives changes in the form of write-ahead logging (WAL) segments, storing them on Amazon S3 in the same location as the backups.
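As a rough sketch, the archiving side of this setup comes down to two PostgreSQL parameters pointing at pgBackRest. The stanza name below is an assumption, and the YAML shape is just for illustration:

```yaml
# Illustrative sketch of continuous WAL archiving via pgBackRest;
# the stanza name "main" is an assumption.
postgresql:
  parameters:
    archive_mode: "on"
    # Each completed WAL segment is pushed to the same S3 repository
    # that holds the base backups.
    archive_command: pgbackrest --stanza=main archive-push %p
```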
In addition to the TimescaleDB operator, there is another microservice whose task is to deploy TimescaleDB instances (known as “the deployer”). The deployer defines TimescaleDB resources based on users’ choices and actions within the cloud console’s UI and creates TimescaleDB manifests, letting the operator pick them up and provision running TimescaleDB services.
The deployer also watches for changes in Kubernetes objects that are part of the resulting TimescaleDB service and the manifest itself. It detects when the target service is fully provisioned or when there are changes to be made to the running service (e.g., to provision more compute resources or to upgrade to a new minor version of TimescaleDB). Finally, it also marks the service as deleted upon receiving a delete event from the manifest.
In the previous section, we established that the deployer and operator work together to deploy and manage a TimescaleDB service in Kubernetes, including the container running PostgreSQL and TimescaleDB and the container sidecars running pgBackRest and others.
Sometimes, a solution to one problem is a by-product of working on another problem. As we built Timescale, we easily implemented several features by adding the ability to clone a running service, producing a new one with identical data. That process is similar to spawning a replica of the original database, except that at some point, that replica is “detached” from the former primary and goes a separate way.
At the time, we added the ability to continuously validate backups through frequent smoke testing using a similar approach. This is how it worked: a restore test produced a new service with the data from an existing backup, relying on PostgreSQL point-in-time recovery (PITR). When a new test service was launched, it restored the base backup from Amazon S3 and replayed all pending WAL files until it reached a pre-defined point in time, at which point it was detached into a stand-alone instance.
Under the hood, we used (and still use) Patroni, a well-known PostgreSQL high-availability solution template, to replace a regular PostgreSQL bootstrap sequence with a custom one that involves restoring a backup from Amazon S3. If you want to go into the weeds of how we enable high availability in PostgreSQL, check out this article.
A feature of Patroni called “custom bootstrap” allows defining arbitrary initialization steps instead of relying on the PostgreSQL bootstrap command initdb. Our custom bootstrap script also called pgBackRest, pointing it to the backup of the instance we were testing. (Side note: my colleague Feike Steenbergen and I were among the initial developers of Patroni earlier in our careers, so we were quite familiar with how to incorporate it into such complex workflows.)
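For illustration, the relevant piece of the Patroni configuration might look roughly like this; the method name and wrapper script path are hypothetical, but the structure follows Patroni's custom bootstrap mechanism.

```yaml
# Sketch of a Patroni custom bootstrap; the method name and script path
# are hypothetical.
bootstrap:
  # Run the named method instead of initdb to create the data directory.
  method: pgbackrest_restore
  pgbackrest_restore:
    # The wrapper script invokes `pgbackrest restore`, pointed at the
    # backup of the service under test.
    command: /scripts/pgbackrest_restore.sh
```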
Once we had verified the backup could be restored without errors, we determined whether we had the right data. We checked two properties of the restored backup: recentness and consistency. Since the outcome of the restore is a regular TimescaleDB instance, those checks simply ran SQL queries against the resulting database.
Obviously, we have no visibility into users’ data to verify the restored backup is up-to-date. So to check for recentness, we injected a special row with the timestamp of the beginning of the restore test into a dedicated bookkeeping table in the target service. (This table was not accessible or visible to users.) The test configured PostgreSQL point-in-time recovery, setting the parameter recovery_target_time to match that timestamp. When the instance’s restore was completed, the scripts that Patroni ran at the post-bootstrap stage verified whether the row was there.
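Continuing the hypothetical snippet from above, the recovery target and the post-bootstrap check could be wired up along these lines; the timestamp, the script path, and the bookkeeping table name are all assumptions.

```yaml
# Sketch of the PITR target and post-bootstrap verification; the timestamp,
# script path, and table name are hypothetical.
bootstrap:
  pgbackrest_restore:
    recovery_conf:
      # WAL segments come from the same S3 repository as the base backup.
      restore_command: pgbackrest --stanza=main archive-get %f "%p"
      # Stop WAL replay at the moment the restore test started, then promote.
      recovery_target_time: "2024-06-28 02:00:00+00"
      recovery_target_action: promote
  # Runs once recovery completes; it could execute a query such as
  #   SELECT 1 FROM restore_bookkeeping WHERE started_at = '2024-06-28 02:00:00+00';
  post_bootstrap: /scripts/verify_restore_recentness.sh
```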
As a final safeguard, we verified that the restored database was internally consistent. In this context, a restored backup was consistent if it produced the same results for a set of queries as the original service it was based on at the point in time when the backup was made.
The easiest way to check for consistency was to read every object in the target database and watch for errors. If the original instance produced no errors for a particular query when the backup was made, the restore of that backup should also produce no errors. We used pg_dump, the built-in tool for producing SQL dumps for PostgreSQL.
By default, pg_dump reads every row in the target database and writes its SQL representation to the dump file. Since we were not interested in the dump itself, we redirected the output to /dev/null to save disk space and improve performance. And because we only needed to check the system catalogs for consistency, there was no need to read every data row: we used the “-s” flag to produce a schema-only dump that skips data rows entirely.
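The check itself boils down to a pg_dump invocation whose output is thrown away. As a sketch, it could be wrapped in a Kubernetes Job along these lines; the image, host, user, and database names are assumptions.

```yaml
# Hypothetical Job running the consistency check against the restored service;
# the image, host, user, and database names are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: restore-consistency-check
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pg-dump-check
          image: postgres:16
          command:
            - sh
            - -c
            # A schema-only dump (-s) walks the system catalogs; any error
            # indicates an inconsistency, and the dump itself is discarded.
            - pg_dump -s -h restored-service -U postgres -d tsdb > /dev/null
```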
The deployer was responsible for scheduling the tests across the whole fleet. It did so with an elegant hack (our favorite type of hack!) that relied on specific Patroni behavior.
Timescale is designed to provide a worry-free experience and a trustworthy environment for your critical data. We believe that developers should never have to worry about the reliability of their database, and they should have complete confidence that their data will never be lost.
Backups provide a facility to archive and store data so that it can be recovered in the future. Of course, backups are only one part of a broader strategy for ensuring reliability. Among other things, Timescale's use of Kubernetes has allowed us to provide a decoupled compute and storage solution for more reliable and cost-effective fault tolerance.
All writes to WAL and data volumes are replicated to multiple physical storage disks for higher durability and availability. Even if a TimescaleDB instance fails (including from hardware failures), Kubernetes can immediately spin up a new container and reconnect it to its online storage volumes within tens of seconds, without ever needing to take the slower path of recovering from backups in S3.
So, at Timescale, we modify that ancient proverb: “There are three kinds of database developers: those who do backups, those who will do backups, and those who use Timescale and don’t have to think about them.”
If you’re new to TimescaleDB, create a free Timescale account to get started with a fully managed Timescale service (free for 30 days, no credit card required).
Once you’re using TimescaleDB, or if you’re already up and running, join the Timescale community to share your feedback, ask questions about time-series data (and databases in general), and more. We’d love to hear about your restore tests and thoughts on trade-offs of snapshots vs. point-in-time recovery.
And, if you enjoy working on hard engineering problems, share our mission, and want to join our fully remote, global team, we’re hiring broadly across many roles.