Bulletproofing Your Database With (Multiple) PostgreSQL Replicas

“Don't put all your eggs in one basket,” they say. Besides many other things, this advice applies incredibly well to the world of data. Think about it: bad surprises happen, even on a reliable and robust cloud like the one we operate at Timescale, which is built on the equally steady PostgreSQL foundation. This is why you need PostgreSQL replicas: a secondary database that follows your primary, and that you can rely on if anything goes wrong.

Cosmic rays from the far edges of the universe can flip bits on disks, causing data corruption. A fire or tsunami can destroy a data center, or a tired technician can accidentally spill coffee on the server. You want to protect your data from all vicissitudes of fate. To provide such protection and a worry-free experience, we support high-availability (HA) replicas available in Timescale.

Now, to cover a broader range of use cases and ensure added peace of mind, we’re maximizing PostgreSQL’s capabilities and bulletproofing our customer databases by supporting multiple high-availability typologies. In the Timescale Console, you can now select between high (one async replica) or highest availability (two async replicas) to adjust your service to your needs. The highest availability option comes with two further replication choices: high performance (two async replicas) or high data integrity (one async and one sync replica).

If you came here to read all the technical details, you’re in the right place. We'll walk you through a primer on PostgreSQL replicas before we examine which replication options better suit your use case. We’ll also explain how we bulletproof our users’ databases in the cloud and why we implemented the new HA replica typologies. Finally, we’ll benchmark our replica offerings—measuring transactions per second—so you can choose the solution that best meets your needs.

✨

To see this week’s previous launches and keep track of upcoming ones, check this blog post or our launch page.

Database Replicas in PostgreSQL: The Status Quo

When dealing with database replicas in PostgreSQL, there are two cluster roles: there is always one primary, and there may be one or more replicas. The primary can service writes and reads, while any replicas follow the primary and can only service reads. You can think of it like this:

In case you’re wondering, "hot standby" is a common term for sync and async replicas.

In this case, log shipping can be the following:

Write-ahead logging (WAL) archive (saves and reads files from external storage or Amazon S3)
TCP Connection using the PostgreSQL protocol (streaming replication)
Both (WAL archive, then TCP connections)

With this in mind, let’s see how we use PostgreSQL replication to keep your Timescale Cloud database safe.

How We Bulletproof Your Database in Timescale Cloud

You can take many measures to ensure that if a production database crashes or data corruption occurs, it will be just a harmless cloud in sunny skies instead of a full-blown thunderstorm.

Here’s how we harden your databases on Timescale Cloud:

We provision all your databases using a set of microservices that run in our Cloud. We are constantly improving the way we configure databases and react to failures. After all, we have thousands of databases of all possible sizes to learn from, and we don’t miss any chance to do just that. Whether running a large or a small database, you can rest assured that everything we do is battle-tested on the many different databases running on our cloud.
We take weekly full backups and daily incremental backups. We also ship every WAL file to S3, which allows you to recover your data to any point within the last three days via the Timescale console.
We monitor the PostgreSQL process and restart it if it becomes unresponsive. In addition to restarting databases, we constantly monitor customer database problems, and if a database takes too long to recover, a team member will immediately start looking for the root cause.

Running a production database with a replica is the industry's best practice. We truly believe you should protect any workload serving a mission-critical application with a replica. However, we also understand that not every application is mission-critical, and these may be able to tolerate some minutes—or, in the worst-case scenario, hours—of downtime.

Remember the eggs and baskets analogy? While we allow you to put all your eggs in one basket (we support production services without replicas on Timescale Cloud), you should use your fair judgment and do a thorough risk assessment beforehand.

Enabling High Availability in PostgreSQL

When we added replica support to Timescale Cloud, our primary goal was to ensure high availability. In a previous article, we looked in-depth at high availability in PostgreSQL, including our HA replica design decisions and how we use Patroni. If you're interested in the technical details, check it out. And if you read it, you know we have only supported asynchronous HA replicas until now.

Let’s recap the objectives that drove our decisions when we added HA to Timescale Cloud:

Provide the amount of HA demanded by most time-series use cases: The crucial part of HA while ingesting time-series data is to be able to keep writing no matter what. Even if some data is lost during failover, it's usually not a big problem.
Add almost no performance penalty: The primary keeps shipping WAL to S3 as usual but also streams them to the replica. The streaming process consumes an insignificant amount of resources while simultaneously allowing the replica to act as a hot standby, minimizing the possibility of data loss during failover. At the same time, all writes to the primary are entirely independent of the replica and can be done at maximum speed.
Set the replica’s compute resources to the same values as the primary: This ensures seamless failovers without service degradation while waiting for the instance to scale up.

All of the above goals were possible through asynchronous replicas with streaming replication enabled. When discussing high availability, it’s impossible not to mention the recovery point objective (RPO) and the recovery time objective (RTO). We have you covered on these HA topics in another blog post.

In summary, we are talking about an RTO of less than 10 seconds and an acceptable data loss of up to 16 MB (which is conveniently the size of one WAL file). If the replica delay is higher, the service becomes read-only, and one of our engineers will intervene to verify what’s wrong and fix the problem.

Adding a pinch of salt to the previous two blog posts, we must admit that performing heavy queries on an HA replica isn’t always a good idea. This action can slow down the replica to a point where it increasingly lags behind the primary until it cannot assume the primary role when things take a challenging turn (or when things hit the fan).

You might ask, "How can the lag between a primary and a replica increase?" That’s the nature of asynchronous replication. Lag is something to look out for because switching over to a lagging replica can cause data loss up to the distance behind that replica. Moreover, our configuration disallows switchover to a replica that is delayed by more than 16 MB.

We are not saying that you should not take lag seriously or that it is your problem. We take it seriously, too. When your replica develops a significant lag, our on-call engineer gets a page, and we start doing whatever it takes to reduce the lag. We investigate the cause, and if it's due to heavy reads on the replica, our Support team will contact you to inform you about the issue.

Read Replicas

If you made it this far and read our article on high availability, you know all the bells and whistles of our HA via Patroni. Now, it’s time to explore scenarios where replicas exist without contributing to the clusters’ high availability.

Having a replica the same size as the primary isn’t for everyone. What if your main goal is to take some of the load off the primary to let those data analytics folks run their wildest ideas in SQL—a move that can bring down even the most robust database on Earth? There’s an option for you, too, and it’s called a read replica (our CTO, Mike Freedman, explains all about it in this video).

Like the HA case, read replicas are asynchronous, but unlike the former, they will never be promoted to primary, they just exist to service read queries. Their destiny is to chase the primary continuously, and their absolute desire is zero lag. By default, you can have up to 19 read replicas of varying sizes, plus one primary. If you need more, let us know.

That’s all, folks! Nah, just kidding.

Both HA replicas and read replicas have been available in Timescale Cloud for some time, but we know PostgreSQL has much more to offer. We’ve listened to your feedback—a key driver for feature innovation at Timescale—and decided to expand our replica options.

Multiple PostgreSQL Replicas in Timescale Cloud

In a nutshell, we’re moving from HA to HA, HA, HA. But my friends, there’s nothing to laugh about here (pun intended; I will see myself out).

via GIPHY

In addition to our existing offerings, we now enable users to add two asynchronous replicas or one asynchronous replica and one synchronous replica. To add these new replicas, users should head to our UI's revamped High availability section, which briefly describes the pros and cons of each topology. Let’s go over them.

Highest availability

In the highest availability option, users can add up to two async replicas to their Timescale Cloud instance. If you want the best of both worlds—improve the cluster’s availability and read confidently from the replica without being afraid to compromise the cluster’s resiliency by accidentally overloading the replica and causing excessive lag—you will enjoy having two HA replicas instead of one. Those two replicas are available behind a single URL, which balances the queries between the two.

We previously provided a separate URL for each cluster member, whether a primary, a replica, or a read replica. Load balancing between two replicas allows scaling the reads, and in the event of failure of one of those replicas, the read endpoint remains available for reads at any time. In addition, each replica is in a separate availability zone (AZ), making this topology more resilient in the case of AZ failures.

High data integrity

This is an entirely new replica type for our Cloud. Synchronous replication is essential for customers prioritizing write consistency. Whenever a transaction is written to a primary, we want to be able to read it from a replica. The trade-off, though, is that writes have to be confirmed by the replicas (a network round trip and a disk write). A blunt approach would mean that losing a replica would halt all writes to the cluster until a failed replica rejoined. This means that while you can do reads on synchronous replicas, the best practice is to add a read replica.

Luckily, Patroni has some tricks up its sleeve that allow us to maintain the advantages of sync replicas while retaining higher write availability. Of course, there’s no silver bullet, and some trade-offs must be made. One sync replica is a faux pas, but two of them or a combination of one async and one sync replica is not. Each option has a similar price but ensures a different balance between performance and data consistency.

Using one async replica and one sync replica offers a balance between performance and data integrity. In this setup, all changes are confirmed across the primary and at least one replica node before the transaction is considered complete. This guarantees the highest data consistency and safety, thanks to the sync replica, while the async replica helps increase the cluster resiliency. If we were to lose that single sync replica, the async replica would be promoted to the sync replica status, which would ensure the write consistency while the other replica is recovering to join the cluster in a new async role.

Let’s first clarify our terminology. We already discussed how Patroni and failovers work. It’s important to understand some Patroni configuration parameters before deciding whether to use one or two replicas—and price isn’t the only factor.

All settings mentioned below are described in detail in the Patroni documentation. The setting that enables synchronous replication is called synchronous_mode. If set to ON, synchronous replication is enabled. Another important parameter is called synchronous_mode_strict and controls how the primary should react if replicas fail and can no longer confirm writes. We can’t run a cluster with two nodes and enable strict mode.

In such a case, replica failure means all writes to the cluster are stopped until the replica returns. Here, we meet a trade-off between write availability, price, and consistency. When strict mode is off, write availability on the primary is maximized, but if a replica is offline for a period, it might lag, potentially leading to inconsistency until it catches up. If you require consistent reads and need your replicas always available, consider configuring two HA replicas for your service.

Synchronous vs. Asynchronous Replicas: Choose Your Destiny

Now that all the cards are on the table, it’s time to make an informed decision. Here are some guidelines to help you choose the best replicas for your use case.

Let’s first recap the options we have available:

One async replica: high availability
Two async replicas: highest availability (high performance)
One async and one sync replica: high data integrity

We’ve discussed how each replica type works and the trade-offs of async and sync replicas. To make a fully informed decision, you also need to consider how the type and number of replicas affect your ingest performance.

While we all know you shouldn’t blindly rely on other people’s benchmarks to predict how your unique workload will perform on our service, benchmarks still hold value in providing an apples-to-apples comparison between different configurations. That’s why we also want to be transparent about how we ran them.

We used pgbench, which is well-known in the community, and here’s how we used it:

--client=50 --time=600 --progress=5 --max-tries=50 --failures-detailed --verbose-errors --report-per-command $CONNECTION_STRING

We ran the benchmark in a Timescale instance containing:

2 CPUs
8 GB RAM

The main metric we focused on during the benchmarks was transactions per second (TPS). Running the benchmark 10 times ensured caches warmed up sufficiently for optimal performance. We ran the benchmark for clusters with the following configurations:

One node: no HA replica
Two nodes: primary and an async replica
Three nodes: primary and two async replicas
Two nodes: primary and one sync replica plus an async replica

Here’s the table with benchmarking results:

The performance impact is based on the p90 metric.

In summary, adding replication will impact the total number of transactions processed. When using non-sync replication (the default), the impact is minimal (less than 3 %), but when using synchronous replication, the performance impact is between 10 and 12 %. We hope that knowing the performance penalty that comes with each HA configuration will help you make a better choice for your use case.

Conclusion

At Timescale, we believe that enabling replicas is an industry best practice when developing mission-critical applications. While we already support asynchronous high-availability replicas in our PostgreSQL cloud platform, we’ve decided to expand our offerings to better support the diverse use cases and data protection needs of our users.

Keep in mind that while synchronous replication may impact performance more than asynchronous replication, it provides higher write consistency, offering a stronger safety net in the event of a disaster. We strongly recommend using replicas for all production environments to enhance service availability. To meet your specific performance and consistency requirements, we've developed new replica options, ranging from high to the highest levels of availability.

Start bulletproofing your database today. Create a free Timescale account and choose the best replication option for your use case.