Jun 20, 2024
PostgreSQL has a long-standing reputation for having a miserable upgrade process. So, when the community heartily recommends that you should upgrade as soon as possible to the latest and greatest PostgreSQL version, it's not really surprising that your heart sinks, your mouth goes dry, and the outright dread of another laborious job takes over.
It's almost like finishing a long hike or trying to convince somebody that Betamax was better than VHS. Eventually, you just want it to be over so you can take a nap. There's not even any joy about all the new features and speed. It's just too exhausting to generate emotion anymore.
This blog post will hopefully serve as a guide for when to pull off the old band-aid. That is, when you should upgrade and what PostgreSQL version you should select as a target. By the end of this post, we will introduce you to our best practices for upgrading your PostgreSQL version in Timescale, so you can get over this process of upgrading as quickly and safely as possible.
The PostgreSQL Global Development Group has simplified the upgrade process quite a bit with more explicit version numbering. Since there are only two external stimuli, there are only two choices: upgrade the binaries (minor version change) or upgrade the data on disk (major version change).
The developers of PostgreSQL never really had a plan in mind for when and how to upgrade. This seems a bit of a harsh statement when tools like pg_upgrade exist but bear with me. These tools were meant to make upgrades possible, not to imply any particular schedule or recommendations for an upgrade plan. The actual upgrade implementation was always left as an exercise for the administrator.
Let's start with some of the community's conventional wisdom and pretend that those ideas were actually a plan of sorts.
This "plan" is based on the fear of existing bugs. It is a very Rumsfeldian plan that assumes you don't know what the bugs are, but you're certainly better off if they're fixed. This makes for a very aggressive upgrade pace and hopes for a better tomorrow rather than a stable today.
The complete opposite fear-based pseudo-plan is to stick to the existing version—come hell or high water—unless you run into an otherwise unfixable bug that affects your installation. This is based on the idea that the bugs we know are better than the bugs we don't know. Unfortunately, it ignores the bugs you don't even know exist.
This is the general recommendation of the PostgreSQL Global Development Group. The general idea is that all software has bugs, and upgrading is better than not upgrading. That is a bit over-optimistic about new bugs being introduced and kind of ignores that new features that you don’t care about have to be configured—or else.
This comes a bit closer to planning than guessing for minor versions, as the minor versions of PostgreSQL do not change the file system; they only change the binaries. These upgrades tend to be super heavy on bug fixes and very low on new features, which is where bugs tend to get introduced. It doesn't say anything about bugs you have actually encountered, nor does it say anything about any improvements from which you might be able to benefit.
Probably the most dangerous plan since you will never have more time in the future and will probably never upgrade. Experience says that this is a completely silly plan that never gets implemented.
Okay, this makes some kind of sense. Unfortunately, it ignores the rest of your installation and puts the application development team into tailspin mode for your DevOps enjoyment. It is the kind of policy you end up with when the DevOps team doesn’t really care about the Apps team.
Much of this guide is based on personal experience with PostgreSQL upgrades over the years. In some cases, the old was better than the new, and in others, the other way around. In some cases, the fixes worked immediately. In others, well, not so much.
Very few hard and fast rules can be drawn when coming up with a plan of this nature, but I'll try to bring the experience to bear in a way that helps to make a decision in the future. That being said, this is a "best practice" based on experience, not a "sure-fire thing."
As a way to reduce the amount of just sheer subjectivity and opinion around choosing the moment to upgrade, I've taken a look through the release notes of PostgreSQL. In this lookie-look, I've attempted to note where bug fixes occurred and mentally move them back to the version where they were discovered. Unfortunately, this task is also somewhat subjective, as I was not a part of the bug fix development or the bug discovery. So these are just educated guesses, but I hope rather good ones.
Then I looked at the mental list that I had made and thought about whether it matched my personal experience with successful versus unsuccessful upgrades. It (again) seemed a subjectively good indicator of when an upgrade succeeded or failed.
So, on to the findings.
The first thing I noticed in my research is that the biggest upgrade failures were with a new major version containing updates to the write-ahead log (WAL). These were most notable for versions 10 and 12.
Version 10 would make a book by itself. It was a major undertaking, with quite a few subsystem rewrites. In these version upgrades, there were numerous additions to items (like WAL for hash indexes), as well as improvements and changes to the background writer to support structural changes on disk. These major updates introduced the largest number of unintended behaviors, which lasted the longest before being detected and fixed.
The next most striking failures came from logical replication between 10 and 11. Of course, logical replication was invented for 10, so there had never been an attempt to use it for production upgrades before. This first use in the field was—how should I put it?—interesting.
After that, the bugs died down a lot but were never quite gone.
Here is my list of questions to ask before an upgrade.
1. How big is the change? Was it a major refactor, and did it involve any of the following?
2. Were there any huge performance gains?
3. Does it include major security fixes?
4. Are there major built-in function() improvements/enhancements?
5. Do all of my extensions exist for the new version?
These are my rules of thumb for whether a new PostgreSQL version is compelling for upgrade. Unfortunately, this still requires some subjective evaluation and a bit of professional knowledge. For instance, just because vacuum is a major feature, it doesn't mean it has ever been a problem with an upgrade. It could be, though, and we should look at its major changes with a bit of a wry mouth hold.
This brings me to my personal procedure that has (so far) followed the above guidelines.
CONCURRENTLY
, SYSTEM
, and performance. Things not to upgrade for: functions(), operators, and libraries.That's all there is to it.
I hope this blog post has helped you to make a decision for when PostgreSQL has compelling new features for you.
Of course, this is only a general rule of thumb. If you feel compelled to upgrade for some other reason, don't let my guide tell you what not to do. It only intends to help in the absence of any other stimuli for upgrade. You do you.
So now you have followed the checklist above and determined that it’s time for you to upgrade your PostgreSQL version. If you’re running a production database, this may be easier said than done, especially if we are talking about upgrading your major version (e.g., from PostgreSQL 13 to PostgreSQL 14):
Practical example: if you are upgrading from PostgreSQL 13 to 14, in PostgreSQL 14, the factorial operators ! and !! are no longer supported, nor is running the factorial function on negative numbers. What may seem like a silly example is, in fact, illustrative that assumptions made about how certain functions (or even operators) work between versions may break once you update.
Fortunately, PostgreSQL is awesome enough to provide clear Release Notes stating the changes between versions. But this doesn’t solve our problem: how to upgrade production databases safely?
This is one of the many areas in which choosing a cloud database will help. If you are self-hosting your mission-critical PostgreSQL database and want to run a major upgrade, you would have first to create a copy of your database manually, dumping your production data and restoring it in another database with the same config as your production database.
Then, you would have to upgrade this database and run your testing there. This process can take a while depending on your database's size (and if we’re talking about a time-series application, it’s probably pretty big).
Timescale makes the upgrading process way more approachable. Timescale is a database cloud for time-series applications built on TimescaleDB and PostgreSQL. In other words, this is PostgreSQL under the hood—with a sprinkle of TimescaleDB as the time-series secret sauce.
Timescale databases (which are called “services”) run on a particular version of TimescaleDB and PostgreSQL:
But what about upgrades between major versions of PostgreSQL? Since these are often not backward compatible, we cannot automatically upgrade your service in Timescale from, let’s say, PostgreSQL 13 to 14, which may introduce problems in your code and cause major issues!
Also, upgrading between major versions of PostgreSQL can (unfortunately but unavoidably) introduce some downtime. If you are running a mission-critical application, you want complete control over when that unavoidable downtime will occur. And you certainly want to test that upgrade first.
A database platform like Timescale can certainly help solve this issue. Upgrading your major version of Postgres will always be a decent lift—but a hosted database platform can make this process way smoother, helping you automate what can be automated and also facilitating your testing:
pg_stat_statements
—on the fork to ensure they don’t contain any breaking changes to the new major version. Let’s explore this more in the next section. If you’re not using Timescale, you can create a free account here—you’ll have free access for 30 days, no credit card required.
Here’s how you can safely upgrade your Timescale service:
Let’s see how this looks in the console.
First, check which TimescaleDB and PostgreSQL version your database is running on your service Overview page.
To fork your service is as easy as going to the Operations tab and clicking on the Fork service option. This will automatically create an exact snapshot of your database.
To upgrade your major version of PostgreSQL, go to your Maintenance tab. Under Service upgrades, you will see a Service upgrades button. If you click that button, your service will be updated to the next major version of Postgres (in the example below, the service would be upgraded from PostgreSQL 13.7 to PostgreSQL 14).
That’s it! You can now use the latest and greatest that PostgreSQL has to offer. That said, choosing to upgrade is no small feat. Before going through the upgrade process, there is a lot to consider, and it is important to have a plan to account for the downtime you will experience.
While the upgrade process can be a bit painful, you can at least rely on Timescale to handle the technical orchestration of the upgrade. In the future, we hope to offer even better tooling to make the upgrade process entirely pain-free (but we have to walk before we can run, right?).
If you’d like to see what Timescale has to offer, start a free trial if you haven’t already. There’s no credit card required!