May 10, 2023
Posted by
Chris Travers
September’s commitfest is over with 65 committed patches, 40 patches returned with feedback, 177 patches moved to the next commitfest, 3 rejected, and 11 withdrawn. From a PostgreSQL developer's perspective and beyond, the patches include a large number of improvements in a large number of areas.
In this new blog post series pilot, I have selected a few patches that I find particularly interesting and helpful, and which I feel I can easily communicate their importance to a general audience of PostgreSQL users. This is by no means a comprehensive list of committed patches of interest, and in particular, patches that improve code quality or set the foundations for new features somewhere in the distant future are not included in this review. However, in my discussion with PostgreSQL developers, the improvements that I heard a lot about involved type-safety improvements not on this list.
In this article I have selected five patches, three of which are committed and two returned with feedback, for discussion. I will focus on their utility to database users and application developers from my PostgreSQL developer point of view (POV).
Let’s start with the committed patches.
PostgreSQL manages memory by lifetime and allocates based on either a “chunk allocator” or a “slab allocator” (the latter being designed specifically for logical replication contexts, and not relevant to this patch).
Allocation sets are arranged in a hierarchy relating to memory lifetime within the software. This prevents one from having to free memory at a defined point later, as the system can just do this later, at a defined point in time. For example, if we allocate memory with a lifetime related to the processing of a row, then the memory will be reused or freed when PostgreSQL moves on to process the next row. If we allocate to the lifetime of the transaction, then the memory is freed when the transaction commits or rolls back. It also means that in some cases, PostgreSQL can just reuse a chunk of memory without having to do significant processing of it other than the header.
The chunk allocator also increases the memory allocation on each subsequent call for an allocation set. The first chunk in the allocation set is 8 kB in size, and each subsequent chunk allocated doubles until one reaches 1 GiB. A given allocation within a set cannot span chunks, and this is why you cannot allocate (compressed or not) more than 1 GB of data within PostgreSQL in C using the palloc memory allocation interfaces.
This patch provides a number of important memory improvements in this regard. Although PostgreSQL memory management is well-optimized performance-wise and avoids most of the problems plaguing C programmers generally, this introduces further improvements.
Databases are environments where memory allocation efficiency matters for a number of reasons. In addition to pure memory savings, the fact that these may allow more data to fit in a given chunk means fewer malloc() calls will happen, and this is likely to produce performance improvements as well.
This patch does not affect all aspects of memory in PostgreSQL. While data coming into shared buffers or from shared buffers back to disk are not affected by this change, the data allocated for processing data extracted from shared buffers (or processed for writing to shared buffers) is affected.
This also shows how even very well-optimized systems can sometimes still have opportunities for further improvements, and it shows that even in mature systems, there are often gains to be made by those with the knowledge and insight to find them.
PostgreSQL’s logical replication is built on a publisher/subscriber model, where a publisher exposes a series of changes that a subscriber ingests and writes locally. In current releases of PostgreSQL, it is not really possible to have a loop of subscribers and publishers. Replication must always be only unidirectional.
For the most part, this works well because bidirectional replication poses fundamental (mathematical) conflict resolution problems that are impossible to solve where authoritative data is required. For example, if the same row is updated on two different mutually replicating systems, there is insufficient information to determine what the final output state is or should be. For this reason, cyclic replication topologies—often called “multi-master replication”—are generally frowned upon.
A common approach to addressing this problem is to use “last update wins” as a strategy, but this approach necessarily clobbers existing updates. In my previous work in other companies, where I was using such complex replication topologies with other databases, we actually had to take steps to prevent conflicting updates elsewhere in the infrastructure for this reason.
This patch allows, for the first time, logical replication loops. In other words, while a subscriber can already republish data it subscribed to, the original publisher could now, with new options set on the subscription, subscribe to data that could be republished from its own subscriptions. This can be done while preventing replication loops by setting an “origin” option in the subscription. The patch is then intended to prevent replication write loops, where the same insert or update is replicated back and forth forever.
Logical replication is a completely different beast when compared to physical streaming replication. It has completely different use cases and pitfalls than the latter. It imposes very different administrative burdens as well. However, this is a massive leap forward towards a community-owned bidirectional logical replication capability, which will likely open some doors for PostgreSQL where replication topologies based on cyclic graphs (often called multi-master replication) are actually worth the significant costs.
ORDER BY
/ DISTINCT
PostgreSQL supports various aggregates where collation operations are important, such as ntile, percentile, and other ordered set aggregates, window functions, and aggregates with the DISTINCT
modifier. The efficiency of these operations is affected by the sort ordering of the table scans, and in current versions of PostgreSQL, the planner does not take this into account. As a result, ORDER BY
and DISTINCT
in aggregate functions can lead to unnecessary sorts, which lead to slower query performance.
Having proper planner support for these sorts of operations is a significant performance win for anyone using these sorts of aggregates. In my experience, there are many users of these sorts of features, especially those whose workflows include both transactional and decision support workflows. This patch represents another significant improvement for analytic workloads on PostgreSQL.
Now let’s address the two patches that haven’t been yet committed.
Not every interesting patch that caught my attention here got committed this time around. Two important patches will get discussed further as they progress through peer review and the commitfest feedback process. Both of these patches are currently listed as “returned with feedback” but are of sufficient importance or near enough to completion that they are worth watching anyway.
Kerberos, the authentication protocol that was developed by MIT, and available on many platforms before being incorporated into Microsoft’s ActiveDirectory, has the capacity to pass delegated credentials between hosts. PostgreSQL can already accept delegated credentials for authentication but currently cannot delegate credentials.
So, for example, when a user accesses an internal ASP.net application on a company intranet, the web server can authenticate the user via Kerberos, and then pass on a delegated credential to the database if needed for actual access. This approach allows fine-grained and redundant control over access to data with a great deal of defense in depth.
What PostgreSQL is not currently able to do is delegate Kerberos authentication, which means that Kerberos authentication cannot be used between PostgreSQL nodes over things like foreign data wrappers.
A proposed patch would ensure that PostgreSQL could delegate Kerberos credentials to libpq connections, allowing this to be used by the PostgreSQL foreign data wrapper, dblink, and similar extensions. While this patch has been listed as “returned with feedback,” the general sense is that this would be a really useful feature, and so I would read that status at present as a note that the current patch has some problems that need to be rethought before it can be accepted.
The fundamental difficulty here is that Kerberos session encryption does not provide forward privacy or forward security, and therefore, when session encryption is used with credential delegation, the user could potentially break the encryption on the middle host. In the event that passwords are also used for a foreign data wrapper connection, this would render some protections against less robust sensitive data or even in some cases (for example, where passwords are used in a passthrough way to authenticate against a third-party provider) allow for password disclosure.
The overall consensus is that this is a feature that would be very helpful in PostgreSQL, something I believe as well. The patch rejection is also a great testament to the degree of peer review in the community that goes into security-critical code paths. I expect that sooner or later, this feature will be included with appropriate safety measures in place, and I hope it is resubmitted sooner rather than later.
In PostgreSQL, roles (which include users) are global to an instance of PostgreSQL (called a “cluster” in PostgreSQL terminology). This means that if you have several databases managed by the same PostgreSQL instance, the roles and role memberships are common throughout all databases.
One major difficulty in writing multi-tenant applications which use databases as the tenant boundary is that if you assign a user with a role in one database, this occurs in all databases. For example, if you have an accounting application and you have a user “chris” who has different permissions in three different databases on the same server, you cannot use a simple, consistent set of roles to manage database permissions.
One solution to this problem (which we did when I was building out such a system in LedgerSMB) is to create different roles on each database, including, for example, the database name in the role name. This leads to a certain degree of complexity and makes the role names harder to read. Another option would be to limit users to a single database. This adds complexity in a different place.
A proposed patch would allow the ability to grant role permissions only on a specific database. For multi-tenant applications, this would be a game changer. The patch was near acceptance when there was some further discussion on documentation. It needed to be rebased and corrected, and the initial submitter did not reply in a timely manner. This can happen—people can become busy or unavailable for one reason or another, and patches can end up temporarily orphaned.
Given the general utility of this patch for multi-tenant applications, I would like to see this fixed and resubmitted sooner rather than later.
The patches discussed here represent small but useful steps forward for PostgreSQL. These and many others make PostgreSQL a database that is improving significantly with each major release.
Commitfests such as these provide great insight into this process, the review that patches undergo, and how PostgreSQL keeps moving forward.
That’s one of my favorite things about PostgreSQL: it’s a database that is significantly improving with each major release. If you want to add even more functionality to PostgreSQL, explore TimescaleDB—it extends PostgreSQL with things like automatic time-based partitioning and indexing, continuous aggregations, columnar compression, and time-series functionality. And if you’re using a managed service for PostgreSQL, try Timescale—it’s free for 30 days, no credit card required.