Where Did PostgreSQL Come From?

An elephant in a data center in purple and dark blue hues.

Written by Kirk Laurence Roybal

PostgreSQL is now a fully matured database solution with all the bells and whistles. It is more of a development ecosystem with relationships in the middle than a simple storage and retrieval system.

Let’s talk about our favorite open-source relational database.

PostgreSQL: What’s in a Name?

If you take a look at the name of the project, you’ll find all you need to know. In typical PostgreSQL fashion, you have to know it all first before you can look. Let us explain. No, there is too much. Let us summarize.

Back in the 1970s, there was much hullabaloo about “new math.” New math was just a new way to package the ideas of set theory. Everybody was enamored with it for a while, and children were even taught simple arithmetic by combining sets of apples or sticks (or whatnot) and comparing them to one another. Set theory was all very vogue and hip.

image

At that time, some mathematicians got together and asked themselves (as a mathematician is wont to do) how to create software to answer questions posed in the form of set theory. The result was a language called QUEL and a storage facility called GRE. We will concentrate on the GRE part, which stands for “Graphic Retrieval Engine."

You might wonder what was so graphic about it for the word to appear in the name. Well, remember, we are talking about the 1970s here, and we are still talking about set theory. So “graphic” would be something like a very early entity relationship diagram (ERD). 

That is, the data (sets) could be illustrated as relationships to other data (sets). Aha! We just discovered why the PostgreSQL team still refers to data sets as “relations” (as opposed to tables, rows, or just data).

image

This development occurred mostly as graduate projects at the University of California in Berkeley. At some point, all of our students graduated, and went to commercial companies. One of these was Ingres, a company that wrote a database engine (also called Ingres) that was based on (you guessed it) GRE. “Ingres” means “Interactive Graphics and Retrieval System.” Ingres was successful for a time and spawned several commercial forks, such as Informix (and others).

Along came the 1980s, and with it came the advent of the PC and some ideas about free software distribution. Another implementation of GRE was created by the same developers (mostly), which came after Ingres. Given the wonderful naming conventions of the early authors (who thought partial words were a cute nod to set theory), the project was called “PostGRES.”  That is, the system that came after Ingres, and was also written in GRE.

In parallel with all this hilarity, the original QUEL language was also reintroduced, this time with the moniker “PostQUEL.” We don’t know about you, but we're laughing ourselves silly right now.

However, there was also a new and upcoming challenger to the QUEL language, innovatively called structured query language (SQL). This newcomer seemed more flexible and expandable than QUEL and had a following in other database circles. So, a couple of nerds decided to weld it onto the Postgres project in 1994.

Man, that worked out amazingly well. That language is still with us today, and in 1996 the Postgres project became so committed to it that they welded it onto the name of the product also.

Thus we have the final PostgreSQL project name, which has stuck with us until today. Just be grateful that the name doesn’t also have XML, JSONB, and BTREE tacked on there as well.  There were advocates for that 😉. Ughh.

How Do You Say PostgreSQL?

It depends on when you entered the project. The original project maintainers say “postgrey” or “postgres,” based on the original database or project name. Some say “postgress” (emphasis on ss). These folks showed up from 1994 to 1996 and were probably not core developers.   

After 1996, the official name was pronounced “post-gress-queue-ell.” Even though that pronunciation was supposed to be a disambiguation, it didn’t catch on as well as the community had hoped. Most of the community now says “post-grey sequel,” which is based on the American pronunciation of the portmanteau.

How you say it mostly depends on when you entered the project and which direction you came from when you got here. If you want to sound OG, say “post gress queue ell.” Otherwise, say whatever you want. Just for Deity’s sake, don’t say “post gree.” 

A Few Words About SQL

Structured query language, more famous as SQL, is the language you use to create and work with some databases. At the time that PostgreSQL added the SQL language, it was fairly primitive (both PostgreSQL as well as SQL). Both systems showed a lot of promise, though.  

For most of the 1990s and early 2000s, the PostgreSQL project worked on stability and basic features. The language of SQL expanded as the database capabilities grew, and vice versa.

The Postgres project remained largely true to its university origins and participated in many design discussions for the formation of SQL and databases in general. In fact, that is still the case today. PostgreSQL has huge inroads into the education community and still has quite a bit of input into the request for comments (RFCs) for database-related subjects.

But the point of this side stop in our story is to say that SQL is here to stay for the foreseeable future. Many “simplifications” have come and gone, and the story arch is much the same every time.   

The question is asked, “How can we make this SQL thing simpler?” The answer is some type of declarative syntax that shows the basic relational functions easily. 10. Then, the realities of database complexity show up, and that syntax is expanded: GOTO 10. 

At some point, the “simple” syntax is exactly the same complexity as SQL. This is inevitable because the SQL language evolved at the same rate as the database features. By the time you describe all of the features of the database accurately, you will have a clone of SQL.

At this point, this new syntax is irrelevant and gets thrown in the garbage heap of history with the rest of them.

So, What Is PostgreSQL, Anyway? Huh, huh?

PostgreSQL is a combination of software and data structures that display relationships. That definition, however, is condensed to the point of libel.

PostgreSQL has been worked on for over 30 years (at the time of this writing) by thousands of developers from every walk of life. So, the original definition still stands but belies the complexities that have been introduced over the last three decades.

In addition to the base functionality, it now has replication, foreign data, pluggable languages, and indexes. The SQL language implementation has grown dramatically for three decades, and the number of algorithms to support data display is enormous.

As we said at the top, PostgreSQL is now a fully matured database solution with all the bells and whistles. It is more of a development ecosystem with relationships in the middle than a simple storage and retrieval system.

PostgreSQL benefits: What’s in the box?

Here’s a concise quote from Wikipedia: “PostgreSQL features transactions with atomicity, consistency, isolation, durability (ACID) properties, automatically updatable views, materialized views, triggers, foreign keys, and stored procedures.[18] It is designed to handle a range of workloads, from single machines to data warehouses or web services with many concurrent users. It was the default database for macOS Server[19][20][21][22] and is also available for Linux, FreeBSD, OpenBSD, and Windows.”

In other words, everything is in the box. It’s easier to say what isn’t there yet than to enumerate everything that is.

PostgreSQL limitations: So, what’s not in the box?

Multiprocessing of a single query still has a long way to go. Every release of PostgreSQL improves on this feature. This is one of the heaviest contributed areas in the last five or six major versions.

PostgreSQL is still based on a single intake system, otherwise known as vertically scalable.  Sharding and multi-node setups are either third-party or non-existent.

The connection system is process-based, not thread based. It is based on the idea that PostgreSQL will be asked to do a lot of work. Therefore, a process should be laid on a CPU and memory on a corresponding non-uniform memory access (NUMA) processor. Leave it there as long as possible and let it do its thang. This provides for great throughput and miserable visibility.

PostgreSQL does not have very good column data storage. Yes, there are extensions to mitigate that (cstore and hstore), but they are not very popular.

There are, of course, a non-zero amount of things missing or inherently flawed in the PostgreSQL system design. If you want to spend some time cursing, you can read the documentation on implementing multi-version concurrency control (MVCC). Try it as a drinking game. Every time the documentation says, “There may be some overhead there,” take a shot. It makes it a lot more fun.

Where to Go From Here

The history of PostgreSQL and the people involved in the creation of the product are pretty well described in the main PostgreSQL Wikipedia article. Go ahead and follow the links in the first paragraph or two; there are some interesting tidbits. Also, you can read the articles on QUEL query languages.

Also, show up at a conference. Any conference for PostgreSQL, it doesn’t really matter which one. Most innovators are still around and have no problem telling the war stories of how PostgreSQL became the poster child of open-source development and community.   

See you there.

More on PostgreSQL

Getting started? Learn how to fix common errors in PostgreSQL databases:

Want to make the most of PostgreSQL? Read how the relational database’s co-founder, Michael Stonebraker, scales PostgreSQL in his new IoT business venture.