Written by Dylan Paulus
PostgreSQL databases are the heart of any application, and the data held within is the lifeline of any organization. A security breach can spell disaster by eroding trust, leaking personal information, and inflicting irreparable damage. In this article, we'll learn why security is necessary, the different steps in securing PostgreSQL, and how these security concepts transfer to Timescale.
Databases store an application's most valuable asset: data. This data includes personal information like names, passwords, and street addresses, but it could also include company secrets. Having an insecure database is devastating for a few reasons:
Leaking personal data erodes trust from users and customers
Falling share prices or loss of revenue
You can be hit with regulatory penalties (e.g., GDPR)
In specific sectors, vulnerabilities can be used to cause physical harm
In September 2017, Equifax, a US-based credit reporting agency, announced that 147 million people had their personal information leaked, including names, addresses, social security numbers, and driver's license numbers. The results of the breach included a loss of trust in Equifax, danger for each person whose data was leaked, a settlement, and a $1.4 billion cleanup cost. Though not directly caused by PostgreSQL, the Equifax situation shows the gravity and importance of good security practices.
Security researchers and hackers play a cat-and-mouse game. Hackers try to exploit a system, and researchers find solutions to prevent exploitation. Though the low-level details may change to catch up with this cat-and-mouse game, there are timeless best practices to secure PostgreSQL that can prevent most attacks.
We'll cover five central components to securing a PostgreSQL database. These include:
Network security
Access control
Encryption
Failover and backups
Auditing and logging
A solid foundation in each of these components will strengthen your PostgreSQL security posture.
Network security is the first line of defense against unauthorized access to your database. A server that blocks all network connections is much more secure than a publicly accessible server. However, a database that cannot accept connections is useless; every PostgreSQL instance needs some network exposure. The goal of securing the network is to limit and lock down how and where someone can access PostgreSQL to only the essentials.
pg_hba.conf is a configuration file used by PostgreSQL to control client authentication and access to any database (HBA stands for host-based authentication). Think of it like configuring a firewall. Each record in pg_hba.conf defines a connection type, an IP address range a client can connect from, a database the client can connect to, a username, etc. This allows us to be very fine-grained about which IP addresses we allow connections from and how connections can be made.
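As an illustration (the database, role, and subnet here are hypothetical), a pair of pg_hba.conf records that allow one database to be reached only from a trusted subnet over SSL, and reject everything else, might look like:

```
# TYPE    DATABASE   USER     ADDRESS         METHOD
hostssl   analytics  analyst  10.0.5.0/24     scram-sha-256
host      all        all      0.0.0.0/0       reject
```

Records are matched top to bottom, so more specific rules should come before the catch-all reject.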
In tandem with pg_hba.conf, postgresql.conf provides further options to lock down the network to PostgreSQL:
listen_addresses controls which network interfaces PostgreSQL listens on for connections. By default, this is set to localhost, which restricts access to the local machine. Listening only on specific IP addresses limits access to trusted hosts and further minimizes the potential attack surface.
max_connections limits the number of concurrent connections to help prevent unauthorized access attempts. Ensuring the database is not overwhelmed with excessive connection requests mitigates the risk of denial-of-service (DoS) attacks.
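Put together, a minimal sketch of these settings in postgresql.conf (the IP address and connection cap are illustrative values) could look like:

```
# postgresql.conf (illustrative values)
listen_addresses = 'localhost, 10.0.5.12'   # local machine plus one trusted app server
max_connections = 100                       # cap concurrent connections
```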
When we think of database security, the first thing that comes to mind is access control: creating user accounts and limiting their access through permissions. In PostgreSQL, we have roles and privileges. A role can be considered the same as a user or group. Our database can have a programmer role, which all programmers use to access PostgreSQL. Or, we can have a dylan role that only I can use to query data. Roles then get assigned privileges that provide or limit functionality in the database.
To create a new role, we can use the CREATE ROLE command:
CREATE ROLE analyst LOGIN;
This creates an analyst role that can be used to log into the database. The role does not have a password, but we can create/require one by including PASSWORD [password].
CREATE ROLE analyst LOGIN PASSWORD 'unicorncat';
Wait, isn't there CREATE USER? Digging through the PostgreSQL docs, you'll find SQL to create a user conveniently called CREATE USER. This creates a role with the LOGIN privilege assigned to it, allowing the role to log in to a database. However, it is just a convenience wrapper around CREATE ROLE and does not actually create a separate kind of user.
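Group-style roles work through membership. A brief sketch (role names and password are hypothetical), echoing the programmer and dylan roles mentioned earlier:

```sql
-- A group role that nobody logs in as directly
CREATE ROLE programmer NOLOGIN;

-- A login role for one person, made a member of the group
CREATE ROLE dylan LOGIN PASSWORD 'change-me';
GRANT programmer TO dylan;
```

Privileges granted to programmer are then inherited by dylan, so access can be managed per team instead of per person.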
RBAC, or Role-Based Access Control, is important in reducing the attack surface available to a bad actor, a disgruntled employee, or even a well-meaning employee who makes a mistake. A common practice in cybersecurity is the Principle of Least Privilege: we only give roles the most basic privileges they need to do their job, nothing more. This prevents a data analyst from accidentally deleting rows or a compromised developer account from moving volume mounts.
To add privileges to roles, use the GRANT command on any database object (table, column, view, etc.). For example, we can give our analyst role only SELECT permissions on a single table by running:
GRANT SELECT ON [my_table] TO analyst;
Alternatively, a shortcut to give a role all permissions is to use ALL PRIVILEGES:
GRANT ALL PRIVILEGES ON [view_name] TO analyst;
There are many privileges we can assign to roles to fine-tune access and enforce the Principle of Least Privilege.
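For instance (the table and column names here are hypothetical), privileges can be narrowed down to individual columns, and the blanket access every role inherits through PUBLIC can be revoked:

```sql
-- Revoke the default privileges all roles inherit via PUBLIC
REVOKE ALL ON orders FROM PUBLIC;

-- Allow the analyst role to read only non-sensitive columns
GRANT SELECT (order_id, created_at, total) ON orders TO analyst;
```

With this in place, a query from analyst that touches any other column of orders fails with a permission error.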
Encryption encodes data so that it can only be deciphered and understood by authorized individuals. Think of encryption as putting your secret documents in a lockbox. With a key, you have access to the documents. Sharing the key with a friend also gives them access. But if a thief steals the lockbox, they won't be able to read your secret documents without your key. You may have lost your documents (and lockbox), but at least the thief can't read them! Data encryption is an important layer and an added safety measure in securing databases.
You'll generally hear about two categories of encrypting data:
Encryption in transit: obfuscating data moving between two systems (e.g., your application to PostgreSQL)
Encryption at rest: obfuscating data sitting on a hard drive (e.g., data inside your PostgreSQL database)
SSL (Secure Sockets Layer), now predominantly succeeded by TLS (Transport Layer Security), is a critical defense mechanism against data breaches by encrypting the data transmitted between clients and the PostgreSQL server. This encryption is vital for thwarting man-in-the-middle attacks, where an attacker could intercept and potentially alter or steal the data in transit. By implementing SSL/TLS, PostgreSQL ensures that all data exchanges are encrypted, rendering any intercepted communication unreadable and useless to eavesdroppers. Enforcing SSL for all connections enhances data security as it moves across the network and reinforces the integrity and trustworthiness of the database environment.
To set up SSL, connect to your PostgreSQL instance through a terminal (SSH) or open a terminal instance on the server. We will need OpenSSL; you can check if it's installed by running openssl -v (if it's not installed, check out the OpenSSL website).
Navigate to PostgreSQL's /data directory, which contains the postgresql.conf file (in Homebrew, this is located under /usr/local/var/postgresql@14; the location will vary depending on how you installed PostgreSQL and which operating system you use).
We will need to create three files: server.key, the private key; server.crt, the server certificate; and root.crt, the trusted root certificate.
Create the server.key and server.crt certificates by running (since we're self-signing the certificate, we can do this in one command):
openssl req -nodes -new -x509 -keyout server.key -out server.crt
You'll be prompted for information that gets baked into the certificate. Next, we'll create the root.crt certificate by copying the server.crt certificate:
cp server.crt root.crt
For added security, and because PostgreSQL requires restricted permissions on server.key, run chmod to make server.key readable only by its owner:
chmod 400 server.key
Finally, we must tell PostgreSQL to enable SSL and where to find the certificates. To do this, open postgresql.conf, found in the same directory where we created the certificates, and uncomment the following parameters:
ssl = on
ssl_ca_file = 'root.crt'
ssl_cert_file = 'server.crt'
ssl_crl_file = ''
ssl_key_file = 'server.key'
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
ssl_prefer_server_ciphers = on
Then open pg_hba.conf and update the section under IPv4 local connections to require SSL by changing this:
host all all 0.0.0.0/0 md5
to
hostssl all all 0.0.0.0/0 md5
Now restart PostgreSQL and connect to it. If SSL is set up correctly, you will see output confirming that we are connected using SSL. Congrats!
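One way to verify from a client (the connection parameters here are placeholders) is to tell psql to insist on SSL with the sslmode parameter; the connection will fail if the server can't negotiate SSL:

```shell
# Refuse to connect unless the server offers SSL
psql "host=localhost dbname=postgres sslmode=require"
```

On a successful connection, psql's banner reports the SSL protocol and cipher in use.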
We often want to encrypt the actual data stored in PostgreSQL. For example, our database might be great at locking down access to sensitive tables, such as authentication tables that store usernames and passwords. But admins should never have access to individual user passwords stored in plain text (what if someone is looking over the shoulder of the DBA?). To fix this problem, we should protect the data stored in sensitive columns.
PostgreSQL has the pgcrypto module, which provides encryption and hashing functions. This allows us to protect the data stored in columns. For example, pgcrypto exposes the crypt() and gen_salt() functions. Together, these two functions allow us to salt and hash user passwords natively in PostgreSQL by running INSERT INTO accounts (password) VALUES (crypt('[user secret password]', gen_salt('md5')));.
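Verification works the same way: crypt() reuses the salt embedded in the stored value, so comparing against the stored column checks a candidate password (the table and column names here are hypothetical):

```sql
-- Returns the matching row only if the supplied password is correct
SELECT *
FROM accounts
WHERE email = 'user@example.com'
  AND password = crypt('candidate-password', password);
```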
To learn more, read our article on how to take advantage of pgcrypto!
A failover setup relies on replicas: standby PostgreSQL instances that can handle traffic if the primary database fails. During the failover process, the primary database experiences a failure or crash, and an exact copy of it, the replica, gets assigned any new database connections, taking the place of the primary until it becomes healthy again.
Implementing failover replicas in PostgreSQL, although not natively supported as a built-in feature, is crucial for ensuring high availability, data redundancy, and system resilience during outages or malicious attacks.
Though PostgreSQL doesn't have a built-in failover feature, the tools exist to create it. The main tools that enable failover in PostgreSQL are streaming replication and WAL archiving. Streaming replication mirrors updates from the primary database to the replica databases in real time, ensuring that any replica instance is in parity with the primary database at any moment.
WAL Archiving is a safeguard to recover any transaction not yet streamed to replicas at the point of failure. Streaming replication and WAL archiving provide a resilient way to create replicas for failovers. However, we must still detect when a failure occurs and route traffic from the primary database to a replica.
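As a sketch (the archive path is a placeholder, and the values are illustrative), the primary's postgresql.conf enables WAL archiving and replication connections with settings like:

```
# postgresql.conf on the primary (illustrative values)
wal_level = replica                            # record enough WAL for streaming replication
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'  # copy each completed WAL segment offsite
max_wal_senders = 5                            # allow replicas to stream WAL concurrently
```

In production, archive_command usually ships segments to durable remote storage rather than a local copy.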
Tools like repmgr, pg_auto_failover, and Patroni address this by automating the failover process by monitoring database health, detecting failures, and switching to a replica database. Of course, Timescale provides automatic failover support through high-availability replicas.
Incorporating failovers into your PostgreSQL setup enhances its resilience against operational failures and is an added defense against malicious intrusions, making your data infrastructure more robust and secure.
Having a database backup is important for recovery and can be a massive asset in tricky security incidents. If a bad actor were to get access to your PostgreSQL server, they could lock everyone out of the server and demand money to restore access, known as a ransomware attack. A ransomware attack is terrible. However, if we have database backups on a separate server, we can rebuild the database on a new server to reduce downtime for our customers.
PostgreSQL provides both logical and physical processes to back up your databases.
Logical backups contain SQL statements used to recreate your database. pg_dump is a CLI utility that generates all the SQL commands needed to recreate your database. Run pg_dump [name_of_your_database] > backup.sql from a PostgreSQL server terminal. At any point, we can run psql [name_of_your_database] < backup.sql to restore the database from a backup. Logical backups provide granular control over what is backed up and are highly reliable, but running SQL statements to recreate a large database can be slow.
Physical backups involve cloning PostgreSQL's data files on the server's filesystem. Since the files are the database, these point-in-time backups restore quickly. PostgreSQL comes with pg_basebackup, a CLI utility that performs physical backups of the data files and WAL.
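A minimal pg_basebackup invocation (the target directory is a placeholder) might look like this, run as a user with replication privileges:

```shell
# Take a compressed, tar-format physical backup, streaming the WAL needed for a consistent restore
pg_basebackup -D /backups/base -Ft -z -Xs -P
```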
More information about backing up PostgreSQL can be found here.
Securing backups and having processes to review backups are just as important as having backups. Store backups offsite, away from your PostgreSQL server, to prevent data loss if you ever lose access to PostgreSQL. Encrypt the backups to prevent unauthorized access; backups still hold sensitive data! Additionally, spend time testing and restoring backups. This ensures that your backups are valid and familiarizes you with the process before a disaster occurs.
Auditing is a challenging security component to implement because it is not a technological change but a human process. Periodic check-ins should be established to review logs for irregularities, reevaluate access control permissions, and find opportunities to improve security practices. How often you run audits is entirely up to you. However, running them at a regular cadence is in your best interest to keep up with the changing security landscape.
Part of the auditing process should be reviewing PostgreSQL's logs. Many times, we are only concerned with crash logs and error reports. However, logs also give a timeline of PostgreSQL's actions and database changes made by roles. Reviewing logs thoroughly will reveal any suspicious behavior. Review your postgresql.conf file's settings to ensure logging is enabled, logs are backed up, and PostgreSQL logs the information needed to monitor suspicious behavior.
A few recommended changes to postgresql.conf:
logging_collector = on: Turns on the log collector, which captures stderr output and writes it to a file in a separate daemon, making sure logs don't block PostgreSQL's regular database operation.
log_line_prefix = "%m [%p] user=%u db=%d: ": By default, PostgreSQL prefixes logs with %m [%p], the timestamp (%m) and the process ID (%p), but we can change the prefix to increase visibility into who is doing what.
log_connections = on: Enables PostgreSQL to log connections and attempted connections. Defaults to off. Enabling log_connections will give insight into bad actors trying to brute-force their way into your PostgreSQL database.
Outside of PostgreSQL proper, tools like pgAudit provide additional functionality for auditing database access and usage.
Timescale builds in all five security components without requiring you to configure PostgreSQL manually. For example, all connections to a Timescale database are encrypted through SSL, and all data volumes are encrypted, so your database is already encrypted at rest and in transit. Tasks that require a significant maintenance effort, like creating backups, are handled automatically without your intervention. Backups through Timescale are made weekly, with incremental backups performed daily. Timescale adds additional layers of security through multi-factor authentication and 24/7 support and monitoring.
Security is vital in managing PostgreSQL, as data is a company's most important resource. Strengthening the five components of PostgreSQL security can help you pass audits, retain customers, and prevent fines. Luckily, Timescale makes security easy and even provides additional security measures.
Try Timescale for free today by visiting https://console.cloud.timescale.com/signup.