Written by Dylan Paulus
Foreign keys in PostgreSQL are a fundamental way of defining relationships between tables. Splitting data across multiple tables is foundational to relational databases and data normalization. However, when spreading related data across various tables, we must ensure that values in one table are valid and exist in another. Enter foreign keys.
In this article, we'll see examples of how to use foreign keys in PostgreSQL, understand what they're used for, and apply different variations of foreign key operations to maintain consistency.
A foreign key "links" two tables together. It does this through a column in a child table, which stores the primary key of a row in a parent table. A foreign key is a constraint we add to the child table column that enforces the primary key to exist in the parent table. This is called referential integrity. By maintaining referential integrity, we keep our databases maintainable. PostgreSQL will throw an error if someone inserts a row into a child table with an id
that doesn't exist in the parent table.
To really get a grasp of foreign keys, let's look at an example through the lens of an online store. We have the following tables to track orders customers have made:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT,
address TEXT
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL
);
Right now, there are no foreign keys or links between the customers
and orders
tables. To ensure every order belongs to a customer, we can add a new column and foreign key to an existing table using ALTER TABLE
.
ALTER TABLE orders
ADD COLUMN customer_id INT NOT NULL
CONSTRAINT fk_customers REFERENCES customers(id); -- the foreign key
In this statement, we added a new column to the orders
table called customer_id
, which will hold the primary key (ID) from the customers
table. Next, we tell PostgreSQL to create a constraint named fk_customers
on the customer_id
column to enforce any data in that column is found in the id
columns of the customers
table. This constraint is the foreign key. By convention, we name foreign keys with the prefix fk_[name]
to differentiate foreign keys from other constraints and indexes.
Now add customer and order data:
INSERT INTO customers (name, address) VALUES ('Dylan', '1234 E Street');
INSERT INTO customers (name, address) VALUES ('Karisa', '345 Oak Lane');
INSERT INTO customers (name, address) VALUES ('Nora', '1111 Red Blvd');
INSERT INTO orders (price, customer_id) VALUES (3.50, 1);
INSERT INTO orders (price, customer_id) VALUES (8.75, 3);
INSERT INTO orders (price, customer_id) VALUES (1.20, 1);
We can observe the power of foreign keys by inserting a new row into the order
table using a customer_id
that does not exist in the customers
table.
INSERT INTO orders (price, customer_id) VALUES (1.53, 40);
Our foreign key works! Orders must have a valid customer associated with them. If an invalid customer_id
is given, then PostgreSQL will prevent the insert from running.
As we've seen, foreign keys allow us to split tables apart to normalize data and maintain relationships. They help keep referential integrity by guaranteeing that the primary key stored in the child table is, in fact, an actual entry in the parent table. But this isn't everything. Additionally, we can set up foreign key operations to modify all related rows simultaneously.
In the orders
to customers
example, we first created both tables and then added a foreign key from orders
to customers
using customer_id
. This is an example of adding a foreign key *after* a table is created, but what if we know the parent and child tables while creating a table? We can add a constraint during table creation:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT,
address TEXT
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int CONSTRAINT fk_customers REFERENCES customers(id)
);
When defining a foreign key, we have several options at our disposal. For instance, we don't necessarily need to assign a name to every foreign key; PostgreSQL can generate a name for us. Let's consider the process of removing the CONSTRAINT fk_customers
:
Additionally, PostgreSQL will assume that foreign keys will reference the primary key of a parent table, so we can omit (id)
when creating the foreign key.
What happens when a parent table deletes a row that a child table relies on through a foreign key? Using the example above, what happens to customers' orders if a customer is deleted? Let's try it.
DELETE FROM customers WHERE id=1;
By default, PostgreSQL employs the NO ACTION
operation on any foreign keys. This means that if a child table's row still references a parent table, PostgreSQL will not allow the deletion of the parent row, throwing an error instead. However, we can work around this by first deleting any order associated with customer.id=1
, and then we will be allowed to delete the customer.
NO ACTION
prevents deleting a parent row until all associated child rows are deleted. Additionally, NO ACTION
prevents someone from updating the primary key of the parent row while it has relations.
Foreign key operations allow us to define what happens to child rows if a parent is updated or deleted. This is done by adding ON UPDATE
or ON DELETE
to the constraint.
For example, If we wanted to define the NO ACTION
operation on delete explicitly, we create our table like so:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON DELETE NO ACTION
);
Of course, PostgreSQL gives us different options for how associated rows behave when a parent row is removed through foreign key operations.
RESTRICT
works exactly the same as NO ACTION
, except the constraint check gets deferred until the end of the query transaction. This can be useful for complex scripts, but generally, NO ACTION
will be used instead.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON DELETE RESTRICT
);
SET NULL
will replace the value of the child column with NULL
when the parent is deleted. This can be useful when we want to track child references even after deleting a parent.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON DELETE SET NULL
);
When using SET NULL
with ON UPDATE
, all orders associated with them will be set to NULL
whenever the customer's primary key changes.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON UPDATE SET NULL
);
CASCADE
will remove all associated rows when a parent table row is deleted. From our example, if a customer is deleted, all their orders will also be deleted. Using CASCADE
is a great way to keep your database clean.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON DELETE CASCADE
);
Paired with ON UPDATE
, CASCADE
will update all child rows with the new primary key when the parent row is updated.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int REFERENCES customers ON UPDATE CASCADE
);
SET DEFAULT
, like SET NULL
, will update the child row when the parent row is removed, except it sets the value to the column's default value.
Say we have a system
customer to whom all orders go when a customer is deleted. By first setting customer_id
to a default value and using SET DEFAULT
, we can ensure that all orders are automatically assigned to the system
customer.
INSERT INTO customers (id, name, address) VALUES (0, 'system', '127.0.0.1');
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int DEFAULT 0 REFERENCES customers ON DELETE SET DEFAULT -- Assign to customer with id 0
);
Similarly, when using the ON UPDATE
operation, the child rows will be set to the default value when the parent row is updated.
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
price DECIMAL,
customer_id int DEFAULT 0 REFERENCES customers ON UPDATE SET DEFAULT
);
Foreign keys are a powerful feature in PostgreSQL that enables us to maintain data integrity and establish meaningful relationships between tables. By enforcing referential integrity, foreign keys ensure that our database remains consistent and reliable even as data evolves.
If you have a large PostgreSQL database with a massive number of tables linked via foreign keys and want to make the move to Timescale, as of TimescaleDB 2.16.0, we support foreign keys between regular PostgreSQL tables and hypertables. Hypertables work like regular PostgreSQL tables but automatically partition your data, speeding up your queries and ingests. This performance boost will sustain as your tables' volume keeps growing, making hypertables extremely scalable.
Hypertables are at the core of TimescaleDB, along with other native features, such as a powerful hybrid row-columnar storage engine that allows you to reduce your storage footprint by 90-95 % and always up-to-date materialized views to rev up use cases like real-time analytics.
If this piqued your interest, remember: you can now convert any PostgreSQL table with a foreign key into a hypertable without having to worry about your schema. Easy as that.
Create a free Timescale account and try it for yourself today.
Learn more about optimizing and maintaining PostgreSQL with these guides: