In PostgreSQL and TimescaleDB, you can combine the UPDATE
statement with JOIN
operations to update records in one table based on values from another table. This feature enables database administrators and developers to efficiently modify data across related tables in a single operation, significantly simplifying complex data maintenance tasks.
While standard UPDATE
statements modify records in a single table, UPDATE
with JOIN
allows you to use data from related tables to determine which rows to update and what values to set. This capability is essential for maintaining data consistency and implementing business logic across relational databases.
UPDATE table_name
SET column1 = new_value1,
column2 = new_value2
FROM source_table
WHERE table_name.column = source_table.column;
The PostgreSQL UPDATE
with JOIN
syntax differs slightly from other database systems. Instead of using a JOIN
keyword, PostgreSQL uses a FROM clause to specify the table(s) from which to draw data for the update operation.
Let's look at some examples to better understand UPDATE
with JOIN
. For these examples, we'll use two tables from an e-commerce system: products
and inventory
.
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name TEXT NOT NULL,
price DECIMAL(10, 2),
category TEXT,
last_updated TIMESTAMPTZ,
active BOOLEAN DEFAULT TRUE
);
CREATE TABLE inventory (
inventory_id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(product_id),
warehouse_id INTEGER,
quantity INTEGER,
last_count TIMESTAMPTZ
);
Here's our sample data:
-- products table
product_id | product_name | price | category | last_updated | active
-----------+------------------+--------+-------------+----------------------+--------
1 | Basic T-shirt | 19.99 | Clothing | 2023-12-15 09:30:00 | true
2 | Premium Jeans | 59.99 | Clothing | 2023-10-05 14:20:00 | true
3 | Wireless Mouse | 29.99 | Electronics | 2023-11-10 11:45:00 | true
4 | Bluetooth Speaker| 89.99 | Electronics | 2023-12-20 08:15:00 | true
5 | Coffee Mug | 12.99 | Homeware | 2023-09-28 16:50:00 | true
-- inventory table
inventory_id | product_id | warehouse_id | quantity | last_count
-------------+------------+--------------+----------+----------------------
1 | 1 | 100 | 150 | 2024-01-05 09:00:00
2 | 1 | 200 | 75 | 2024-01-06 10:00:00
3 | 2 | 100 | 35 | 2024-01-05 09:15:00
4 | 3 | 200 | 0 | 2024-01-06 10:30:00
5 | 4 | 100 | 20 | 2024-01-05 09:45:00
6 | 5 | 200 | 65 | 2024-01-06 11:00:00
Let's say we want to mark products as inactive if they're out of stock across all warehouses. We can use UPDATE
with JOIN
to identify which products need to be deactivated:
UPDATE products
SET active = false,
last_updated = CURRENT_TIMESTAMP
FROM inventory
WHERE products.product_id = inventory.product_id
AND inventory.quantity = 0;
This query would mark any product as inactive if it has at least one inventory record with zero quantity.
Important note: When using UPDATE with FROM in PostgreSQL, if multiple rows in the joined table (inventory) match a single row in the target table (products), the target row will be updated multiple times. In this example, if a product has multiple inventory records with quantity = 0, the same product will be updated once for each matching inventory row. The final result will be the same in this case, but this behavior can cause unexpected results with more complex updates.
A more accurate approach would be to check if the total quantity across all warehouses is zero. For this, we need a subquery with aggregation:
UPDATE products
SET active = false,
last_updated = CURRENT_TIMESTAMP
WHERE product_id IN (
SELECT product_id
FROM inventory
GROUP BY product_id
HAVING SUM(quantity) = 0
);
Imagine we want to adjust prices based on sales performance by category. We might have a sales
table:
CREATE TABLE sales (
sale_id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(product_id),
quantity INTEGER,
sale_date DATE
);
-- Sample data
INSERT INTO sales (product_id, quantity, sale_date)
VALUES
(1, 25, '2024-01-15'),
(1, 30, '2024-01-16'),
(2, 10, '2024-01-15'),
(3, 5, '2024-01-16'),
(4, 8, '2024-01-15'),
(5, 12, '2024-01-16');
Now, let's update product prices based on category performance:
UPDATE products
SET price = price * CASE
WHEN category_performance.total_sales > 50 THEN 1.05 -- 5% increase for high-performing categories
WHEN category_performance.total_sales < 20 THEN 0.95 -- 5% decrease for low-performing categories
ELSE 1.0 -- No change for average performers
END,
last_updated = CURRENT_TIMESTAMP
FROM (
SELECT p.category, SUM(s.quantity) as total_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE s.sale_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY p.category
) as category_performance
WHERE products.category = category_performance.category;
UPDATE with JOIN is invaluable when synchronizing data between systems. For example, updating product information from an external catalog:
CREATE TABLE external_catalog (
product_id INTEGER,
product_name TEXT,
description TEXT,
manufacturer TEXT,
updated_at TIMESTAMPTZ
);
-- Update products with information from external catalog
UPDATE products
SET product_name = ec.product_name,
last_updated = ec.updated_at
FROM external_catalog ec
WHERE products.product_id = ec.product_id
AND ec.updated_at > products.last_updated;
This is a perfect use case for UPDATE
with JOIN
, as we expect a one-to-one relationship between the products and external_catalog
tables based on product_id
.
Combining UPDATE JOIN with CASE statements allows for sophisticated conditional logic:
UPDATE products
SET price = CASE
WHEN i.quantity < 10 THEN products.price * 1.1 -- Increase price by 10% for low stock items
WHEN i.quantity > 100 THEN products.price * 0.9 -- Discount items with high inventory
ELSE products.price -- Keep price the same for normal stock levels
END,
last_updated = CURRENT_TIMESTAMP
FROM (
SELECT product_id, SUM(quantity) as quantity
FROM inventory
GROUP BY product_id
) i
WHERE products.product_id = i.product_id;
Notice that we've used a subquery with aggregation to ensure each product is only updated once, avoiding the multiple-update issue that can occur with one-to-many relationships.
Data consistency management: Keeping related records in sync across multiple tables
Business rule implementation: Applying complex business rules that depend on data from multiple sources
Bulk data maintenance: Efficiently updating large numbers of records based on relational criteria
Data quality control: Correcting or standardizing data based on reference information
System integration: Synchronizing data between different systems or databases
Historical data processing: Updating records based on historical trends or aggregated information
Hierarchical data updates: Propagating changes up or down hierarchical data structures
Compliance and audit trail maintenance: Updating status flags based on compliance criteria
When using UPDATE
with JOIN
operations, keep these considerations in mind:
Multiple matches: With PostgreSQL's FROM
clause, if multiple rows in the source table match a single row in the target table, the target row will be updated once for each matching source row. This is a common source of bugs.
One-to-many relationships: To avoid multiple updates when joining to tables with one-to-many relationships, use a subquery with DISTINCT
or aggregate functions:
UPDATE products
SET active = false,
last_updated = CURRENT_TIMESTAMP
WHERE product_id IN (
SELECT DISTINCT product_id
FROM inventory
WHERE quantity = 0
);
Indexing: Ensure that join columns are properly indexed for optimal performance
Transaction size: Large updates may lock tables for extended periods; consider batching updates
Concurrency: Be aware of how concurrent operations may interact with your updates
RETURNING clause: Use the RETURNING
clause to verify which rows were updated:
UPDATE products
SET active = false,
last_updated = CURRENT_TIMESTAMP
FROM inventory
WHERE products.product_id = inventory.product_id
AND inventory.quantity = 0
RETURNING products.product_id, products.product_name, products.active;
Execution plan analysis: Use EXPLAIN ANALYZE to understand and optimize complex UPDATE JOIN
operations:
EXPLAIN ANALYZE
UPDATE products
SET active = false
FROM inventory
WHERE products.product_id = inventory.product_id
AND inventory.quantity = 0;
Check out PostgreSQL's documentation on data manipulation and query optimization (we have some advice on this topic, too) to learn more about UPDATE with JOIN operations and how to optimize them for your database workloads. To learn more, check out our articles on PostgreSQL operations.
If you're working with large datasets and need high-performance storage with built-in data retention policies, compression, and continuous aggregations, create a free Timescale account today. You can also install TimescaleDB on your machine.