Understanding FILTER in PostgreSQL (With Examples)

Try for free

Start supercharging your PostgreSQL today.

An upside-down purple triangle over a black background.

Written by Team Timescale

In PostgreSQL and TimescaleDB, the FILTER clause can be used to extend aggregate functions like sum(), avg(), and count() by adding a WHERE clause. This is especially useful when you want to perform multiple aggregations in your query.

When FILTER is used with an aggregate function, only the input rows that its WHERE clause evaluates to be true will be used in the aggregate clause. The data being processed by the aggregate function is “filtered” by the WHERE clause’s condition.

Syntax:

<aggregate_function>(<expression>) FILTER(WHERE <condition>)

If you use an aggregate function with a window function call, here is the syntax:

<aggregate_function>(<expression>) FILTER(WHERE <condition>) OVER(<window_definition)

A FILTER clause can only be used with window functions that are aggregates.

Examples

Let’s look at some examples to better understand FILTER clauses. For the first set of examples, we’ll use the data in this table called car_sales.

sales_year

sales_month

make

model

kind

quantity

revenue

2021

1

Ford

F100

PickUp

40

2500000

2021

1

Ford

Mustang

Car

9

1010000

2021

1

Renault

Fuego

Car

20

9000000

2021

2

Renault

Fuego

Car

50

23000000

2021

2

Ford

F100

PickUp

20

1200000

2021

2

Ford

Mustang

Car

10

1050000

2021

3

Renault

Megane

Car

50

20000000

2021

3

Renault

Koleos

Car

15

1004000

2021

3

Ford

Mustang

Car

20

2080000

2021

4

Renault

Megane

Car

50

20000000

2021

4

Renault

Koleos

Car

15

1004000

2021

4

Ford

Mustang

Car

25

2520000

Replacing a CASE Statement With FILTER

PostgreSQL didn’t have FILTER until version 9.4. Before this version, database developers often used a CASE statement and WHEN clauses to get the results you can now get with FILTER. This method was much less straightforward.

Let’s start with a simple query to get the minimum and maximum revenue per car maker. Here is the query to get the results using the FILTER clause:

SELECT min(revenue) FILTER (WHERE make = ‘Ford’) min_ford, max(revenue) FILTER (WHERE make = ‘Ford’) max_ford, min(revenue) FILTER (WHERE make = ‘Renault’) min_renault, max(revenue) FILTER (WHERE make = ‘Renault’) max_renault FROM car_sales;

We created an alias for each of these sums with the AS keyword so that we can differentiate the results, or else each column in the result set would be labeled simply min or max. Here are the results:

min_ford

max_ford

min_renault

max_renault

1010000

2520000

1004000

23000000

To get the same results with a CASE statement, you would have to use this query:

SELECT min(CASE WHEN make = ‘Ford’ THEN revenue ELSE null END) min_ford, max(CASE WHEN make = ‘Ford’ THEN revenue ELSE null END) max_ford, min(CASE WHEN make = ‘Renault’ THEN revenue ELSE null END) min_renault, max(CASE WHEN make = ‘Renault’ THEN revenue ELSE null END) max_renault FROM car_sales;

For cases like this, FILTER is easier to understand.

Pivoting Tables With FILTER

We can use the FILTER clause to pivot rows into tables, which you often need to do to generate reports. Let’s say we want to know the total quantity of cars sold in each month. We could use GROUP BY to do that with a query like this:

SELECT sales_month, sum(quantity) FROM car_sales GROUP BY sales_month;

These are the results:

sales_month

sum

4

90

3

85

1

69

2

80

Or we could use FILTER to pivot these results with a query like this:

SELECT sum(quantity) FILTER (WHERE sales_month = 1) jan_quantity, sum(quantity) FILTER (WHERE sales_month = 2) feb_quantity, sum(quantity) FILTER (WHERE sales_month = 3) mar_quantity, sum(quantity) FILTER (WHERE sales_month = 4) apr_quantity FROM car_sales;

Results:

jan_quantity

feb_quantity

mar_quantity

apr_quantity

69

80

85

90

Using FILTER Clauses With Window Functions

For the next example, we are going to use a different data set that contains temperature and precipitation data from two cities.

day

city

temperature

precipitation

2021-09-04

Miami

68.36

0.00

2021-09-05

Miami

72.50

0.00

2021-09-01

Miami

65.30

0.28

2021-09-02

Miami

64.40

0.79

2021-09-03

Miami

71.60

0.47

2021-09-04

Atlanta

67.28

0.00

2021-09-05

Atlanta

70.80

0.00

2021-09-01

Atlanta

63.14

0.20

2021-09-02

Atlanta

62.60

0.59

2021-09-03

Atlanta

62.60

0.39

In the next query, we are going to get the three-day moving average of the temperature in each city. To do this, we will define a window using the OVER clause and partition it by the city. To show how FILTER works with the window clause, we will only return the maximum temperature when the temperature for any day in a window, including the current row plus the two rows before is over 70 degrees. In other words, it will give us the highest temperature of the last three days, inclusive of any time the temperature for the day is over 70.

Here is the query:

SELECT city, day, temperature,     MAX(temperature)     FILTER (WHERE temperature > 70)     OVER (       PARTITION BY city       ORDER BY day ASC       ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) max_3day FROM city_data ORDER BY city, day;

And here are the results:

city

day

temperature

max_3day

Atlanta

2021-09-01

63.14

Atlanta

2021-09-02

62.60

Atlanta

2021-09-03

62.60

Atlanta

2021-09-04

67.28

Atlanta

2021-09-05

70.80

70.80

Miami

2021-09-01

65.30

Miami

2021-09-02

64.40

Miami

2021-09-03

71.60

71.60

Miami

2021-09-04

68.36

71.60

Miami

2021-09-05

72.50

72.50

Next Steps

To learn more about the FILTER clause and how to use it in PostgreSQL, you can check out PostgreSQL’s documentation on aggregate expressions and window function calls.

If you want to experiment with FILTER in a PostgreSQL (but faster) database while enjoying the benefits of automatic data partitioning and columnar compression to speed up performance and reduce storage, create a free Timescale account today.