Real-time Analytics in Postgres: Why It's Hard (and How to Solve It)

Written by Carlota Soto, Mat Arye, and Doug Ortiz

Applications today rely on real-time data analytics to power interactive dashboards, custom reports, and data exploration. Picture your favorite SaaS platform. You’ll most likely want to see both live data and historical trends at a glance. And you’ll need fast access to data for analyzing metrics, building reports, and exploring patterns over time.

This means that to avoid lag, the database powering such applications has to be swift at handling high-throughput data ingestion while maintaining fast query performance on fresh data, including complex analytical queries. While PostgreSQL is tremendously popular among developers, its relational nature wasn't originally built for these workloads. 🫠

But don’t worry: PostgreSQL always has a tool in its ecosystem. In this case, it’s Timescale. While you could use specialized analytics databases, this adds complexity to your stack—Timescale lets you handle both transactions and analytics in your existing PostgreSQL database, eliminating the need for ETL (extract-transform-load) pipelines and multiple systems.

In this blog post, we explain how Timescale’s unique approach builds on PostgreSQL, enabling it to handle the challenges of live data. After reading this article, if you’re curious about this hybrid alternative, you can try Timescale for free.

Hypercore: Hybrid Storage in PostgreSQL for Real-Time Analytics

Real-time analytics require columnar storage for fast query performance on large datasets, but PostgreSQL only provides row-oriented storage by default (we explain the differences between these two database structures here). While row storage excels at transactional workloads, it isn't optimized for analytical queries that need to scan millions of rows.

Timescale's hypercore addresses this by adding a columnar storage engine to PostgreSQL. 🚀 This brings key columnstore capabilities essential for real-time analytics:

  • Columnar data organization lets queries read only the columns they need instead of scanning entire rows. When analyzing millions of sensor readings, queries can access just the timestamp and value columns, dramatically reducing I/O compared to row storage.

  • Skip indexes accelerate queries by storing metadata like minimum and maximum values for each data block. For example, when querying orders with an ID > 10,000, the engine can instantly skip blocks where the maximum ID is ≤ 10,000. These indexes work with columnstore data, enabling chunk exclusion that prunes irrelevant data blocks before processing begins.

  • Vectorized query execution uses SIMD (Single Instruction, Multiple Data) to process multiple data points simultaneously. By leveraging modern CPU capabilities to handle operations on multiple values in a single instruction, vectorization dramatically speeds up compression, scanning, filtering, and aggregation operations.

  • Smart compression reduces storage costs significantly since similar column values are stored together. Temperature readings or timestamps, for example, compress much more efficiently in columnar format than when spread across rows.

The result is columnar storage performance for analytical queries while maintaining full PostgreSQL compatibility and transactional semantics in a simplified stack—no need to maintain separate specialized databases for different workloads.

How hypercore works

Hypercore uses row-oriented storage for recent data, ensuring fast inserts and real-time query performance. This means your dashboards stay responsive, your alerts fire instantly, and your applications can efficiently process incoming data. 

As data ages, it automatically moves to columnar storage, enabling efficient compression and fast analytical queries over historical data. This transition is transparent—your queries remain unchanged while gaining the performance benefits of both storage models—and automatic, requiring no action on your end.

Plus, hypercore works seamlessly with Timescale's automatic partitioning and materialized views, requiring no additional management overhead. Learn more about hypercore in our dedicated blog post.

Hypertables: Smart Partitioning for Fast Writes and Queries

Hypertables are Timescale's automatically partitioned tables. Working alongside hypercore, they help solve one of the biggest challenges in real-time analytics: handling high-volume data ingestion while maintaining query performance. 

This is what hypertables can deliver to your real-time analytics and other demanding workloads:

  • High-speed ingestion: Write millions of rows per second without performance degradation.

  • Automatic time partitioning: Tables are automatically split into chunks based on time ranges (and optionally space), optimizing both write and read performance.

  • Clever query planning: Queries automatically skip irrelevant time chunks, dramatically speeding up real-time analytics.

  • Parallel query execution: Multiple chunks can be processed simultaneously, accelerating complex analytical queries.

  • Built-in data lifecycle management: Easily archive or delete old data while keeping recent data readily available.

For example, when ingesting IoT sensor data or financial transactions, hypertables automatically partition your data into manageable chunks. You can write millions of rows per second while maintaining fast query performance, as queries only scan relevant partitions. 
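
As a rough sketch of the partitioning and lifecycle features listed above, the statements below create a hypertable with an explicit chunk interval, attach a retention policy, and inspect a query plan. The `sensor_readings` table, its columns, the one-day chunk interval, and the 90-day retention window are illustrative assumptions, not values taken from this article.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE sensor_readings (
    time      TIMESTAMPTZ      NOT NULL,
    sensor_id INTEGER          NOT NULL,
    value     DOUBLE PRECISION
);

-- Partition by time into one-day chunks (interval chosen arbitrarily here)
SELECT create_hypertable('sensor_readings', 'time',
                         chunk_time_interval => INTERVAL '1 day');

-- Drop chunks once all of their data is older than 90 days
SELECT add_retention_policy('sensor_readings', INTERVAL '90 days');

-- Once data is loaded, the plan should touch only chunks
-- that overlap the last hour
EXPLAIN
SELECT sensor_id, avg(value)
FROM sensor_readings
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY sensor_id;
```

The right chunk interval depends on your ingest rate and available memory, so treat the value above as a starting point to tune rather than a recommendation.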

See how to process billions of rows in PostgreSQL.

We've tackled partitioning. Now, let’s talk about fast data aggregations.

Real-Time Analytics: Limitations of PostgreSQL Materialized Views

PostgreSQL is not exactly known for being fast at querying large volumes of data, but it has some tricks up its sleeve. One of the best is materialized views, but they come with limitations that make them impractical for real-time analytics.

But first things first:

What are materialized views?

Materialized views effectively reduce the granularity of large datasets, keeping queries faster

PostgreSQL materialized views pre-compute commonly run queries and store the results as a table. They are a great way to optimize query responses for resource-intensive queries, such as real-time analytics queries that involve processing large volumes of data, aggregations, or multiple joins.

Creating a materialized view is simple: use the CREATE MATERIALIZED VIEW statement and your query of choice. Once you’ve created the materialized view, you can query it as a regular PostgreSQL table:

CREATE MATERIALIZED VIEW customer_orders AS
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id;

-- Query the materialized view
SELECT * FROM customer_orders;

Materialized views may be easy to create and query, but there’s a catch.

The challenges of materialized views for real-time data

A materialized view becomes stale as soon as the underlying data changes, and it stays stale until you refresh it. When you add, update, or delete data in the base table, the materialized view doesn’t automatically include those changes: it’s a snapshot of the query results at the time it was created or last refreshed. To update the materialized view, you need to run a refresh:

REFRESH MATERIALIZED VIEW customer_orders;
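
A plain refresh also blocks reads on the view while it runs. PostgreSQL offers a `CONCURRENTLY` option that avoids this, sketched below for the `customer_orders` view. The index name is a hypothetical choice, and the example relies on `customer_id` being unique in the view, which holds here because the view groups by it.

```sql
-- CONCURRENTLY requires a unique index on the materialized view.
-- customer_id is unique here because the view groups by it.
CREATE UNIQUE INDEX customer_orders_customer_id_idx
    ON customer_orders (customer_id);

-- Still recomputes the full query, but readers are not blocked meanwhile
REFRESH MATERIALIZED VIEW CONCURRENTLY customer_orders;
```

Note that this only addresses locking: the refresh still recomputes the entire result set and still has to be triggered by you or an external scheduler.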

So, while PostgreSQL materialized views speed up query response times, they come with limitations that make them impractical for real-time analytics:

1. Inefficient refreshes that recompute the entire dataset

2. No automatic refresh functionality

3. Results become stale between refreshes

Should you discard materialized views if you’re building a SaaS platform from a live dataset, with new data frequently coming in? The answer is no. At Timescale, we built a solution on top of PostgreSQL that effectively enhances materialized views to make them more suitable for today’s applications: continuous aggregates.

Materialized views with automatic refreshes for real-time analytics 

Continuous aggregates solve the limitations of PostgreSQL materialized views for real-time analytics by providing the following:

  • Automatic, incremental refreshes that only process new or changed data

  • Real-time query results that combine materialized and fresh data

  • Simple refresh policies configured within the database

Creating a continuous aggregate is very similar to creating a materialized view (and it can also be queried as a regular PostgreSQL table):

CREATE MATERIALIZED VIEW hourly_sales WITH (timescaledb.continuous) AS SELECT time_bucket(INTERVAL '1 hour', sale_time) as hour, product_id, SUM(units_sold) as total_units_sold FROM sales_data GROUP BY hour, product_id;

But unlike materialized views, creating a refresh policy is straightforward. The following example sets up a refresh policy to update the continuous aggregate every 30 minutes. The start_offset and end_offset parameters define the time window of data to be refreshed, relative to the time the job runs, and schedule_interval sets how often the continuous aggregate is refreshed:

-- Setting up a refresh policy
-- start_offset is a required argument; '1 day' is an illustrative value
SELECT add_continuous_aggregate_policy('hourly_sales',
    start_offset => INTERVAL '1 day',
    end_offset => INTERVAL '1 minute',
    schedule_interval => INTERVAL '30 minutes');

You can now keep all your pre-computed queries running seamlessly and delivering always up-to-date results.
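
The refresh policy runs as a background job, but you can also refresh a specific window by hand, for example after backfilling old data. The sketch below assumes the `hourly_sales` continuous aggregate from above; the date range is purely illustrative.

```sql
-- Manually refresh one window (dates are illustrative)
CALL refresh_continuous_aggregate('hourly_sales', '2024-01-01', '2024-02-01');

-- List the background jobs that refresh continuous aggregates
SELECT job_id, proc_name, schedule_interval, next_start
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_refresh_continuous_aggregate';
```

Only buckets whose underlying data changed since the last refresh are recomputed, so a wide window is usually cheaper than it looks.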

Using Enhanced PostgreSQL for Real-Time Analytics: An Example

Let's use a real-world example to see how Timescale’s features equip PostgreSQL for real-time analytics. We’ll look at an IIoT use case of a factory monitoring system that tracks equipment sensors.

Create and query a hypertable

-- 1. HYPERTABLES: Automatically partitioned tables for fast data ingestion
CREATE TABLE equipment_metrics (
    time            TIMESTAMPTZ      NOT NULL,
    equipment_id    INTEGER          NOT NULL,
    sensor_type     TEXT             NOT NULL,
    reading         DOUBLE PRECISION NOT NULL,
    status          TEXT             NOT NULL,
    location        TEXT             NOT NULL,
    PRIMARY KEY (time, equipment_id)
);

-- Convert to hypertable
SELECT create_hypertable('equipment_metrics', 'time');

-- Create index for faster filtering
CREATE INDEX equipment_status_idx
    ON equipment_metrics (equipment_id, status, time DESC);

-- Insert sample data
INSERT INTO equipment_metrics VALUES
    (NOW(), 1, 'temperature', 85.5, 'active', 'zone_a'),
    (NOW(), 2, 'pressure', 102.3, 'warning', 'zone_b'),
    (NOW(), 3, 'vibration', 0.15, 'active', 'zone_a');
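
With data in place, you query the hypertable like any other PostgreSQL table. Here is a minimal sketch against `equipment_metrics`: the first query fetches the latest reading per piece of equipment, and the second lists the chunks backing the hypertable. The 15-minute window is an arbitrary choice for illustration.

```sql
-- Latest reading per piece of equipment in the last 15 minutes
SELECT DISTINCT ON (equipment_id)
    equipment_id, sensor_type, reading, status, time
FROM equipment_metrics
WHERE time > NOW() - INTERVAL '15 minutes'
ORDER BY equipment_id, time DESC;

-- List the chunks that back the hypertable
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'equipment_metrics'
ORDER BY range_start;
```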

Hypercore

-- 2. HYPERCORE: Enable columnar compression for better storage and query performance
ALTER TABLE equipment_metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'equipment_id,sensor_type,status,location',
    timescaledb.compress_orderby = 'time DESC'
);

-- Enable compression policy: compress chunks older than 7 days
SELECT add_compression_policy('equipment_metrics', INTERVAL '7 days');

-- Queries automatically use columnar storage for historical data
SELECT
    equipment_id,
    time_bucket('1 hour', time) AS hour,
    avg(reading) AS avg_reading
FROM equipment_metrics
WHERE time > NOW() - INTERVAL '30 days'
GROUP BY equipment_id, hour
ORDER BY hour DESC;
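
To check what compression is doing for you, something like the following may help. It compresses eligible chunks right away instead of waiting for the policy job, then compares sizes before and after. This is a sketch against the `equipment_metrics` hypertable above; the column aliases are our own, and the sample rows inserted earlier are too recent to qualify, so expect meaningful numbers only once older data exists.

```sql
-- Compress eligible chunks now instead of waiting for the policy job
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('equipment_metrics', older_than => INTERVAL '7 days') AS c;

-- Compare on-disk size before and after compression
SELECT
    number_compressed_chunks,
    pg_size_pretty(before_compression_total_bytes) AS size_before,
    pg_size_pretty(after_compression_total_bytes)  AS size_after
FROM hypertable_compression_stats('equipment_metrics');
```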

Create a continuous aggregate

-- 3. CONTINUOUS AGGREGATES: Pre-compute common aggregations for fast access
-- Create hourly metrics rollup
CREATE MATERIALIZED VIEW hourly_equipment_stats
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    equipment_id,
    sensor_type,
    avg(reading) AS avg_reading,
    min(reading) AS min_reading,
    max(reading) AS max_reading,
    count(*) AS reading_count
FROM equipment_metrics
GROUP BY bucket, equipment_id, sensor_type;

-- Add refresh policy to update every hour
-- The refresh window (start_offset minus end_offset) must span at least two buckets
SELECT add_continuous_aggregate_policy('hourly_equipment_stats',
    start_offset => INTERVAL '3 hours',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

-- Query the continuous aggregate instead of raw data
SELECT
    bucket,
    equipment_id,
    avg_reading,
    reading_count
FROM hourly_equipment_stats
WHERE bucket > NOW() - INTERVAL '24 hours'
  AND sensor_type = 'temperature'
ORDER BY bucket DESC;

-- Real-time query combining recent raw data with pre-computed aggregates
SELECT
    equipment_id,
    sensor_type,
    avg_reading,
    reading_count
FROM hourly_equipment_stats
WHERE bucket > NOW() - INTERVAL '7 days'
  AND avg_reading > 100  -- Find high readings
ORDER BY bucket DESC;
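
Whether a continuous aggregate blends in not-yet-materialized rows at query time is controlled by a view option, and the default can differ between TimescaleDB versions. If the last query returns only materialized buckets, a sketch like the following should turn real-time aggregation on for `hourly_equipment_stats` and confirm the setting.

```sql
-- Include not-yet-materialized rows at query time (real-time aggregation)
ALTER MATERIALIZED VIEW hourly_equipment_stats
    SET (timescaledb.materialized_only = false);

-- Confirm the setting
SELECT view_name, materialized_only
FROM timescaledb_information.continuous_aggregates
WHERE view_name = 'hourly_equipment_stats';
```

The trade-off is that each query does some extra work on the most recent raw data, so dashboards that can tolerate slightly older results may prefer to leave the option off.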

Let’s Recap: Why Choose Timescale for Real-Time Analytics?

If your application relies on real-time analytics for dashboards, alerts, or monitoring, you don't just need fast queries—you need a database that handles transactions and analytics together. Timescale is built on PostgreSQL and gives you everything you need to work with live data. You can ingest, update, and query in seconds, all in one system, so you spend less time managing your stack and more time building features.

“By using TimescaleDB, we have significantly improved our data management efficiency, achieving query times of 25 milliseconds, reducing our database size from 200 GB to 55 GB with compression, and maintaining high insertion rates.” Stanislav Karpov, Head of Data Platform Engineering at Palas

Remember: Real-time OLAP isn't a solution—it's a patch for outdated databases. Today's applications need real-time analytics and transactions in one system. Why maintain multiple databases and complex ETL pipelines when one database can do it all? Timescale has been doing both for years—fast, reliable, and all in PostgreSQL.

If you're running PostgreSQL on your hardware, install the TimescaleDB extension. If you're looking for the benefits of a modern PostgreSQL cloud data platform, sign up for Timescale Cloud (30-day free trial, no credit card required).