Time-Series Database: An Explainer

Three developers looking at several data visualizations powered by a time-series database

Posted by

Attila Toth

People create, capture, and consume more data than ever before. A large portion of this data volume is timestamped, a.k.a. time-series data.

So it’s not surprising that the time-series database (TSDB) category has seen a lot of growth in popularity in the past five years.

In this post, we’ll introduce these databases and explain how they can work for you.

What Is a Time-Series Database?

A time-series database is a type of database specifically designed for handling time-stamped or time-series data. Time-series data are simply measurements or events that are tracked, monitored, downsampled, and aggregated over time.

Time-series data includes server metrics, application performance monitoring data, network data, sensor data, and more.

Financial/crypto data is a perfect example of time-series data

Since time-series data is time-centric, recent, and normally append-only, a time-series database (TSDB) leverages these foundational characteristics to store time-series data more simply and efficiently than general databases.

The traditional time-series management cycle

How Time-Series Databases Work

Time-series databases are optimized to handle data that arrives in time order. They are particularly designed to store, query, and analyze data points with a timestamp associated with them.

Time at the core

In traditional relational databases, time is often treated as just another column, but in a time-series database, time is fundamental. Every piece of data is associated with a timestamp, which enables the database to handle the special characteristics of time-series data—such as:

High write throughput: Time-series data is often generated continuously and at a high rate. Think of sensor data from IoT devices, log data from servers, or stock market prices. TSDBs optimize for large volumes of sequential writes, making it easy to ingest high-throughput streams without bottlenecks.
Efficient queries on time range: Since most queries on time-series data involve ranges of time (e.g., “show me the past 5 minutes of server CPU usage” or “analyze temperature trends over the past year”), TSDBs are optimized for these kinds of queries. Time-range queries are faster and more efficient compared to relational databases, where querying over time might involve scanning many unnecessary rows.
Data retention and compression: Time-series data grows quickly, but not all historical data has the same value over time. TSDBs typically include data retention policies, automatically expiring or downsampling older data (for example, keeping only daily averages after one year). They also apply compression algorithms, which significantly reduce storage requirements.

Time-series databases vs. traditional databases

You might ask: Why can’t I just use a “normal” (i.e., non-time-series) database?

Well, you can, and some people do. However, once tables start getting big, relational databases like PostgreSQL get too slow for time-series data.

Time-series data accumulates very quickly, and relational databases are not designed to handle that scale (at least not in an automated way). Traditionally, relational databases fare poorly with vast datasets, and NoSQL databases are hailed as the best performers at scale.

But the truth is that, with some engineering fine-tuning, PostgreSQL can be turned into a time-series database—actually performing much better than other specialized solutions (as we've shown in benchmarks versus InfluxDB and MongoDB).

TSDBs—whether relational or NoSQL-based—introduce efficiencies that are only possible when the time element is treated as a first-class citizen.

These efficiencies allow them to offer massive scale, from performance improvements, including higher ingest rates and faster queries at scale (although some support more queries than others) to better data compression.

Is My Data Right for a Time-Series Database?

Even if you don't often refer to your data as time-series data, they might be. A few examples of specific time-series data types:

Sensor data
Transaction data / Financial transactions / Customer transactions / Order history
Operational analytics / Application data
Fleet data /Logistics
Metrics data
Tick data / Fintech data / Trading data
Event data
Vector data
Weather data
Insurance data
Call records

If you want to know if your data is time-series data, this is the litmus test: does your data have some kind of timestamp or time element related to it, even if that may not be its main dimension? If the answer is “yes,” you’re dealing with time-series data.

Time-Series Database Use Cases

Example 1: Monitoring systems

Let’s say you’re monitoring servers for CPU and memory usage. Here’s how you might create a hypertable for storing these metrics in TimescaleDB, insert data, and run a query to analyze recent trends.

Create a hypertable for storing time-series data:

CREATE TABLE server_metrics (
    time        TIMESTAMPTZ       NOT NULL,
    server_id   TEXT              NOT NULL,
    cpu_usage   DOUBLE PRECISION  NOT NULL,
    memory_usage DOUBLE PRECISION NOT NULL
);

-- Convert the table into a hypertable
SELECT create_hypertable('server_metrics', 'time');

Insert data from various servers:

INSERT INTO server_metrics (time, server_id, cpu_usage, memory_usage)
VALUES
  (NOW() - INTERVAL '5 minutes', 'server1', 0.85, 0.75),
  (NOW() - INTERVAL '4 minutes', 'server2', 0.70, 0.65),
  (NOW() - INTERVAL '3 minutes', 'server1', 0.90, 0.78),
  (NOW() - INTERVAL '2 minutes', 'server3', 0.95, 0.80),
  (NOW() - INTERVAL '1 minute', 'server2', 0.72, 0.68);

Query recent data for CPU usage trends:

SELECT time, server_id, cpu_usage
FROM server_metrics
WHERE time > NOW() - INTERVAL '10 minutes'
ORDER BY time DESC;

This query quickly retrieves the most recent CPU usage trends, taking advantage of TimescaleDB’s time-range optimization.

Aggregate data over a time range to detect long-term trends or anomalies:

SELECT server_id,
       time_bucket('1 minute', time) AS bucket,
       AVG(cpu_usage) AS avg_cpu_usage
FROM server_metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY server_id, bucket
ORDER BY bucket DESC;

This allows you to bucket and analyze CPU usage trends across servers over the past hour, giving you valuable insights into performance.

Example 2: Financial data and analytics

For financial use cases, time-series databases are vital for tracking stock prices or currency rates with millisecond precision. Here's how you might use TimescaleDB to store and analyze stock prices:

Create a hypertable to store stock prices:

CREATE TABLE stock_prices (
    time       TIMESTAMPTZ       NOT NULL,
    stock_id   TEXT              NOT NULL,
    price      NUMERIC(10, 2)    NOT NULL
);

-- Convert the table into a hypertable
SELECT create_hypertable('stock_prices', 'time');

Insert real-time stock price data:

INSERT INTO stock_prices (time, stock_id, price)
VALUES
  (NOW() - INTERVAL '5 minutes', 'AAPL', 172.50),
  (NOW() - INTERVAL '4 minutes', 'GOOGL', 2850.30),
  (NOW() - INTERVAL '3 minutes', 'AAPL', 173.00),
  (NOW() - INTERVAL '2 minutes', 'GOOGL', 2848.90),
  (NOW() - INTERVAL '1 minute', 'AAPL', 173.25);

Query stock prices over a specific time range:

SELECT time, stock_id, price
FROM stock_prices
WHERE time > NOW() - INTERVAL '10 minutes'
ORDER BY time DESC;

Analyze price trends by time-bucketing the data:

SELECT stock_id,
       time_bucket('1 minute', time) AS bucket,
       AVG(price) AS avg_price
FROM stock_prices
WHERE time > NOW() - INTERVAL '1 day'
GROUP BY stock_id, bucket
ORDER BY bucket DESC;

This query aggregates the average stock price in one-minute intervals over the past day, providing valuable insights for financial analysis.

Why Businesses Depend on Time-Series Databases

There are several reasons for the growth of time-series databases:

The increasing number of IoT devices and applications has increased the need for databases that can store time-series data efficiently. IoT often generates high-granularity and high-volume datasets that general-purpose databases can’t handle very well.
Most companies are investing more in data and analytics. As organizations seek to gain insights from both their historical and real-time data, the demand for time-series databases has increased.
The rise of observability and the growing importance of monitoring your application at all times have also contributed to their popularity since this kind of data is also time-series data.

How does a time-series database help?

A time-series database helps you worry less about your database infrastructure, so you can spend more time building your application or gaining insight from the data.

You can make it simpler or faster (or both) to ingest and query data by optimizing the database and implementing shortcuts. Here are some examples.

Improved data ingestion performance

We typically generate time-series data at a high granularity. Going back to the previous finance example, there might be hundreds of data points generated every second or even more (depending on how many symbols you monitor). High volumes of data can be challenging to ingest efficiently. Time-series databases are equipped with internal optimizations like auto-partitioning and indexing, allowing you to scale up your ingestion rate.

Simplified querying

When you query time-series data, you are often interested in how the data have been changing in the past five minutes, five days, five weeks, etc. You are also more likely to want to perform time-based calculations like time-based aggregations (time bucketing), moving averages, time-weighted averages, percentiles, and so on. A time-series database has specialized features to simplify and speed up the calculation of typical time-series queries.

time	symbol	price
2023-02-06 15:05:26	PFE	44.19
2023-02-06 15:05:25	WMT	141.27
2023-02-06 15:05:24	KO	59.67
2023-02-06 15:05:24	TSLA	194.08
2023-02-06 15:05:24	SNAP	10.88

Store real-time and historical data in one place

The value of time-series data comes from the fact that you can compare the most current state of your system with past situations. A time-series database has the tools and scale needed to store both historical (archives) and real-time data in one data store, making it easy for you to keep an eye on what’s happening right now while also having the ability to learn from the past. This gives you the power to simplify your data infrastructure and seamlessly analyze all your data in one place.

Automated data management

Time-series databases can also automate your time-based data management tasks. For example, you might want to get rid of all data that is older than one year to save disk space and because you don’t need old data anymore. Or you still need to keep old data around, but maybe it’s not as handy as more recent data, you can set up your database to automatically compress old data to save on storage costs.

There might be other valuable automations available. For example, in TimescaleDB you can use continuous aggregates to incrementally add data to a predefined materialized view—improving query performance and developer productivity.

Top Time-Series Databases

According to DB-Engines, here are some of the top time-series databases:

The following table compares the main features of some of these databases. For a more detailed explanation and comparison of each of these TSDBs, be sure to check our article on the best time-series databases.

	Database Model	Scalability	Deployment	Query Language	Pricing Models
TimescaleDB	Relational database	Vertically scalable, with automatic partitioning, columnar compression, optimized queries, and automatically updated materialized views	Self-managed or managed cloud service	SQL	TimescaleDB open-source can be self-hosted and is free Timescale for AWS follows a pay-as-you-go model
InfluxDB	Custom, non-relational NoSQL, columnar database	Horizontally scalable	Self-managed or managed cloud service	SQL, InfluxQL, Flux	InfluxDB Open Source is free to use, and InfluxDB Cloud Serverless is a managed service with a pay-as-you-go pricing model. InfluxDB Cloud Dedicated and InfluxDB Clustered are for high-volume production workloads with costs depending on storage, CPU, and RAM
Prometheus	Pull-based model that scrapes metrics from targets	Scalable vertically or through federation	Deployed as single binary on server or on container platforms such as Kubernetes	PromQL	Open-source: no associated license cost
Kdb+	Columnar database with a custom data model	Horizontally scalable with multi-node support and multi-threading	On-premises, in the cloud, or hybrid	Q language	Free 32-bit version for non-commercial purposes For commercial deployments, pricing depends on the deployment model & number of cores
Graphite	Whisper (file-based time series) database format	Horizontally scalable, supports replication and clustering	On-premises or in the cloud	GQL	Open-source: no associated licensing cost
ClickHouse	Columnar database	Horizontally scalable	On-premises or in the cloud Also available as a managed service	SQL-based declarative query language, mostly identical to ANSI SQL standard	ClickHouse is open-source and doesn’t have an associated license cost ClickHouse Cloud follows a pay-as-you-go pricing model
MongoDB	No-SQL database with a JSON-like document model	Horizontally scalable - supports automatic load balancing, data sharding, and replication	Self-managed or managed cloud service	MQL (MongoDB Query Language)	MongoDB Community Edition is open-source and free to use MongoDB Enterprise: pricing depends on the features you choose MongoDB Atlas has a pay-as-you-go pricing model

Tips for Choosing a Time-Series Database

Once your applications start storing time-series data, you still have to pick a TSDB that best fits your data model, write/read pattern, and developer skill sets.

When evaluating database options, consider these factors:

Scalability. You must ensure that the database can scale vertically (adding more resources to a database node) and horizontally (adding more database nodes to your system) while remaining performant and reliable.
Maintainability. Consider the time and effort it will take to maintain it long-term—backups, replicas, data retention, archiving, automation, etc. Consider the maintenance jobs you want to do and see if the database has the tools and features to help you.
Reliability. You might notice that many companies in this space are developing brand-new technologies from scratch. As time-series data quickly becomes the basis of business decisions and forecasts, you need to be sure that it will be available when you need it.
Query language. There are also quite a few time-series databases on the market with custom languages (e.g., Flux by InfluxDB). Many developers dislike this because they need to learn a new language just to use the database. And even though some of them try to look like SQL, they are not real SQL.

Get Started With Timescale

Timescale has remained focused on product stability from day one—whether you're dealing with time series or vector data. PostgreSQL is and will remain the rock-solid foundation upon which we are building a solid cloud platform for your evolving data needs.

Here are some of TimescaleDB's superpowers:

Built on PostgreSQL: Timescale leverages the reliability and robustness of PostgreSQL, a proven and widely used relational database. We believe you should use Postgres for Everything, and we're taking that lesson to heart in our product.
Focus on stability: Timescale maintains a stable design, reducing technical debt and ensuring reliability. Hypertables, continuous aggregates, and compression are our core features, and we have continued to build on them to provide a stable PostgreSQL platform in the cloud.
SQL compatibility: Timescale uses standard SQL, making it easier for developers to adopt and integrate into existing systems.
Allow developers to scale PostgreSQL: Timescale's focused approach is on helping developers stay with PostgreSQL as they scale.

If you're running your PostgreSQL database on your own hardware, you can simply add the TimescaleDB extension. If you prefer to try our mature PostgreSQL cloud in AWS, create a free Timescale Cloud account today. It only takes a couple of seconds, no credit card required.

Time-Series Database Resources

If you are interested in how other developers are using TimescaleDB for time series and analytics, check out our developer stories:

If you want to keep reading about TimescaleDB, these articles will help you learn more:

Originally posted

Feb 10, 2023

Last updated

Oct 21, 2024

Posted by

Attila Toth

Originally posted

Feb 10, 2023

Last updated

Oct 21, 2024

Posted by

Attila Toth