Time-Series Database: An Explainer
People create, capture, and consume more data than ever before. A large portion of this data volume is timestamped, a.k.a. time-series data.
So it’s not surprising that the time-series database (TSDB) category has seen a lot of growth in popularity in the past five years.
In this post, we’ll introduce these databases and explain how they can work for you.
- What is a time-series database?
- Is my data time-series data?
- Why businesses depend on time-series databases
- Popular time-series databases
- How do you choose one?
- Further resources
- FAQs
What Is a Time-Series Database?
A time-series database is a type of database specifically designed for handling time-stamped or time-series data. Time-series data are simply measurements or events that are tracked, monitored, downsampled, and aggregated over time.
Time-series data includes server metrics, application performance monitoring data, network data, sensor data, and more.
Since time-series data is time-centric, recent, and normally append-only, a time-series database (TSDB) leverages these foundational characteristics to store time-series data more simply and efficiently than general databases.
How Time-Series Databases Work
Time-series databases are optimized to handle data that arrives in time order. They are particularly designed to store, query, and analyze data points with a timestamp associated with them.
Time at the core
In traditional relational databases, time is often treated as just another column, but in a time-series database, time is fundamental. Every piece of data is associated with a timestamp, which enables the database to handle the special characteristics of time-series data—such as:
- High write throughput: Time-series data is often generated continuously and at a high rate. Think of sensor data from IoT devices, log data from servers, or stock market prices. TSDBs optimize for large volumes of sequential writes, making it easy to ingest high-throughput streams without bottlenecks.
- Efficient queries on time range: Since most queries on time-series data involve ranges of time (e.g., “show me the past 5 minutes of server CPU usage” or “analyze temperature trends over the past year”), TSDBs are optimized for these kinds of queries. Time-range queries are faster and more efficient compared to relational databases, where querying over time might involve scanning many unnecessary rows.
- Data retention and compression: Time-series data grows quickly, but not all historical data has the same value over time. TSDBs typically include data retention policies, automatically expiring or downsampling older data (for example, keeping only daily averages after one year). They also apply compression algorithms, which significantly reduce storage requirements.
Time-series databases vs. traditional databases
You might ask: Why can’t I just use a “normal” (i.e., non-time-series) database?
Well, you can, and some people do. However, once tables start getting big, relational databases like PostgreSQL get too slow for time-series data.
Time-series data accumulates very quickly, and relational databases are not designed to handle that scale (at least not in an automated way). Traditionally, relational databases fare poorly with vast datasets, and NoSQL databases are hailed as the best performers at scale.
But the truth is that, with some engineering fine-tuning, PostgreSQL can be turned into a time-series database—actually performing much better than other specialized solutions (as we've shown in benchmarks versus InfluxDB and MongoDB).
TSDBs—whether relational or NoSQL-based—introduce efficiencies that are only possible when the time element is treated as a first-class citizen.
These efficiencies allow them to offer massive scale, from performance improvements, including higher ingest rates and faster queries at scale (although some support more queries than others) to better data compression.
Is My Data Right for a Time-Series Database?
Even if you don't often refer to your data as time-series data, they might be. A few examples of specific time-series data types:
- Sensor data
- Transaction data / Financial transactions / Customer transactions / Order history
- Operational analytics / Application data
- Fleet data /Logistics
- Metrics data
- Tick data / Fintech data / Trading data
- Event data
- Vector data
- Weather data
- Insurance data
- Call records
If you want to know if your data is time-series data, this is the litmus test: does your data have some kind of timestamp or time element related to it, even if that may not be its main dimension? If the answer is “yes,” you’re dealing with time-series data.
Time-Series Database Use Cases
Example 1: Monitoring systems
Let’s say you’re monitoring servers for CPU and memory usage. Here’s how you might create a hypertable for storing these metrics in TimescaleDB, insert data, and run a query to analyze recent trends.
- Create a hypertable for storing time-series data:
CREATE TABLE server_metrics (
time TIMESTAMPTZ NOT NULL,
server_id TEXT NOT NULL,
cpu_usage DOUBLE PRECISION NOT NULL,
memory_usage DOUBLE PRECISION NOT NULL
);
-- Convert the table into a hypertable
SELECT create_hypertable('server_metrics', 'time');
- Insert data from various servers:
INSERT INTO server_metrics (time, server_id, cpu_usage, memory_usage)
VALUES
(NOW() - INTERVAL '5 minutes', 'server1', 0.85, 0.75),
(NOW() - INTERVAL '4 minutes', 'server2', 0.70, 0.65),
(NOW() - INTERVAL '3 minutes', 'server1', 0.90, 0.78),
(NOW() - INTERVAL '2 minutes', 'server3', 0.95, 0.80),
(NOW() - INTERVAL '1 minute', 'server2', 0.72, 0.68);
- Query recent data for CPU usage trends:
SELECT time, server_id, cpu_usage
FROM server_metrics
WHERE time > NOW() - INTERVAL '10 minutes'
ORDER BY time DESC;
This query quickly retrieves the most recent CPU usage trends, taking advantage of TimescaleDB’s time-range optimization.
- Aggregate data over a time range to detect long-term trends or anomalies:
SELECT server_id,
time_bucket('1 minute', time) AS bucket,
AVG(cpu_usage) AS avg_cpu_usage
FROM server_metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY server_id, bucket
ORDER BY bucket DESC;
This allows you to bucket and analyze CPU usage trends across servers over the past hour, giving you valuable insights into performance.
Example 2: Financial data and analytics
For financial use cases, time-series databases are vital for tracking stock prices or currency rates with millisecond precision. Here's how you might use TimescaleDB to store and analyze stock prices:
- Create a hypertable to store stock prices:
CREATE TABLE stock_prices (
time TIMESTAMPTZ NOT NULL,
stock_id TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL
);
-- Convert the table into a hypertable
SELECT create_hypertable('stock_prices', 'time');
- Insert real-time stock price data:
INSERT INTO stock_prices (time, stock_id, price)
VALUES
(NOW() - INTERVAL '5 minutes', 'AAPL', 172.50),
(NOW() - INTERVAL '4 minutes', 'GOOGL', 2850.30),
(NOW() - INTERVAL '3 minutes', 'AAPL', 173.00),
(NOW() - INTERVAL '2 minutes', 'GOOGL', 2848.90),
(NOW() - INTERVAL '1 minute', 'AAPL', 173.25);
- Query stock prices over a specific time range:
SELECT time, stock_id, price
FROM stock_prices
WHERE time > NOW() - INTERVAL '10 minutes'
ORDER BY time DESC;
- Analyze price trends by time-bucketing the data:
SELECT stock_id,
time_bucket('1 minute', time) AS bucket,
AVG(price) AS avg_price
FROM stock_prices
WHERE time > NOW() - INTERVAL '1 day'
GROUP BY stock_id, bucket
ORDER BY bucket DESC;
This query aggregates the average stock price in one-minute intervals over the past day, providing valuable insights for financial analysis.
Why Businesses Depend on Time-Series Databases
There are several reasons for the growth of time-series databases:
- The increasing number of IoT devices and applications has increased the need for databases that can store time-series data efficiently. IoT often generates high-granularity and high-volume datasets that general-purpose databases can’t handle very well.
- Most companies are investing more in data and analytics. As organizations seek to gain insights from both their historical and real-time data, the demand for time-series databases has increased.
- The rise of observability and the growing importance of monitoring your application at all times have also contributed to their popularity since this kind of data is also time-series data.
How does a time-series database help?
A time-series database helps you worry less about your database infrastructure, so you can spend more time building your application or gaining insight from the data.
You can make it simpler or faster (or both) to ingest and query data by optimizing the database and implementing shortcuts. Here are some examples.
Improved data ingestion performance
We typically generate time-series data at a high granularity. Going back to the previous finance example, there might be hundreds of data points generated every second or even more (depending on how many symbols you monitor). High volumes of data can be challenging to ingest efficiently. Time-series databases are equipped with internal optimizations like auto-partitioning and indexing, allowing you to scale up your ingestion rate.
Simplified querying
When you query time-series data, you are often interested in how the data have been changing in the past five minutes, five days, five weeks, etc. You are also more likely to want to perform time-based calculations like time-based aggregations (time bucketing), moving averages, time-weighted averages, percentiles, and so on. A time-series database has specialized features to simplify and speed up the calculation of typical time-series queries.
Store real-time and historical data in one place
The value of time-series data comes from the fact that you can compare the most current state of your system with past situations. A time-series database has the tools and scale needed to store both historical (archives) and real-time data in one data store, making it easy for you to keep an eye on what’s happening right now while also having the ability to learn from the past. This gives you the power to simplify your data infrastructure and seamlessly analyze all your data in one place.
Automated data management
Time-series databases can also automate your time-based data management tasks. For example, you might want to get rid of all data that is older than one year to save disk space and because you don’t need old data anymore. Or you still need to keep old data around, but maybe it’s not as handy as more recent data, you can set up your database to automatically compress old data to save on storage costs.
There might be other valuable automations available. For example, in TimescaleDB you can use continuous aggregates to incrementally add data to a predefined materialized view—improving query performance and developer productivity.
Top Time-Series Databases
According to DB-Engines, here are some of the top time-series databases:
The following table compares the main features of some of these databases. For a more detailed explanation and comparison of each of these TSDBs, be sure to check our article on the best time-series databases.
Tips for Choosing a Time-Series Database
Once your applications start storing time-series data, you still have to pick a TSDB that best fits your data model, write/read pattern, and developer skill sets.
When evaluating database options, consider these factors:
- Scalability. You must ensure that the database can scale vertically (adding more resources to a database node) and horizontally (adding more database nodes to your system) while remaining performant and reliable.
- Maintainability. Consider the time and effort it will take to maintain it long-term—backups, replicas, data retention, archiving, automation, etc. Consider the maintenance jobs you want to do and see if the database has the tools and features to help you.
- Reliability. You might notice that many companies in this space are developing brand-new technologies from scratch. As time-series data quickly becomes the basis of business decisions and forecasts, you need to be sure that it will be available when you need it.
- Query language. There are also quite a few time-series databases on the market with custom languages (e.g., Flux by InfluxDB). Many developers dislike this because they need to learn a new language just to use the database. And even though some of them try to look like SQL, they are not real SQL.
Get Started With Timescale
Timescale has remained focused on product stability from day one—whether you're dealing with time series or vector data. PostgreSQL is and will remain the rock-solid foundation upon which we are building a solid cloud platform for your evolving data needs.
Here are some of TimescaleDB's superpowers:
- Built on PostgreSQL: Timescale leverages the reliability and robustness of PostgreSQL, a proven and widely used relational database. We believe you should use Postgres for Everything, and we're taking that lesson to heart in our product.
- Focus on stability: Timescale maintains a stable design, reducing technical debt and ensuring reliability. Hypertables, continuous aggregates, and compression are our core features, and we have continued to build on them to provide a stable PostgreSQL platform in the cloud.
- SQL compatibility: Timescale uses standard SQL, making it easier for developers to adopt and integrate into existing systems.
- Allow developers to scale PostgreSQL: Timescale's focused approach is on helping developers stay with PostgreSQL as they scale.
If you're running your PostgreSQL database on your own hardware, you can simply add the TimescaleDB extension. If you prefer to try our mature PostgreSQL cloud in AWS, create a free Timescale Cloud account today. It only takes a couple of seconds, no credit card required.
Time-Series Database Resources
If you are interested in how other developers are using TimescaleDB for time series and analytics, check out our developer stories:
- How Octave Achieves a High Compression Ratio and Speedy Queries on Historical Data While Revolutionizing the Battery Market
- How Newtrax Is Using TimescaleDB and Hypertables to Save Lives in Mines While Optimizing Profitability
- How to Reduce Query Cost With a Wide Table Layout in TimescaleDB
If you want to keep reading about TimescaleDB, these articles will help you learn more:
- An introduction to hypertables
- More time-series data, less lines of code: meet hyperfunctions
- Using materialized views for time-series data analysis
- Allowing DML operations in highly-compressed data via Timescale compression
- Timescale Cloud performance benchmark vs. Amazon RDS PostgreSQL