Time-Series Forecasting With TimescaleDB and Prophet
Time-series forecasting is a cornerstone of data analysis for organizations, developers, and analysts looking to better understand the world around them through data. It enables the prediction of stock market trends, product demand, or even climate patterns, although the accuracy of any forecast depends on the quality of the data and the model. With the right set of tools, it is straightforward to implement.
In this tutorial, we’ll explore time-series forecasting with TimescaleDB and Prophet: two tools that, when brought together, can simplify and enable efficient data analysis. But these tools are not just about data analytics. They allow you to understand the when and the why, adding color to your time-series analysis.
You can find all the steps and Python scripts used in this blog post on GitHub.
Why is time-series forecasting important
Time-series forecasting involves applying statistical methods to historical data to predict future values. These analyses unravel data patterns, trends, and seasonality to extract valuable foresight. It is widely used in many domains, such as finance, economics, weather forecasting, sales forecasting, and inventory management.
Time-series forecasting is about understanding the data at hand, and it is more effective when handled with the right tools. It's about learning to listen to your data's hints, understanding its story, and predicting what will come next.
This is why it is essential to equip yourself with the proper forecasting tools capable of dissecting complex data to illuminate obscured patterns. In the next section, we’ll quickly introduce the forecasting tools we’ll be using in this tutorial before we show you how to use them.
Using a Time-Series Database for Forecasting
Think about managing a dataset that holds a global enterprise’s financial data for years, with new records flowing in every second. Traditional databases will struggle to handle the speed and scale of data pouring in—but not TimescaleDB.
TimescaleDB is a time-series database built for rapidly ingesting massive quantities of data with complex access patterns for forecasting-type operations. It is based on PostgreSQL and works like PostgreSQL, making the most of its reliability and rich ecosystem of connectors and tools. Here’s how it supports time-series forecasting:
Scalability
One of the key features of TimescaleDB is that it can efficiently handle billions of rows of data with the help of hypertables, which automatically partition your tables by time and speed up queries. Then, through columnar compression, Timescale helps you manage the data deluge, allowing you to save on storage. If you need further scalability, Timescale has a trick up its sleeve as part of its backend architecture: a low-cost storage tier where you can keep your older, infrequently accessed data while still being able to query it.
Continuous aggregates
TimescaleDB's continuous aggregates are an automatic and incrementally updated version of Postgres materialized views, speeding up queries and reducing the time spent analyzing time-series data.
Because the aggregated results are precomputed and stored, continuous aggregates can power real-time, user-facing dashboards and analytics in your application. This substantially shortens query times for some of the most common analytical operations, such as getting daily averages or monthly totals.
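To make the idea concrete, here is a minimal sketch of how a daily continuous aggregate and its refresh policy might be defined. The table name (sales), its columns (time, amount), the view name (daily_sales), and the refresh intervals are all assumptions chosen for illustration, not part of the original tutorial. The helper only builds the SQL strings; you would execute them through your own database connection.

```python
def daily_sales_cagg_sql(view="daily_sales", table="sales"):
    """Build SQL for a daily continuous aggregate and its refresh policy.

    `view`, `table`, and the `time`/`amount` columns are hypothetical names.
    They are interpolated as identifiers, so pass trusted constants only.
    """
    create_view = (
        f"CREATE MATERIALIZED VIEW {view}\n"
        "WITH (timescaledb.continuous) AS\n"
        "SELECT time_bucket('1 day', time) AS bucket,\n"
        "       avg(amount) AS avg_amount,\n"
        "       sum(amount) AS total_amount\n"
        f"FROM {table}\n"
        "GROUP BY bucket;"
    )
    # Refresh the recent part of the view on a schedule (intervals are examples).
    refresh_policy = (
        f"SELECT add_continuous_aggregate_policy('{view}',\n"
        "    start_offset => INTERVAL '3 days',\n"
        "    end_offset => INTERVAL '1 hour',\n"
        "    schedule_interval => INTERVAL '1 hour');"
    )
    return create_view, refresh_policy
```

Note that creating a continuous aggregate generally cannot happen inside a transaction block, so with a driver such as psycopg2 you would likely need to enable autocommit on the connection before executing the first statement.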
Data retention policies
Keeping historical data can be complex because of regulatory requirements or storage limits. TimescaleDB simplifies this with customizable data retention policies that automatically drop data older than a given age. This keeps storage usage in check and helps queries return relevant information quickly.
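As a rough sketch, a retention policy can be scheduled with a single call to TimescaleDB's add_retention_policy function. The helper below assumes a DB-API connection (for example, one returned by psycopg2.connect) and a hypothetical hypertable name; the 12-month window is only an example value.

```python
def set_retention(conn, table, keep="12 months"):
    """Schedule automatic removal of data older than `keep` from a hypertable.

    `conn` is assumed to be a DB-API connection (e.g. from psycopg2);
    `table` is a hypothetical hypertable name such as "sales".
    """
    with conn.cursor() as cur:
        cur.execute(
            "SELECT add_retention_policy(%s::regclass, %s::interval);",
            (table, keep),
        )
    conn.commit()
```

A call such as set_retention(conn, "sales", "6 months") would then let TimescaleDB drop chunks older than six months in the background.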
Full SQL support
TimescaleDB offers full SQL support for writing and evaluating complex queries. SQL is the lingua franca for data analysis, making it easier to interact with your database—no learning curve involved.
Prophet
Prophet is a forecasting tool developed by Facebook to be easy to use without specialist knowledge in time-series forecasting. It works well for daily observations that exhibit patterns on different time scales and is suitable for a wide range of applications. Here's what makes Prophet special:
Automatic detection of trends and seasonality
One of Prophet's key features is automatically detecting trends and seasonality from time-series data. It uses an additive, decomposable model that splits the series into trend, seasonality, and holiday components.
By default, Prophet models the trend as a piecewise linear function. It places a number of potential changepoints along the history and lets the slope change at each of them, which allows it to capture shifts in the growth rate over time.
The seasonality component is modeled with Fourier series. Prophet has built-in yearly, weekly, and daily seasonalities, which it enables automatically depending on the span and granularity of the data. Other periods, such as monthly, can be added explicitly.
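To illustrate what a Fourier-series seasonality looks like, the sketch below computes the sine and cosine terms for a given period and order. It is a plain-Python illustration of the idea, not Prophet's internal code; the function name and arguments are made up for this example. In the model, each term gets its own fitted coefficient, and their weighted sum forms the seasonal curve.

```python
import math


def fourier_features(t_days, period, order):
    """Return [sin, cos] pairs for harmonics 1..order at time `t_days`.

    `period` is the seasonality length in days (7 for weekly, 365.25 for
    yearly). A higher `order` lets the seasonal curve change more quickly.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# Weekly seasonality with 3 harmonics produces 6 features per timestamp.
weekly = fourier_features(t_days=2.5, period=7.0, order=3)
```

Because every term repeats with the chosen period, the features for day 2.5 and day 9.5 are the same for a weekly period, which is what lets the model reuse one learned weekly shape across the whole series.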
Prophet lets you tune seasonality, add holidays and special events, and adjust the changepoints for the trend. This level of customization means that your forecasts can be tailored to the unique rhythm of your data.
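As a hedged example of this kind of customization, the sketch below configures a Prophet model with adjusted trend flexibility, an extra monthly seasonality, and built-in country holidays. The specific values and the choice of US holidays are assumptions for illustration and should be tuned for your own data. Prophet is imported inside the function so the settings can be inspected even where the library is not installed.

```python
# Illustrative settings only; Prophet's defaults are noted for comparison.
MODEL_SETTINGS = {
    "changepoint_prior_scale": 0.1,        # default 0.05; higher = more flexible trend
    "seasonality_prior_scale": 5.0,        # default 10.0; lower = smoother seasonality
    "seasonality_mode": "multiplicative",  # default "additive"
    "n_changepoints": 25,                  # default 25 potential changepoints
}


def build_custom_model(settings=MODEL_SETTINGS, country="US"):
    """Return an unfitted Prophet model with custom seasonality and holidays."""
    from prophet import Prophet  # third-party dependency

    model = Prophet(**settings)
    # Monthly seasonality is not built in, so it is added explicitly.
    model.add_seasonality(name="monthly", period=30.5, fourier_order=5)
    # Adds the public holidays known for the given country code.
    model.add_country_holidays(country_name=country)
    return model
```

The returned model is fitted in the usual way, by calling model.fit(df) on a DataFrame with ds and y columns.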
Dealing with missing data and outliers
In the world of data, imperfection is king. Datasets are often marred with incompleteness, cluttered with inconsistencies, and plagued by anomalies. However, Prophet navigates this landscape with relative ease.
It tolerates gaps in the history and is reasonably robust to outliers, although extreme outliers are still best removed before fitting. This helps users get a clear picture of their data and make informed decisions despite its imperfections.
Prophet isn't only about forecasting; it also helps you understand how well its forecasts perform. With functions for diagnostics and cross-validation, it opens a window into your forecasts' performance. This allows you to refine and improve the model over successive iterations.
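A minimal sketch of these diagnostics, using Prophet's cross_validation and performance_metrics helpers, might look like the following. The window sizes are assumptions for illustration and need to fit the length of your own history. The model passed in must already be fitted.

```python
# Example windows: train on at least 365 days, then make a 90-day forecast
# every 30 days and compare it with what actually happened.
CV_WINDOWS = {"initial": "365 days", "period": "30 days", "horizon": "90 days"}


def evaluate_forecast(model, windows=CV_WINDOWS):
    """Cross-validate a fitted Prophet model and return its error metrics."""
    from prophet.diagnostics import cross_validation, performance_metrics

    df_cv = cross_validation(model, **windows)
    # The result is a DataFrame with columns such as rmse, mae, and coverage.
    return performance_metrics(df_cv)
```

Looking at how the error grows with the horizon gives a sense of how far ahead the forecast can reasonably be trusted.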
Prophet is an open-source product whose development is based on ideas generated through community proposals and collaboration.
Time-Series Forecasting With Prophet and TimescaleDB
Now that we’ve introduced both tools, let’s start forecasting with the Prophet library and TimescaleDB.
Install TimescaleDB and set up the database
The first thing you need to do is create a free Timescale account (where many Timescale advanced features are included by default) or install the TimescaleDB extension. After the installation, create a database and prepare the tables to store your time-series data. If you are new to TimescaleDB, the Timescale docs are a good place to get started.
Insert your time-series data into TimescaleDB
Store your time-series data in a hypertable, the TimescaleDB table type optimized for time-series data. To create your first hypertable, create a regular PostgreSQL table and convert it into a hypertable. Then use SQL INSERT statements to load your time-series data into it.
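The steps above could be sketched in Python roughly as follows. The sales table, its time and amount columns, and the load_sales helper are hypothetical names chosen for this example; the function accepts any DB-API connection, such as one created with psycopg2.connect.

```python
# Hypothetical schema: one row per sale with a timestamp and an amount.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS sales (
    time   TIMESTAMPTZ      NOT NULL,
    amount DOUBLE PRECISION NOT NULL
);
"""

# Convert the regular table into a hypertable partitioned on the time column.
CREATE_HYPERTABLE_SQL = (
    "SELECT create_hypertable('sales', 'time', if_not_exists => TRUE);"
)

INSERT_SQL = "INSERT INTO sales (time, amount) VALUES (%s, %s);"


def load_sales(conn, rows):
    """Create the hypertable if needed and insert (timestamp, amount) rows."""
    with conn.cursor() as cur:
        cur.execute(CREATE_TABLE_SQL)
        cur.execute(CREATE_HYPERTABLE_SQL)
        cur.executemany(INSERT_SQL, rows)
    conn.commit()
    return len(rows)
```

For large datasets, a bulk-loading path such as COPY is likely to be much faster than row-by-row inserts; executemany is used here only to keep the sketch short.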
Python libraries
Use pip from the shell to install Prophet and its dependencies, along with the other libraries used in this tutorial, including pandas, NumPy, and Plotly (for example, pip install prophet pandas numpy plotly). You will also need a PostgreSQL driver, such as psycopg2, to connect to TimescaleDB from Python.
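Once the packages are installed, a quick check like the one below can confirm that they are importable from your Python environment. The module list is an assumption based on the libraries named in this tutorial plus psycopg2 as one possible PostgreSQL driver; adjust it to match what you actually installed.

```python
import importlib.util

# Top-level module names assumed for this tutorial.
REQUIRED_MODULES = ["prophet", "pandas", "numpy", "plotly", "psycopg2"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Calling missing_modules() returns an empty list when everything is in place, or the names that still need to be installed.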
Modeling using Prophet
Let's say you have a sales dataset stored in TimescaleDB. The first step is to query it from Python and load the result into a pandas DataFrame.
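A minimal sketch of this import step is shown here. It assumes a hypothetical sales hypertable with time and amount columns and a DB-API connection (for example, from psycopg2); the column aliases ds and y are the names Prophet expects for the timestamp and the value.

```python
def sales_query(table="sales"):
    """Build the query that returns the series in Prophet's ds/y layout.

    `table` is interpolated as an identifier, so pass a trusted constant.
    """
    return f"SELECT time AS ds, amount AS y FROM {table} ORDER BY time;"


def fetch_sales_frame(conn, table="sales"):
    """Load the sales series from TimescaleDB into a pandas DataFrame."""
    import pandas as pd  # third-party dependency

    with conn.cursor() as cur:
        cur.execute(sales_query(table))
        rows = cur.fetchall()

    df = pd.DataFrame(rows, columns=["ds", "y"])
    # Prophet does not accept timezone-aware timestamps, so convert to UTC
    # and drop the timezone information.
    df["ds"] = pd.to_datetime(df["ds"], utc=True).dt.tz_localize(None)
    df["y"] = df["y"].astype(float)
    return df
```

If you created a continuous aggregate, you could point the query at that view instead of the raw table to fetch pre-aggregated data.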
After loading the dataset into a pandas DataFrame, you can build a forecasting model with Prophet at different granularities, including daily and weekly predictions.
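One way to do this, sketched under the assumption that the DataFrame has the ds and y columns Prophet expects, is to aggregate the series to the target granularity, fit a model, and extend it into the future. The horizons below (30 days and 12 weeks) and the function name are illustrative choices, not values from the original scripts.

```python
# granularity -> (pandas frequency alias, number of future periods)
HORIZONS = {"daily": ("D", 30), "weekly": ("W", 12)}


def forecast_at(df, granularity):
    """Fit Prophet on data aggregated to `granularity` and forecast ahead.

    `df` needs a datetime column `ds` and a numeric column `y`.
    """
    from prophet import Prophet  # third-party dependency

    freq, periods = HORIZONS[granularity]
    # Sum the sales within each day or week so the model sees one row per period.
    series = df.set_index("ds")["y"].resample(freq).sum().reset_index()

    model = Prophet()
    model.fit(series)
    future = model.make_future_dataframe(periods=periods, freq=freq)
    forecast = model.predict(future)
    return model, forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

The yhat column holds the prediction, while yhat_lower and yhat_upper bound the uncertainty interval.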
You can make hourly predictions in the same way, for example to forecast a single day ahead.
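A hedged sketch of a one-day-ahead hourly forecast follows. It assumes the DataFrame already contains hourly observations in ds and y columns; the function name and the 24-hour default are illustrative.

```python
def forecast_next_day(df, hours=24):
    """Fit Prophet on hourly data and return predictions for the next `hours`."""
    from prophet import Prophet  # third-party dependency

    # Daily seasonality captures the within-day pattern of hourly data.
    model = Prophet(daily_seasonality=True)
    model.fit(df)
    # "h" is the hourly alias in recent pandas releases; older versions use "H".
    future = model.make_future_dataframe(
        periods=hours, freq="h", include_history=False
    )
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

Setting include_history=False keeps only the future timestamps, so the result has one row per forecast hour.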
To plot the results, we used Plotly, which lets you compare the actual and predicted values for different time frames and offers zoom-in/zoom-out options.
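The plotting step might look something like the sketch below, which draws the actual values, the predicted line, and the uncertainty band, and adds a range slider with preset time-frame buttons. The function name, button choices, and axis labels are assumptions for illustration; history is expected to have ds and y columns, and forecast the columns returned by Prophet's predict method.

```python
# Preset zoom levels for the x-axis (illustrative choices).
RANGE_BUTTONS = [
    dict(count=7, label="1w", step="day", stepmode="backward"),
    dict(count=1, label="1m", step="month", stepmode="backward"),
    dict(step="all", label="All"),
]


def plot_actual_vs_forecast(history, forecast, title="Sales Forecast"):
    """Return a Plotly figure comparing actual and predicted values."""
    import plotly.graph_objects as go  # third-party dependency

    fig = go.Figure()
    # Uncertainty band: draw the upper bound, then fill down to the lower bound.
    fig.add_trace(go.Scatter(x=forecast["ds"], y=forecast["yhat_upper"],
                             mode="lines", line=dict(width=0), showlegend=False))
    fig.add_trace(go.Scatter(x=forecast["ds"], y=forecast["yhat_lower"],
                             mode="lines", line=dict(width=0), fill="tonexty",
                             name="Uncertainty interval"))
    fig.add_trace(go.Scatter(x=history["ds"], y=history["y"],
                             mode="markers", name="Actual"))
    fig.add_trace(go.Scatter(x=forecast["ds"], y=forecast["yhat"],
                             mode="lines", name="Predicted"))
    fig.update_layout(title=title, xaxis_title="Date", yaxis_title="Sales")
    # Range slider and buttons provide the zoom-in/zoom-out controls.
    fig.update_xaxes(rangeslider_visible=True,
                     rangeselector=dict(buttons=RANGE_BUTTONS))
    return fig
```

Calling fig.show() on the returned figure opens the interactive chart. Prophet also ships its own Plotly helper, prophet.plot.plot_plotly, which may be enough if you do not need a custom layout.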
The output of these plots is given below:
Hourly Sales Forecast
Daily Sales Forecast
Weekly Sales Forecast
Yearly Sales Forecast
The prediction of the sales forecast is below:
Here's what we have obtained from the plot:
- Black dots: Each black dot is a historical sales value in the dataset.
- Blue line: This is the model's prediction, combining the trend and seasonality it has learned from the data with the predicted sales values. The line follows the overall pattern of the historical points.
- Blue shaded area: This represents the forecast's uncertainty interval. It shows the range within which sales are expected to fall with a given probability (80% by Prophet's default). The width of the shaded area gives the level of uncertainty: a wider area suggests more uncertainty in the forecast.
Forecasts like these enable businesses to project future sales and use those numbers to make decisions about inventory, staffing, resource management, and other strategic business components.
Although the Prophet code itself does not reference TimescaleDB, the two work well together: TimescaleDB provides optimized storage and retrieval of the time-series data, and Prophet analyzes it. The code for the entire tutorial is available on GitHub.
FAQ
What are the limitations of time-series forecasting in TimescaleDB and Prophet?
Although TimescaleDB and Prophet form a solid framework for time-series analysis, forecast accuracy is still influenced by data quality, the inherent unpredictability of the phenomena being modeled, and model parameters. Iterative model refinement and consideration of relevant external factors are therefore required.
Can TimescaleDB and Prophet handle real-time forecasting?
Yes. TimescaleDB can handle real-time data ingestion, and Prophet can be refit and queried for new forecasts as fresh data arrives, which makes the pair suitable for near-real-time forecasting. The actual latency depends on the volume of data, system architecture, and model complexity.
How can I optimize the Prophet model for better accuracy?
Optimizing a Prophet model involves tuning its seasonality, holiday, and trend parameters to suit the characteristics of your dataset. Cross-validation combined with a parameter sweep can help you find the most effective configuration.
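A parameter sweep of this kind could be sketched as follows. The grid values, the 30-day horizon, and the use of RMSE as the selection metric are assumptions for illustration; df is expected to have the ds and y columns Prophet requires and enough history for cross-validation.

```python
import itertools

# Candidate values for two commonly tuned Prophet parameters (illustrative).
PARAM_GRID = {
    "changepoint_prior_scale": [0.001, 0.01, 0.1, 0.5],
    "seasonality_prior_scale": [0.01, 0.1, 1.0, 10.0],
}


def param_combinations(grid):
    """Expand a dict of value lists into a list of parameter dicts."""
    keys = list(grid.keys())
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]


def tune(df, grid=PARAM_GRID, horizon="30 days"):
    """Return (rmse, params) for the combination with the lowest RMSE."""
    from prophet import Prophet
    from prophet.diagnostics import cross_validation, performance_metrics

    results = []
    for params in param_combinations(grid):
        model = Prophet(**params).fit(df)
        df_cv = cross_validation(model, horizon=horizon)
        # rolling_window=1 averages the error over the whole horizon.
        rmse = performance_metrics(df_cv, rolling_window=1)["rmse"].values[0]
        results.append((rmse, params))
    return min(results, key=lambda item: item[0])
```

Each combination requires refitting and cross-validating the model, so the sweep can take a while on large datasets; starting with a small grid keeps the runtime manageable.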
What resources are available for learning more about TimescaleDB and Prophet?
For TimescaleDB, its official documentation and tutorials are excellent starting points. For Prophet, the GitHub repository and accompanying documentation offer comprehensive guides, examples, and best practices for forecasting.
Next Steps
The world has been changing fast and is likely to keep changing at a rapid pace. Businesses and devices are generating data in ever-growing volumes, so effective data management and predictive analytics are critical.
Combining scalable storage such as Timescale with forecasting algorithms such as Prophet helps address those challenges with a flexible platform that can adapt to changing data analysis demands.
As organizations move towards data-driven decision-making, technologies such as TimescaleDB and Prophet are important in unlocking the potential of time-series data. By combining the two, you can understand what has happened and predict what comes next. This empowers businesses to forecast change, refine their operations, innovate, and stay ahead.
To try Timescale, create a free account today.