Oct 18, 2024
Posted by
Ramon Guiu
If the growth of IoT (Internet of Things) is by now a given, with reports suggesting double-digit growth and 34.4 billion connected IoT devices worldwide by 2032, the development of industrial IoT (IIoT, for short) is one of the segments propelling such massive expansion.
IIoT refers to the application of IoT technology in industrial settings, using smart sensors, machines, and devices to collect and analyze data to improve efficiency, productivity, safety, and operational insights. As more and more companies use IIoT data in critical industrial applications for sectors like energy, transportation, and manufacturing, this segment is expected to grow to 3.3 trillion U.S. dollars by the end of the decade.
However, despite its impressive growth, developers building IIoT applications struggle with many of the same problems as devs building data-intensive applications: analyzing and storing the deluge of timestamped data (like sensor data) with other relational data without relying on multiple databases and complex data stacks. There surely must be a better solution to drive fast business decisions, right? Plus, can these solutions ensure SCADA systems, the foundation of industrial applications, keep running seamlessly?
With IIoT customers representing 66% of our IoT client base here at Timescale, we can confidently say you don’t need a specialized database for your industrial IoT data. You need PostgreSQL. Before you mutter, “PostgreSQL doesn’t scale for large amounts of sensor data,” hear us out: in this blog post, we’ll look at the challenges of IIoT data, including SCADA systems, and explain how you can simplify your data stack. Finally, we’ll share examples of users who did just that.
SCADA (Supervisory Control and Data Acquisition) systems are a critical component of IIoT. At its core, SCADA is a type of software application used to control industrial processes and machinery. It operates by collecting data from sensors and equipment on the factory floor or in the field (such as in oil rigs, water treatment plants, or power generation facilities), analyzing this data, and then using it to control and optimize those processes in real time.
A typical SCADA system consists of several key components:

Field instrumentation: the sensors and actuators that measure and act on physical processes.
Remote terminal units (RTUs) and programmable logic controllers (PLCs): devices that digitize field signals and relay them to the control center.
Communication infrastructure: the networks linking remote sites to the supervisory system.
Human-machine interfaces (HMIs): the screens operators use to visualize and control processes.
Operational (data) historians: the databases that store the collected process data for analysis.
Let’s focus on this last component. While operational historians have long been the gold standard for operational process data, like IIoT data, many industry insiders now argue that these expensive and often outdated systems don’t play well with the plethora of sensors and external systems currently generating data (or data visualizations). Instead, they recommend using modern, scalable, reliable databases.
But are modern database systems capable of handling the complexity of SCADA systems? To answer this question, let’s first examine their unique challenges.
Many industrial operations monitored by SCADA systems are located in remote areas where connectivity might be limited or unreliable. This includes oil and gas pipelines, wind farms, and water treatment facilities spread over vast distances. For this reason, SCADA systems often need to keep operating autonomously through outages and reconcile their data once connectivity returns.
Given the critical nature of the operations they monitor and control, SCADA systems cannot afford to lose data during downtimes or connectivity issues. They need the ability to backfill data, ensuring that once the system is back online, it can update the central database with all the information gathered during the period of disconnection. This completeness of data is vital for accurate monitoring, analysis, and decision-making.
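As a sketch of what backfilling can look like in SQL (the `sensor_readings` table and its unique constraint on `(device_id, time)` are illustrative, not from the original post), late-arriving rows buffered at the edge can be upserted once connectivity returns:

```sql
-- Replay locally buffered readings into the central database.
-- ON CONFLICT makes the backfill idempotent: re-sending the same
-- buffer after a dropped connection won't create duplicate rows.
INSERT INTO sensor_readings (device_id, time, temperature, pressure)
VALUES
    (42, '2024-10-17 03:15:00+00', 71.3, 101.2),
    (42, '2024-10-17 03:16:00+00', 71.5, 101.1)
ON CONFLICT (device_id, time)
DO UPDATE SET temperature = EXCLUDED.temperature,
              pressure    = EXCLUDED.pressure;
```

The idempotency matters here: if the uplink drops mid-replay, the edge device can simply resend the whole buffer without corrupting the central record.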
SCADA systems collect more than just numerical values; they need to process a wide range of data types:
Complex data types: Including images or video feeds from surveillance cameras, PDFs for manuals or schematics, and audio files for alarms or verbal warnings.
State data in the form of strings: These are critical for understanding the status of switches, valves, or modes of operation (e.g., "Open," "Closed," "Automatic," "Manual"). This type of data provides insights into the operational state of the equipment being monitored.
Numerical values: Such as temperatures, pressures, and flow rates, which are essential for monitoring operational parameters.
Most of this data is captured as time series, that is, data recorded as successive points in time, making it crucial for trend analysis, alerting, and predictive modeling.
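In PostgreSQL terms, this whole mix of data types can live in a single schema. A minimal sketch (table and column names are our own, for illustration):

```sql
-- Timestamped measurements: numbers, state strings, and
-- semi-structured payloads side by side in one table.
CREATE TABLE sensor_readings (
    time        TIMESTAMPTZ NOT NULL,   -- time-series dimension
    device_id   INTEGER     NOT NULL,
    temperature DOUBLE PRECISION,       -- numerical values
    valve_state TEXT,                   -- state data, e.g. 'Open', 'Closed'
    metadata    JSONB                   -- nested or irregular payloads
);

-- Binary assets referenced from the operational data.
CREATE TABLE media_assets (
    asset_id    BIGSERIAL PRIMARY KEY,
    device_id   INTEGER,
    captured_at TIMESTAMPTZ,
    content     BYTEA                   -- images, PDFs, audio clips
);
```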
“There is no way a general purpose database like PostgreSQL can ensure all of this, especially at this scale,” you’re now thinking. We’ll get there, but let’s first follow the path of a developer or engineering team trying to navigate the complexities of IIoT data.
If you’re building an IIoT application, you’re familiar with the challenges SCADA systems pose and their importance to your application. For this reason, you and your team will probably opt to build an industrial sensor data solution on top of battle-tested, robust database technology, typically PostgreSQL, MySQL, SQL Server, Oracle, or even MongoDB.
As mentioned, IIoT applications need to process different data types, including time-series data—the timestamped measurements collected by sensors. However, these solutions also store a lot of what you would typically consider traditional relational data: customer and user accounts, facilities, equipment, sensors linked to each other, etc.
As your application’s adoption grows and data accumulates, your rock-solid general-purpose database starts showing some worrying signs: queries over long time ranges slow down, indexes balloon, ingest throughput drops, and storage costs climb.
By now, you realize you need to solve the problem. It’s time to reach for a time-series database, introducing yet more complexity for your team: a second system to deploy and operate, pipelines to keep two data stores in sync, and often a new query language to learn.
“So why did you say PostgreSQL could solve all these problems?”, you’re wondering. Because it can. You can use PostgreSQL for everything. You just need to ensure it’s PostgreSQL engineered for high performance.
Timescale began as an IoT company, so we’re well acquainted with many of these challenges—in fact, they’re what led us to create Timescale. We have always advocated for using PostgreSQL to store IoT data, but as PostgreSQL evolves and becomes an increasingly sophisticated platform and ecosystem, we expanded that belief to using PostgreSQL for everything.
You don’t have to give up your favorite database—an opinion shared by increasing numbers of developers—or rely on outdated historians to get an end-to-end solution. You can simply extend PostgreSQL with Timescale.
With Timescale, you benefit from a single database that efficiently handles both time-series data and traditional relational data. This eliminates the overhead of maintaining separate systems and simplifies your data architecture.
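For example, because relational metadata and sensor measurements share one database, a single SQL query can join them directly (the schema below is an assumed example; `last()` is a real Timescale hyperfunction that returns the value at the latest timestamp):

```sql
-- Latest temperature per device over the past day,
-- joined with relational facility and device metadata.
SELECT f.name                        AS facility,
       d.serial_number,
       last(r.temperature, r.time)   AS last_temperature
FROM sensor_readings r
JOIN devices d    ON d.device_id   = r.device_id
JOIN facilities f ON f.facility_id = d.facility_id
WHERE r.time > now() - INTERVAL '1 day'
GROUP BY f.name, d.serial_number;
```

With two separate systems, the same question would require an application-level join across databases.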
One company that achieved just that was Everactive, which began operating in the IIoT space and has since expanded to traditional IoT, combining self-powered sensors and powerful cloud analytics to provide “end-to-end” IoT solutions.
“The capacity of Timescale to support both traditional schemas and time-series data in the same database allowed us to consolidate into one storage solution.”
Carlos Olmos, senior principal engineer at Everactive
Additionally, Timescale supports the rich data model of PostgreSQL, allowing developers to use complex joins, window functions, and custom user-defined functions without the limitations often introduced by specialized time-series databases. This flexibility supports more sophisticated analyses and applications.
Since Timescale is built on PostgreSQL and leverages its entire ecosystem, developers can use SQL, a familiar and powerful query language, across all their data. There’s no need to learn a new or specialized query language, which accelerates development and reduces errors.
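As a small illustration of staying in plain SQL, a downsampling query using Timescale’s `time_bucket` function (again over our assumed `sensor_readings` table) looks like this:

```sql
-- 15-minute average temperature per device over the last day.
SELECT time_bucket('15 minutes', time) AS bucket,
       device_id,
       avg(temperature)                AS avg_temp
FROM sensor_readings
WHERE time > now() - INTERVAL '1 day'
GROUP BY bucket, device_id
ORDER BY bucket;
```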
“It quickly reached the point where we had to implement Python scripts to process data because Flux had reached its limits in use cases that would work seamlessly using SQL.”
Jeremy Theocharis, co-founder of United Manufacturing Hub
Hypertables are Timescale’s foundation. They automatically partition your PostgreSQL tables into smaller partitions called chunks, instantly speeding up your queries. They are optimized for time-based partitioning but also work for tables keyed on any time-like, monotonically increasing column, such as sequential IDs.
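Turning a regular table into a hypertable is a single function call (table and interval shown here are illustrative):

```sql
-- Convert a regular PostgreSQL table into a hypertable,
-- automatically partitioned into chunks on the time column.
SELECT create_hypertable('sensor_readings', 'time',
                         chunk_time_interval => INTERVAL '1 day');
```

From then on, inserts and queries use the table exactly as before; chunking happens transparently under the hood.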
Timescale’s hypertables are a crucial feature for Newtrax, which integrates the latest IIoT and analytics technologies to monitor and provide real-time insights on underground mining operations, including people, machines, and the environment.
“More importantly, in order for our solutions (such as predictive maintenance and collision avoidance) to provide contextualized and accurate results, we must gather and process hundreds of millions of data points per machine or worker, per week or month, depending on various circumstances. We use hypertables to handle these large datasets and act upon them. We have saved lives using TimescaleDB.”
Jean-François Lambert, lead data engineer at Newtrax
By consolidating your data needs into Timescale, you reduce the total cost of ownership by paying for one database instead of two. This consolidation also means savings in operational costs, as managing a single database system is simpler and less resource-intensive.
But Timescale has more features that help you reduce your storage footprint effectively:
Columnar compression: once data stops changing, Timescale can compress chunks dramatically, shrinking your storage footprint without sacrificing query access.
Data retention policies: old chunks are dropped automatically as they age out, without the cost of row-by-row
DELETE
statements.

“Compression was a game-changer from our perspective as a company: not having to worry about getting databases on the order of 5, 10, or 15 TB to store this information was a massive factor for us.”
Michael Gagliardo, Software Architect at Ndustrial
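As a sketch, enabling compression and automatic retention on a hypertable takes only a few statements (table name, segmenting column, and intervals are illustrative):

```sql
-- Compress chunks older than 7 days, segmenting by device
-- so per-device queries stay fast on compressed data.
ALTER TABLE sensor_readings SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');

-- Drop chunks older than one year, avoiding expensive DELETEs.
SELECT add_retention_policy('sensor_readings', INTERVAL '1 year');
```

Both policies then run on a schedule in the background, with no application changes required.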
To try all these benefits today and start storing your IoT or IIoT data in a more efficient and performant way, create a Timescale account. It's free for 30 days, no credit card required.