Nov 15, 2021
Posted by
Blagoj Atanasovski
Telegraf handles batching, processing, and aggregating time-series data prior to inserting it into TimescaleDB.
As a developer, I can’t even begin to convey the importance of open-source software and tools. At its core, open source builds collaborative communities that share their knowledge and build better software.
At Timescale, we are huge fans of open source, and we commend the organizations and contributors who continue to invest (free) time into these projects. In particular, I’m going to discuss the open-source project Telegraf and the new plugin we contributed that lets users ingest data into PostgreSQL and TimescaleDB with this tool.
Back in 2017, one of my colleagues (Sven Klemm, who was not affiliated with Timescale at the time) wrote the PostgreSQL output plugin, which can send data to a TimescaleDB hypertable and dynamically modify tables to handle changes in the incoming data. This means you don't need to worry about defining your schemas at all, since the plugin figures them out for you automatically. We recently updated the code to increase the plugin's test coverage.
This plugin contribution was submitted to Telegraf via a GitHub pull request. The pull request was approved 7 months ago but is still awaiting a merge by the Telegraf developers before it can ship in a future Telegraf release. We hope to see this code accepted in the future, but in the meantime, and due to popular demand, we went ahead and created a binary of Telegraf with the plugin already included.
Telegraf is an agent that collects, processes, aggregates, and writes metrics. Since it is plugin-driven for both the collection and the output of data, it is easily extendable. In fact, it already contains over 200 plugins for gathering and writing different types of data. Going as far back as 2017, Telegraf users have repeatedly requested PostgreSQL compatibility. But why are people looking into using Telegraf with PostgreSQL and TimescaleDB?
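To make the plugin-driven model concrete, here is a minimal sketch of a Telegraf configuration that pairs a built-in input plugin with our PostgreSQL output plugin. The section names follow Telegraf’s conventions, but the `connection` option name and its value are assumptions based on our build of the plugin; check the generated sample config for the authoritative settings.

```toml
# Minimal sketch: collect CPU metrics and write them to PostgreSQL/TimescaleDB.
# The connection string uses standard PostgreSQL key/value syntax; adjust the
# host, user, and database to match your environment.
[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[outputs.postgresql]]
  # Assumed option name in our build of the plugin.
  connection = "host=localhost user=postgres sslmode=disable dbname=metrics"
```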
TimescaleDB leverages PostgreSQL to offer a highly performant time-series database that supports full SQL and inherits the entire PostgreSQL ecosystem. TimescaleDB looks and feels just like PostgreSQL and benefits from its reliability, stability, and robust ecosystem of tools and extensions. Further, we’ve talked about PostgreSQL’s popularity in previous posts, which makes it a no-brainer to invest in as a developer. A well-known, battle-tested database is easier to hire for, easier to operate, and easier to use.
Users looking to store time-series data in TimescaleDB need a way to insert that data, and because TimescaleDB operates and looks just like PostgreSQL, inserting data is very easy. Users can use PostgreSQL’s Kafka connector to connect with Kafka, use language-specific drivers to write custom applications, or leverage third-party applications like Prometheus to load data into TimescaleDB.
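As a quick illustration of how ordinary this is, here is a hypothetical SQL snippet (table and column names are our own, not something TimescaleDB prescribes) that creates a hypertable and writes to it with plain SQL:

```sql
-- Hypothetical example: a hypertable for CPU metrics, written to with plain SQL.
CREATE TABLE cpu_metrics (
  time        TIMESTAMPTZ NOT NULL,
  host        TEXT,
  usage_user  DOUBLE PRECISION
);

-- Turn the regular table into a TimescaleDB hypertable partitioned on time.
SELECT create_hypertable('cpu_metrics', 'time');

-- Inserting works exactly as it does in vanilla PostgreSQL.
INSERT INTO cpu_metrics (time, host, usage_user)
VALUES (now(), 'host001', 12.5);
```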
Supporting Telegraf gives users yet another way to quickly insert data into TimescaleDB. Telegraf offers a large range of input plugins, so users can pipe the data they want to collect directly through Telegraf, which handles batching, processing, and aggregating that data before inserting it into TimescaleDB. This way, users don’t have to implement batching in their own applications and can instead lean on Telegraf’s functionality.
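Batching is controlled by Telegraf’s agent settings rather than by application code. The sketch below shows the standard agent options that govern how often metrics are collected and flushed; the values are illustrative, not recommendations.

```toml
# Sketch of the agent-level settings that control collection and batching.
[agent]
  interval = "10s"             # how often input plugins are polled
  flush_interval = "10s"       # how often buffered metrics are flushed to outputs
  metric_batch_size = 1000     # metrics sent to the output in a single write
  metric_buffer_limit = 10000  # metrics kept in memory if the output is slow or down
```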
Our PostgreSQL plugin builds on that ease of use by handling schema generation and modification. This means that as Telegraf collects metrics, the plugin creates a table if it doesn’t exist and alters it if the schema has changed. By default, the plugin uses a wide model, which is the schema our users typically choose when storing metrics. However, users can also store metrics in a narrow model, with a separate metadata table and foreign keys, or store tags and fields as JSONB. You get this ease of use without sacrificing the inherent flexibility that comes with PostgreSQL.
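To make the difference between the schema models concrete, here is a hedged sketch of the kind of tables the plugin could generate for a `cpu` metric with `host` and `cpu` tags and two fields. The exact DDL and table names are produced by the plugin and may differ; the two layouts below are alternatives, not meant to be created together.

```sql
-- Wide model (default): one column per tag and per field.
CREATE TABLE IF NOT EXISTS cpu (
  time         TIMESTAMPTZ NOT NULL,
  host         TEXT,
  cpu          TEXT,
  usage_user   DOUBLE PRECISION,
  usage_system DOUBLE PRECISION
);

-- Narrow model: tags live in a separate metadata table and metric rows
-- reference them, which avoids repeating long tag values on every row.
CREATE TABLE IF NOT EXISTS cpu_tag (
  tag_id SERIAL PRIMARY KEY,
  host   TEXT,
  cpu    TEXT
);
CREATE TABLE IF NOT EXISTS cpu (
  time         TIMESTAMPTZ NOT NULL,
  tag_id       INT REFERENCES cpu_tag (tag_id),
  usage_user   DOUBLE PRECISION,
  usage_system DOUBLE PRECISION
);

-- JSONB option: tags (and/or fields) collapse into a single JSONB column,
-- e.g. a "tags JSONB" column instead of the individual host/cpu columns above.
```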
We’ve developed a full tutorial that walks through a three-step process of getting started, but here’s a quick overview.
The first step is installation. Since Telegraf is written in Go, only a single standalone binary is required to run it. We have provided a build of Telegraf with our plugin added for Linux, Windows, and macOS. The specific pre-built packages can be found here.
Next, we get into the actual configuration of Telegraf, which includes testing out the config file and configuring the PostgreSQL output plugin.
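As a taste of that step, here is a hedged sketch of what the output section of the config file can look like. The option names below are assumptions based on our build of the plugin, so generate and consult the sample config rather than copying this verbatim.

```toml
# Sketch of the PostgreSQL output section; option names are assumptions based
# on our build of the plugin. Generate the authoritative sample config with:
#   telegraf --sample-config --input-filter cpu --output-filter postgresql > telegraf.conf
# and run Telegraf with:
#   telegraf --config telegraf.conf
[[outputs.postgresql]]
  connection = "host=localhost user=postgres password=secret dbname=metrics"

  ## Optional behaviors discussed above (assumed option names):
  # tags_as_foreign_keys = true   # narrow model: tags stored in a separate table
  # tags_as_jsonb = true          # store tags in a single JSONB column
  # fields_as_jsonb = true        # store fields in a single JSONB column
```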
Finally, we get to the fun part and demonstrate how to run Telegraf and connect it to PostgreSQL and TimescaleDB. Read our tutorial for more, including information on creating hypertables, adding new tags or fields, creating separate tables for tags, and storing the tags/fields as JSONB columns in the database.
If you are ready to get started, dive into the full tutorial here.
Telegraf, like Kafka, Prometheus, and numerous other tools, is now an easy way to ingest time-series data into TimescaleDB. For users who are looking to migrate to PostgreSQL and TimescaleDB, Telegraf can also be a useful tool to support live migrations.
If you’re using InfluxDB, we’ve also created a tool called Outflux which helps users batch migrate data from InfluxDB with a single command. While both tools require very few steps to get started, the PostgreSQL and TimescaleDB output plugin allows you to do live migrations when exporting data from InfluxDB.
If you are new to TimescaleDB, follow these installation instructions to get an instance of TimescaleDB running in your environment. Another option is to sign up for Timescale Cloud, which provides a fast and easy way to spin up a fully managed instance of TimescaleDB. If you sign up today, you’ll get $300 in trial credits to use for the next 30 days.
Like this post and interested in learning more? Sign up for our mailing list below or follow us on Twitter.