Apr 05, 2024
Posted by
Vineeth Pothulapati
My daily work as a product manager involves developing and advancing observability products. So, it wasn’t surprising that I wanted to explore a topic I spent so much time reflecting on during my talk at the 2022 KubeCon|CloudNativeCon North America.
I'm a product manager at Timescale and a maintainer of the OpenTelemetry Operator. If you’re using it, I’d love to hear about your experience—drop me a line in the Timescale Community Slack (@Vineeth)!
When Jaeger (a distributed tracing tool many of us have been using and loving for a while now) announced its client libraries’ end-of-life earlier last year, I knew I wanted to present an alternative you could migrate to at KubeCon. The Jaeger community and its maintainers have supported and advocated for the OpenTelemetry SDK, encouraging me to focus on migrating from one tracing tool to the other.
So if, like myself, you’re using Jaeger and have to move from its client libraries to the OpenTelemetry SDK, check out my presentation below. I’ll summarize the main points in this blog post, but for a complete picture—including an OpenTracing API demo—I recommend you watch the video.
I divided my presentation agenda into the following topics:
As you may have guessed, let’s start with the prerequisites.
While OpenTelemetry (which I often abbreviate to OTel during the presentation and in this post) also supports metrics and logs alongside traces, my talk focused on traces. And above all, I didn’t intend to push you to migrate or sell anything. I just wanted to share how you can mix and match things and the upsides of migration.
But for that, I first had to explain all the components involved.
Tracing involves multiple components, in both Jaeger and OpenTelemetry. The instrumentation layer usually comprises an API and an SDK.
In Jaeger, the OpenTracing API is the API, and the Jaeger client libraries are the SDK. The agent/collector is all Jaeger, and the collector offers some native storage options. Jaeger also has a visualization layer called Jaeger Query, which I discussed in the presentation; it helps you visualize traces even if you’re using OTel.
In OpenTelemetry, there’s only an instrumentation layer and the collector layer. So, while Jaeger goes all the way from instrumentation to storage and visualization, OTel is purpose-built for instrumentation and collection only.
The OpenTelemetry project was announced in 2019 at KubeCon (San Diego). Since then, it has been evolving, expanding into different observability signals, and adding new capabilities. Let’s discuss what those capabilities are and why a migration would make sense.
I already mentioned the first reason: Jaeger stopped supporting its client libraries in favor of OTel SDKs. The second is that OpenTelemetry is a new instrumentation and data collection standard. It takes some of the industry’s best practices that were part of OpenTracing and OpenCensus and adds new capabilities that modern cloud-native applications need.
That makes OTel’s collector layer incredibly rich: it allows you to configure different sources and destinations for your data, and OTel also supports auto-instrumentation, so no code changes are involved. The collector also offers processors to enrich the data between the time it’s received and the time it’s exported.
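To make the roles of sources, processors, and destinations more concrete, here is a minimal sketch in plain Python. It is a toy model, not the OpenTelemetry Collector's implementation or its configuration format: `Pipeline`, `add_environment`, the span dictionaries, and the two in-memory "backends" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Span = Dict[str, object]  # hypothetical span shape: a plain dict


@dataclass
class Pipeline:
    """Toy collector pipeline: processors run in order, then every exporter gets the batch."""

    processors: List[Callable[[Span], Span]] = field(default_factory=list)
    exporters: List[Callable[[List[Span]], None]] = field(default_factory=list)

    def receive(self, spans: List[Span]) -> None:
        # Each processor transforms every span before anything is exported.
        for process in self.processors:
            spans = [process(span) for span in spans]
        # The same processed batch is shipped to every configured destination.
        for export in self.exporters:
            export(spans)


def add_environment(span: Span) -> Span:
    # Enrichment step: attach an attribute without touching application code.
    # The attribute value is an assumption made up for this example.
    return {**span, "deployment.environment": "staging"}


# Two in-memory lists stand in for two different storage backends.
backend_a: List[Span] = []
backend_b: List[Span] = []

pipeline = Pipeline(
    processors=[add_environment],
    exporters=[backend_a.extend, backend_b.extend],
)
pipeline.receive([{"name": "GET /checkout", "duration_ms": 12}])
```

In this sketch, adding a destination or an enrichment step means changing the pipeline definition only; the code that produced the span stays the same, which is the property the collector layer is valued for.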
Before we get into the two levels of migration—the instrumentation layer and the collector layer—let me quickly walk you through the Jaeger and OTel architectures.
You can use Jaeger to instrument your application, with spans being pushed to the Jaeger agent/collector. From there, you’ll find the storage backend and the user interface (UI). These are all the components involved in the Jaeger architecture. The Spark jobs are optional: you can run them if you need to.
Now for the OpenTelemetry architecture: there is no UI layer or storage layer. It’s all about the instrumentation and the data processing pipeline, the OpenTelemetry Collector, which can also run as an agent on the same machine as your application.
People often need clarification on the difference between the agent and the collector. The agent runs on the same host as the application or as a sidecar to it. The collector acts as a centralized processing pipeline: your applications can send spans to it directly or through the agent.
When considering a migration from Jaeger to the OpenTelemetry SDK, the first level is the instrumentation layer. As mentioned, there is an API and an SDK. The API contains the tracer API, the context API, and the meter API for metrics. In the SDK, you will find a propagator, a span processor, and an aggregator. These are the functionalities that the API and SDK offer you during instrumentation.
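The API/SDK split can be sketched with a toy model in plain Python. This is not the OpenTelemetry API: `NoOpTracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, and `handle_request` are hypothetical names. The sketch only illustrates the general idea that instrumented code depends on the API, while the SDK (with its span processor) is something the application opts into.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List


class Span:
    """Toy span: a name plus key/value attributes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: Dict[str, object] = {}

    def set_attribute(self, key: str, value: object) -> None:
        self.attributes[key] = value


# --- "API" side: what instrumented code depends on ---

class NoOpTracer:
    """Default tracer: instrumentation calls succeed, but nothing is recorded."""

    @contextmanager
    def start_span(self, name: str) -> Iterator[Span]:
        yield Span(name)  # the span is simply discarded afterwards


class _Registry:
    tracer = NoOpTracer()


def set_tracer(tracer) -> None:
    _Registry.tracer = tracer


def get_tracer():
    return _Registry.tracer


# --- "SDK" side: the implementation an application installs ---

class RecordingTracer:
    """Toy SDK tracer that hands each finished span to a span processor callback."""

    def __init__(self, on_end: Callable[[Span], None]) -> None:
        self._on_end = on_end

    @contextmanager
    def start_span(self, name: str) -> Iterator[Span]:
        span = Span(name)
        try:
            yield span
        finally:
            self._on_end(span)  # runs even if the traced code raises


# --- Instrumented code: uses only the API side ---

def handle_request(user_id: str) -> str:
    with get_tracer().start_span("handle_request") as span:
        span.set_attribute("user.id", user_id)
        return f"hello {user_id}"


finished: List[Span] = []
handle_request("u1")  # no SDK installed yet: nothing is recorded
set_tracer(RecordingTracer(on_end=finished.append))
handle_request("u2")  # SDK installed: one span reaches the processor
```

Note that `handle_request` never changes: whether spans are recorded depends only on which tracer the application has registered.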
You can complete the migration in the instrumentation layer in two ways:
1. OpenTelemetry shim
The shim is a library that facilitates the migration between OpenTracing and OpenTelemetry. It consists of a set of classes that implement the OpenTracing API while still using the OTel constructs behind the scenes.
You’ll find a great explanation in this blog post written by Juraci Paixão Kröhling, in which he uses the Java application as a demo. With minimal code changes, it will hardly take five minutes to migrate by swapping the dependencies and the imports.
If you have limited bandwidth and want to use client-based sampling in OpenTelemetry, or simply want to try the OpenTelemetry SDK, you can definitely start with the shim.
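The idea behind the shim can be sketched as an adapter in plain Python. This is a toy model and not the real shim library: `ToyOtelTracer`, `ToyOtelSpan`, `ShimTracer`, `ShimScope`, `ShimSpan`, and `say_hello` are hypothetical classes and functions written for this example. Only the method names (`start_active_span` and `set_tag` on the OpenTracing-style side, `start_as_current_span` and `set_attribute` on the OTel-style side) are borrowed from the vocabulary of the two Python APIs.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List


class ToyOtelSpan:
    """Stand-in for an OTel-style span (attribute-based vocabulary)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: Dict[str, object] = {}

    def set_attribute(self, key: str, value: object) -> None:
        self.attributes[key] = value


class ToyOtelTracer:
    """Stand-in for an OTel-style tracer; keeps finished spans in a list."""

    def __init__(self) -> None:
        self.finished: List[ToyOtelSpan] = []

    @contextmanager
    def start_as_current_span(self, name: str) -> Iterator[ToyOtelSpan]:
        span = ToyOtelSpan(name)
        try:
            yield span
        finally:
            self.finished.append(span)


class ShimSpan:
    """OpenTracing-style span that forwards every call to the OTel-style span."""

    def __init__(self, otel_span: ToyOtelSpan) -> None:
        self._otel_span = otel_span

    def set_tag(self, key: str, value: object) -> "ShimSpan":
        self._otel_span.set_attribute(key, value)
        return self  # returning self keeps the setter chainable


class ShimScope:
    """OpenTracing-style scope exposing the active span as `.span`."""

    def __init__(self, span: ShimSpan) -> None:
        self.span = span


class ShimTracer:
    """OpenTracing-style tracer backed by an OTel-style tracer."""

    def __init__(self, otel_tracer: ToyOtelTracer) -> None:
        self._otel_tracer = otel_tracer

    @contextmanager
    def start_active_span(self, operation_name: str) -> Iterator[ShimScope]:
        with self._otel_tracer.start_as_current_span(operation_name) as otel_span:
            yield ShimScope(ShimSpan(otel_span))


# Existing application code, written against the OpenTracing-style surface.
def say_hello(tracer, name: str) -> str:
    with tracer.start_active_span("say-hello") as scope:
        scope.span.set_tag("hello-to", name)
        return f"Hello, {name}!"


otel_tracer = ToyOtelTracer()
greeting = say_hello(ShimTracer(otel_tracer), "Bryan")
```

The application function is untouched; only the tracer passed to it changes, which is why a shim-based migration mostly comes down to swapping dependencies and imports.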
2. Complete re-instrumentation
A complete re-instrumentation is the second way to help you get on the OpenTelemetry SDK, which offers OTel as a package. This means you will get all the capabilities from scratch, from the code to semantics. In the future, you can also expand your OTel instrumentation into metrics and logs and easily integrate it with auto-instrumented applications.
You can also mix the two approaches: auto-instrument some applications and manually instrument the ones where you need more granular detail. Auto-instrumentation will give you higher-level traces. With manual instrumentation, you will have more flexibility over what you want to capture and what you want to measure.
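The trade-off between the two styles can be illustrated with a toy example in plain Python. It does not use OpenTelemetry's auto-instrumentation mechanism; `auto_instrument`, `record_span`, and the `checkout` functions are hypothetical, and a simple wrapper stands in for instrumentation applied from outside the code.

```python
import functools
from typing import Callable, Dict, List

recorded: List[Dict[str, object]] = []


def record_span(name: str, **attributes: object) -> None:
    # Toy span sink: store the span name and whatever attributes were supplied.
    recorded.append({"name": name, "attributes": attributes})


def auto_instrument(func: Callable) -> Callable:
    """Toy auto-instrumentation: wraps a function from the outside,
    so the only thing it knows about the call is the function's name."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        record_span(func.__name__)
        return result

    return wrapper


def checkout(cart: Dict[str, int]) -> int:
    return sum(cart.values())


# "Auto": applied from outside; the body of checkout is unchanged.
checkout_auto = auto_instrument(checkout)


def checkout_manual(cart: Dict[str, int]) -> int:
    total = sum(cart.values())
    # Manual: the author decides which business details belong on the span.
    record_span("checkout", item_count=len(cart), total=total)
    return total


cart = {"book": 12, "pen": 3}
checkout_auto(cart)
checkout_manual(cart)
```

Both calls produce a span named `checkout`, but only the manual one carries the item count and total, which is the kind of granular detail the wrapper cannot know about.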
My re-instrumentation demo is a clone of an OpenTracing tutorial authored by Yuri Shkuro and lives on GitHub—it will help you understand how instrumentation works.
I took the same application and showed you both the Jaeger and OTel instrumentations. The demo is available in this GitHub repository. It aims to show you how simple it is to run and use the Jaeger and OpenTelemetry instrumentation alongside one another.
Let’s now move on to the basics of the second migration level: the collector layer. This layer will allow you to migrate from the Jaeger to the OpenTelemetry Collector without touching the code or disturbing your applications.
If you are already running the Jaeger Collector, you can simply add the OpenTelemetry Collector to your existing architecture without making code changes to your applications. The OTel Collector can receive data in the Jaeger format and in several other formats.
Check out my talk to see the differences between both collectors and how to configure them.
In sum, why should you use OTel in Jaeger?
Finally, one of the most crucial aspects of the migration is how you will keep querying and visualizing your traces. If you move completely to the OpenTelemetry collector, there is no path to visualize traces using the Jaeger UI unless the storage backend offers support for querying and visualizing traces using the Jaeger query component.
The OTel project is all about instrumenting and collecting data, whereas with Jaeger, you can use the Jaeger UI to visualize the data. So if you are moving from Jaeger to OpenTelemetry, you should integrate the Jaeger UI to fully query your traces.
If you want to learn more about how the OpenTelemetry Collector interacts with Jaeger to aggregate traces into Prometheus metrics and graph them inside Jaeger’s UI, check out this blog post by Timescale’s own Mathis Van Eetvelde.
In fact, these are some of the Jaeger-OTel boundaries: OTel is all about the API, the SDK, and the OpenTelemetry Collector. Jaeger is the query layer and the mature native storage backend. So in the future, I see Jaeger evolving into a platform for traces, whereas OTel will be more like an instrumentation and collection pipeline for all the observability data.