OpenTelemetry promised to be a unified standard that would make it easier for everyone to collect and correlate traces, logs and metrics from distributed systems. Sounds like a dream, right? Well, here is the hard truth: OpenTelemetry is only satisfactory. When I compare it to deeper tracing technologies such as eBPF, OpenTelemetry feels bloated, inefficient, incomplete and missing a key component of tracing. And much of this is thanks to the corporate hijacking of what should have been a lean, community-driven project.
Let’s dive into why eBPF and OpenTelemetry both have a place in the world, but also why I think OpenTelemetry is only a supporting player and eBPF is the real MVP.
OpenTelemetry: The Swiss Army Knife That’s Trying Too Hard
OpenTelemetry is like a Swiss Army knife I got for my birthday: It has tons of tools, but all of them are only mediocre. It tries to do everything — logs, metrics and traces, with dozens of integrations. It provides a standardized, vendor-neutral way to gather observability data. In a world full of fragmented tools and monitoring solutions, that is valuable. But its broad scope has come at a cost: It is bloated, slow and inefficient.
So, why is OpenTelemetry bloated?
1. Corporate Co-Opting and Feature Creep
OpenTelemetry started as an open-source project with a clear and focused vision. But, like many open-source projects that gain attention, large enterprises saw an opportunity. They swooped in, contributing resources and features — but not necessarily out of altruism. These companies have their agendas, adding features to ensure their platforms, cloud services and proprietary tools are covered. As more enterprises pile on, OpenTelemetry must stretch more to accommodate everyone.
What started as a simple and elegant solution has now become a bloated beast. We ended up with a tool that has too many knobs and options and far too much complexity. Every company wants its special requirements baked into the core of the project, leading to feature creep that bogs down OpenTelemetry’s performance. The more features we add, the more we dilute the focus and efficiency of the tool.
I’ve been around enough to see it in the wild: The OTel collector is inefficient enough to demand an unreasonable amount of scaling, resources and capacity just to keep up with a moderately busy environment. This becomes unworkable and unmanageable at any reasonable scale!
2. Inefficiency: Trying to Be Everything for Everyone
As OpenTelemetry tries to cover everything, it excels at nothing. When I trace a request using OpenTelemetry, I witness a performance overhead that makes me wonder if it is worth it. While it is great at a high level — tracing application flow or showing where bottlenecks might be — it lacks the precision and depth you may need to dig into the nitty-gritty.
This inefficiency stems directly from OpenTelemetry’s broad mandate. It is not just trying to capture traces, it is also juggling metrics and logs, all while integrating with hundreds of other tools and systems. The overhead becomes more noticeable at scale, making it difficult to recommend for performance-critical environments. When you need precise, real-time insights, OpenTelemetry can feel like an anchor dragging behind the app.
3. Incomplete: Just ‘Good Enough’ for Enterprises to Cash In
Here is the kicker: Despite all the bloat and inefficiency, OpenTelemetry still feels incomplete. This is because large enterprises are more interested in creating an open-source product that is ‘good enough’ to get us in the door but not so good that it satisfies all our needs. If you want the full observability package (optimized performance, advanced analytics, smooth scaling), you are nudged toward their proprietary tools and services. It is a classic bait-and-switch.
In other words, OpenTelemetry is functional enough for basic observability, but when things get complex, you are often pushed into premium enterprise add-ons. It is not a coincidence — this is part of the corporate strategy. They contribute just enough to the open-source project to make it widely adopted but hold back the best features for their paid offerings.
eBPF: The Low-Level Tracing Tool That Just Does It Better
What is the alternative to OpenTelemetry’s bloated mediocrity? eBPF, the hero we didn’t know we needed. While OpenTelemetry operates at a high level, instrumenting applications and services, eBPF (extended Berkeley Packet Filter) works at the kernel level. It is the secret sauce for real-time, low-overhead observability, generating insights directly from the operating system. For instance, if you want to know exactly why your network latency is spiking or which process is causing a performance bottleneck, eBPF has you covered.
Here is why eBPF is the real MVP.
1. Lightweight and Fast
Unlike OpenTelemetry, which can feel bloated and sluggish, eBPF is lightweight and incredibly efficient. It doesn’t try to do everything; it only surfaces raw, real-time data from the kernel. eBPF allows you to observe and manipulate low-level system behavior, from I/O operations to network traffic, without significant overhead.
In environments where performance matters, eBPF is a go-to tool. It provides deep visibility into how the system is behaving without the burden of the extra layers and complexity that come with OpenTelemetry.
2. Granular, Deep Insights
eBPF provides visibility that OpenTelemetry can’t match. While OpenTelemetry is great for tracing requests across services, eBPF gives insights at the kernel level. If you need to know why your app is consuming too much CPU or why I/O operations are slowing down, eBPF allows you to see the exact system events that are causing the issue. It is the difference between listening to the weather forecast and being able to measure the air pressure, wind speed and humidity in the backyard.
While OpenTelemetry gives us a 10,000-foot view, eBPF lets us zoom in to a molecular level.
3. Not Co-Opted by Enterprises (Yet)
eBPF remains a relatively niche tool, which means it hasn’t yet been co-opted by enterprises in the way OpenTelemetry has. It is a powerful, kernel-level technology that has yet to be tainted by corporate agendas. It still does what it was designed to do: Provide deep, granular insight into the system with minimal overhead. It hasn’t been bloated with unnecessary features yet and is not trying to funnel us toward a paid solution.
OpenTelemetry Is Great for the Big Picture, But eBPF Unlocks the Depths
At the end of the day, OpenTelemetry and eBPF each have their place in the observability stack. OpenTelemetry, with its broad adoption and standardized approach, excels at providing high-level visibility across distributed systems. It is the go-to solution for distributed tracing, offering a cohesive view of how services interact in complex environments. For anyone trying to make sense of a multi-service architecture, OpenTelemetry provides a much-needed map.
However, when it comes to deeper, more granular insights — to understand what is happening within the system at the kernel level — eBPF is a must-have. Its strength lies in its ability to capture low-level data with minimal overhead, offering raw and detailed visibility into system performance and behavior. It is perfect for digging into performance bottlenecks, resource contention or network issues that might not be visible through high-level telemetry.
Rather than seeing these tools as competing solutions, consider them complementary. OpenTelemetry offers you a big-picture view, while eBPF allows you to zoom in when you need to troubleshoot with precision. If you are serious about observability, combining OpenTelemetry’s ease of use and standardized distributed tracing with eBPF’s power and depth will be the key to a comprehensive understanding of your system.
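To make "complementary" a bit more concrete, here is a minimal sketch of one way the two data sources could be joined: take application-level spans, take kernel-level events, and match them by process ID and time window. Everything in it is an assumption for illustration. The `Span` and `KernelEvent` shapes, the `correlate` and `slowest_event` helpers and the sample values are hypothetical and do not come from the OpenTelemetry SDK or from any eBPF tool. It also assumes both sides already share a clock; in practice kernel-side timestamps typically come from a monotonic clock while span timestamps are wall-clock, so a real pipeline would need to align them first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """Simplified application-level span (hypothetical shape, not an SDK type)."""
    trace_id: str
    name: str
    pid: int
    start_ns: int
    end_ns: int


@dataclass(frozen=True)
class KernelEvent:
    """Simplified kernel-level event, as a probe might report it (hypothetical shape)."""
    pid: int
    timestamp_ns: int
    kind: str          # e.g. "block_io" or "tcp_retransmit"
    latency_ns: int


def correlate(spans: List[Span], events: List[KernelEvent]) -> Dict[Span, List[KernelEvent]]:
    """Attach each kernel event to every span from the same process
    whose time window contains the event's timestamp."""
    matched: Dict[Span, List[KernelEvent]] = {span: [] for span in spans}
    for event in events:
        for span in spans:
            same_process = event.pid == span.pid
            inside_window = span.start_ns <= event.timestamp_ns <= span.end_ns
            if same_process and inside_window:
                matched[span].append(event)
    return matched


def slowest_event(events: List[KernelEvent]) -> Optional[KernelEvent]:
    """Return the event with the highest latency, or None if there are no events."""
    return max(events, key=lambda e: e.latency_ns, default=None)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    spans = [
        Span("t1", "GET /checkout", pid=4242, start_ns=1_000, end_ns=9_000),
        Span("t1", "charge-card", pid=5151, start_ns=2_000, end_ns=8_000),
    ]
    events = [
        KernelEvent(pid=4242, timestamp_ns=1_500, kind="block_io", latency_ns=300),
        KernelEvent(pid=4242, timestamp_ns=4_000, kind="block_io", latency_ns=5_200),
        KernelEvent(pid=5151, timestamp_ns=3_000, kind="tcp_retransmit", latency_ns=1_100),
        KernelEvent(pid=4242, timestamp_ns=9_500, kind="block_io", latency_ns=700),
    ]
    for span, hits in correlate(spans, events).items():
        worst = slowest_event(hits)
        print(span.name, len(hits), worst.kind if worst else "none")
```

The span tells you which request was slow; the matched kernel events hint at why. The nested loop is fine for a sketch, but at real volumes you would probably want to index events by PID and sort them by timestamp first.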
See you in the kernel space!