Add per-filter spans for distributed tracing #37339
Comments
An alternative I have used to troubleshoot potential intra-Envoy latency issues is the |
To get closer parity with the Apigee Trace tool, it might be worth considering adding this per-filter data to the Tap filter. Ideally, for each filter, we could see both:
|
Nothing is free: more detailed tracing means more overhead in the key path and a more complex code base, and I don't think we can accept a further decrease in Envoy's performance. In most cases users don't need this, because Envoy's filters are generally very fast. And if there is a performance problem, Of course, I don't mean the feature is senseless; I only mean it may not be worth the overhead and investment. I think the best alternative would be the The |
Thank you for the reply, I appreciate the feedback and information on this feature suggestion. If more detailed tracing will add more overhead/latency to the key data plane path, then I agree that it probably is not worth adding (unless it was in some sort of "debug" binary separate from the standard Envoy binary). I am not familiar with the For |
For the standard Envoy filters this is totally reasonable, but my concern is for custom filters built in-house by Envoy operators or by Envoy-based vendors (Solo.io, Tetrate, etc.). It is probably still more correct to use |
re to I mean the Linux perf tool 🤣 |
Title: Add per-filter spans for distributed tracing
Description:
Certain API Gateway products enable debugging transactions as they are processed by the data plane at a granular policy or filter level. One example is Google's Apigee Edge which supports the Trace tool. The Apigee Trace tool allows for troubleshooting client-to-Apigee, Apigee-to-target, and intra-Apigee (between Apigee policies) transaction flows.
In Envoy, the distributed tracing filter(s) allow for the two former scenarios, but I do not see support for tracing intra-Envoy flows such as a transaction being processed by a series of L4/L7 filters. As of now, I see that Envoy emits one span for a transaction passing through the data plane to an upstream cluster and potentially more spans for remote service calls as part of an HTTP filter such as External Authorization or Rate Limiting.
It would be great if Envoy could emit more granular (internal) spans for each filter processed as part of a single transaction. This would enable easier troubleshooting in scenarios such as determining which synchronous filter(s) are adding latency to the overall transaction. For the most granular data, this might require each filter to implement its own tracing logic. However, it could be useful to generalize some tracing behavior managed by the Envoy worker thread across all L4/L7 filters such that we can at least get processing duration data for each filter.
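To make the generalized idea concrete, here is a minimal sketch of a filter chain runner that records one child span per filter, so that only processing duration is captured and no filter needs its own tracing logic. This is not Envoy code (Envoy is written in C++), and every name here (`Span`, `run_filter_chain`, the filter callables, the injectable `clock`) is a hypothetical stand-in chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Span:
    """Hypothetical span record: a name, start/end timestamps, and children."""
    name: str
    start_ns: int
    end_ns: Optional[int] = None
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ns(self) -> int:
        if self.end_ns is None:
            raise ValueError(f"span {self.name!r} has not finished")
        return self.end_ns - self.start_ns


# A filter takes the request and returns True to continue the chain,
# or False to stop iteration (for example, a local reply was sent).
Filter = Callable[[dict], bool]


def run_filter_chain(
    filters: List[Tuple[str, Filter]],
    request: dict,
    clock: Callable[[], int] = time.monotonic_ns,
) -> Span:
    """Run filters in order, wrapping each call in its own child span."""
    root = Span("filter_chain", clock())
    for name, filt in filters:
        child = Span(name, clock())
        try:
            keep_going = filt(request)
        finally:
            # Close the child span even if the filter raises.
            child.end_ns = clock()
            root.children.append(child)
        if not keep_going:
            break
    root.end_ns = clock()
    return root


if __name__ == "__main__":
    def slow_auth(req: dict) -> bool:
        time.sleep(0.01)  # simulate a synchronous filter adding latency
        req["authed"] = True
        return True

    def router(req: dict) -> bool:
        return True

    trace = run_filter_chain([("slow_auth", slow_auth), ("router", router)], {})
    for span in trace.children:
        print(f"{span.name}: {span.duration_ns / 1e6:.3f} ms")
```

The sketch also shows where the cost comes from: two clock reads and one allocation per filter per request, on the request path. A real implementation would presumably need to make this opt-in or sampled.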