feat(opentelemetry): Stop looking at propagation context for span creation #14481
Conversation
Noticed this while working on #14481. The way we try-catched the Astro server request code led to the `http.server` span not being attached to errors correctly: we had a try-catch block _outside_ of the `startSpan` call, where we sent caught errors to Sentry. But any error caught this way would not have an active span (because by the time the `catch` part triggers, `startSpan` is already over), and thus the `http.server` span would not be attached to the error. By moving this try-catch inside of the `startSpan` call, we can correctly assign the span to errors. I also tried to add some tests for this - there is still a problem in there which the tests show, which I'll look at afterwards (and/or it may get fixed by #14481)
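A minimal sketch of the difference, with illustrative handler names (this is not the actual Astro middleware code):

```ts
import * as Sentry from '@sentry/node';

// Before: the try-catch wraps `startSpan`, so when the catch block runs the
// span has already ended and there is no active span to attach to the error.
async function handleRequestBefore(next: () => Promise<unknown>): Promise<unknown> {
  try {
    return await Sentry.startSpan({ name: 'http.server' }, () => next());
  } catch (error) {
    Sentry.captureException(error); // no active span here
    throw error;
  }
}

// After: the try-catch lives inside the `startSpan` callback, so the
// `http.server` span is still active when the error is captured.
function handleRequestAfter(next: () => Promise<unknown>): Promise<unknown> {
  return Sentry.startSpan({ name: 'http.server' }, async () => {
    try {
      return await next();
    } catch (error) {
      Sentry.captureException(error); // error is linked to the active span
      throw error;
    }
  });
}
```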
// Also ensure sampling decision is correctly inferred
// In core, we use `spanIsSampled`, which just looks at the trace flags
// but in OTEL, we use slightly more complex logic to be able to differentiate between unsampled and deferred sampling
if (hasTracingEnabled()) {
This is kind of not exactly what the name implies this does, but I think it's OK to add this here too. Otherwise, we'd have to export another, new method here and use it in the Node SDK `initOtel`, which seems not really worth it 🤔
I think we should rename `hasTracingEnabled` anyway, which we can do during v9.
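For context, a rough sketch of what "differentiating between unsampled and deferred sampling" means in OTEL terms; the trace state key is made up for this sketch, and the real SDK logic is more involved:

```ts
import { TraceFlags, type Span } from '@opentelemetry/api';

// An unset sampled flag on its own is ambiguous: it can mean "a sampler
// explicitly decided not to sample" or "no sampling decision was made yet".
function getSamplingDecision(span: Span): 'sampled' | 'unsampled' | 'deferred' {
  const ctx = span.spanContext();

  if (ctx.traceFlags & TraceFlags.SAMPLED) {
    return 'sampled';
  }

  // One way to disambiguate is to record the explicit "not sampled" decision
  // in the trace state (hypothetical key name).
  return ctx.traceState?.get('sentry.not_recording') ? 'unsampled' : 'deferred';
}
```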
@@ -26,6 +26,7 @@ test('Sends exception to Sentry', async ({ baseURL }) => {
expect(errorEvent.contexts?.trace).toEqual({
trace_id: expect.stringMatching(/[a-f0-9]{32}/),
span_id: expect.stringMatching(/[a-f0-9]{16}/),
parent_span_id: expect.stringMatching(/[a-f0-9]{16}/),
I do not really know why this was not here before, but IMHO it was incorrect? These errors (same for the other nest E2E tests) are usually captured inside nested spans - you'll usually have the `http.server` span and then inside it some e.g. route handler span, so the active span at the time of the error should usually have a parent_span_id. So I'd say this fixes incorrect (?) behavior 🤔
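A rough sketch of why a parent_span_id is expected here (the nesting is illustrative, not the actual E2E app code):

```ts
import * as Sentry from '@sentry/node';

Sentry.startSpan({ name: 'GET /test', op: 'http.server' }, () => {
  // e.g. a route handler span nested inside the http.server span
  Sentry.startSpan({ name: 'route handler' }, () => {
    // The active span here is the route handler span, whose parent is the
    // http.server span. The captured error's trace context is derived from
    // the active span, so it should contain trace_id, span_id *and* parent_span_id.
    Sentry.captureException(new Error('boom'));
  });
});
```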
// But our default node-fetch spans are not emitted
expect(scopeSpans.length).toEqual(2);
expect(scopeSpans.length).toEqual(3);
This was also incorrect before: spans from `startSpan()` did not end up here, although they should have. This was probably because of incorrect propagation in this scenario.
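For reference, a rough sketch of the OTLP payload shape the test asserts against - spans are grouped per instrumentation scope, and manually started spans should contribute their own scope entry (types simplified, helper name hypothetical):

```ts
// Simplified shape of an OTLP trace export payload.
interface ScopeSpans {
  scope: { name: string };
  spans: unknown[];
}

interface ResourceSpans {
  scopeSpans: ScopeSpans[];
}

// Hypothetical helper: list the instrumentation scopes that exported spans.
function getScopeNames(resourceSpans: ResourceSpans[]): string[] {
  return resourceSpans.flatMap(rs => rs.scopeSpans.map(ss => ss.scope.name));
}

// With propagation fixed, spans from `Sentry.startSpan()` show up under their
// own scope in addition to the instrumentation scopes, hence 3 instead of 2.
```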
@@ -26,6 +26,6 @@ describe('awsIntegration', () => {
});

test('should auto-instrument aws-sdk v2 package.', done => {
createRunner(__dirname, 'scenario.js').expect({ transaction: EXPECTED_TRANSCATION }).start(done);
createRunner(__dirname, 'scenario.js').ignore('event').expect({ transaction: EXPECTED_TRANSCATION }).start(done);
Unrelated, but I saw this flaking every now and then, so I decided to just ignore events here.
This PR changes the behavior of the OTEL-based Node SDK to ignore the propagation context when starting spans.
Previously, when you called `startSpan` and there was no incoming trace, we would ensure that the new span has the trace ID + span ID from the propagation context. This has a few problems:
This PR fixes this by simply not looking at the propagation context anymore when starting spans. For TWP and error marking, the propagation context is still used as before; only the creation of new spans behaves differently.
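A rough sketch of the resulting behavior (not actual SDK internals):

```ts
import * as Sentry from '@sentry/node';

// No incoming trace headers here, so there is no continued trace.
Sentry.startSpan({ name: 'some-operation' }, span => {
  const { traceId } = span.spanContext();
  // Previously: this trace ID was taken from the scope's propagation context.
  // Now: the span simply gets a fresh trace ID from OTEL.
  console.log(traceId);
});

// Unchanged: errors captured while no span is active still get their trace
// data (TWP / error marking) from the propagation context, as before.
Sentry.captureException(new Error('no active span'));
```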
I also added docs explaining how trace propagation in Node works now: