
feat(opentelemetry): Stop looking at propagation context for span creation #14481

Merged: 5 commits merged into develop on Dec 3, 2024

Conversation

mydea (Member) commented on Nov 26, 2024:

This PR changes the behavior of the OTEL-based Node SDK to ignore the propagation context when starting spans.

Previously, when you called startSpan and there was no incoming trace, we would ensure that the new span has the trace ID + span ID from the propagation context.

This has a few problems:

  1. Multiple parallel root spans will continue the same virtual trace, instead of having separate traces.
  2. This is really invalid in OTEL, as we have to provide a span ID and cannot really tell it to use a specific trace ID out of the box. Because of this, we had to add a bunch of special handling to ensure we can differentiate real and fake parent span IDs properly.

This PR fixes this by simply not looking at the propagation context anymore when starting spans. For TWP and error marking, the propagation context is still used as before; only new spans behave differently.

I also added docs explaining how trace propagation in node works now:

[Diagram: node-sdk-trace-propagation-3]
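To make the behavioral change concrete, here is a minimal sketch (assuming `@sentry/node` v8+ with tracing enabled; the span names are illustrative and not taken from this PR's tests):

```ts
import * as Sentry from '@sentry/node';

// Two root spans started in parallel, with no incoming trace.
// Before this PR, both would pick up the trace ID from the scope's
// propagation context and thus continue the same virtual trace.
// After this PR, each root span starts its own trace.
const traceIdA = Sentry.startSpan({ name: 'task-a' }, span => span.spanContext().traceId);
const traceIdB = Sentry.startSpan({ name: 'task-b' }, span => span.spanContext().traceId);

// With this change: traceIdA !== traceIdB (separate traces).
// The propagation context is still used for TWP and for marking errors
// captured outside of any active span, as before.
```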

mydea self-assigned this on Nov 26, 2024
github-actions bot (Contributor) commented on Nov 26, 2024:

size-limit report 📦

| Path | Size | % Change | Change |
| --- | --- | --- | --- |
| @sentry/browser | 23.12 KB | +0.12% | +27 B 🔺 |
| @sentry/browser - with treeshaking flags | 21.85 KB | +0.07% | +15 B 🔺 |
| @sentry/browser (incl. Tracing) | 35.51 KB | +0.03% | +8 B 🔺 |
| @sentry/browser (incl. Tracing, Replay) | 72.4 KB | +0.02% | +10 B 🔺 |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 62.88 KB | +0.02% | +11 B 🔺 |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 76.71 KB | +0.02% | +9 B 🔺 |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 89.17 KB | +0.01% | +8 B 🔺 |
| @sentry/browser (incl. Feedback) | 39.86 KB | +0.05% | +17 B 🔺 |
| @sentry/browser (incl. sendFeedback) | 27.74 KB | +0.07% | +19 B 🔺 |
| @sentry/browser (incl. FeedbackAsync) | 32.55 KB | +0.07% | +20 B 🔺 |
| @sentry/react | 25.81 KB | +0.06% | +15 B 🔺 |
| @sentry/react (incl. Tracing) | 38.41 KB | +0.02% | +6 B 🔺 |
| @sentry/vue | 27.26 KB | +0.05% | +12 B 🔺 |
| @sentry/vue (incl. Tracing) | 37.31 KB | +0.03% | +10 B 🔺 |
| @sentry/svelte | 23.27 KB | +0.08% | +19 B 🔺 |
| CDN Bundle | 24.32 KB | +0.02% | +3 B 🔺 |
| CDN Bundle (incl. Tracing) | 37.21 KB | +0.04% | +13 B 🔺 |
| CDN Bundle (incl. Tracing, Replay) | 72.09 KB | +0.02% | +14 B 🔺 |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 77.43 KB | +0.02% | +15 B 🔺 |
| CDN Bundle - uncompressed | 71.45 KB | +0.01% | +2 B 🔺 |
| CDN Bundle (incl. Tracing) - uncompressed | 110.48 KB | -0.03% | -26 B 🔽 |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 223.55 KB | -0.02% | -26 B 🔽 |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 236.77 KB | -0.02% | -26 B 🔽 |
| @sentry/nextjs (client) | 38.72 KB | +0.02% | +7 B 🔺 |
| @sentry/sveltekit (client) | 36.06 KB | +0.02% | +5 B 🔺 |
| @sentry/node | 134.83 KB | -0.19% | -260 B 🔽 |
| @sentry/node - without tracing | 96.84 KB | -0.31% | -300 B 🔽 |
| @sentry/aws-serverless | 109.16 KB | -0.25% | -275 B 🔽 |

codecov bot commented on Nov 26, 2024:

❌ 6 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 657 | 6 | 651 | 31 |

Top 3 failed tests by shortest run time:

- errors.test.ts:74:5 "Sends graphql exception to Sentry" (0.05s run time)
- errors.test.ts:76:5 "Sends unexpected exception to Sentry if thrown in module that was registered before Sentry" (0.102s run time)
- errors.test.ts:40:5 "Sends unexpected exception to Sentry if thrown in module with local filter" (0.149s run time)

mydea force-pushed the fn/ignorePropagationContextOtelSpans branch from 9e88097 to 4b096ed on November 26, 2024, 14:57
mydea added a commit that referenced this pull request on Nov 27, 2024:

Noticed this while working on
#14481.

The way we try-catched the Astro server request code led to the http.server span not being attached to errors correctly: we had a try-catch block _outside_ of the `startSpan` call, where we sent caught errors to Sentry. But any error caught this way would not have an active span (because by the time the `catch` branch triggers, `startSpan` is already over), and thus the http.server span would not be attached to the error. By moving this try-catch inside of the `startSpan` call, we can correctly assign the span to errors. I also added some tests for this; there is still a problem in there which the tests show, which I'll look at afterwards (and/or it may get fixed by #14481).
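A rough sketch of the fix described in that commit message (illustrative only, not the actual Astro middleware code): moving the try-catch inside the `startSpan` callback keeps the http.server span active when the error is captured.

```ts
import * as Sentry from '@sentry/node';

async function handleRequest(render: () => Promise<Response>): Promise<Response> {
  return Sentry.startSpan({ name: 'http.server', op: 'http.server' }, async () => {
    try {
      return await render();
    } catch (error) {
      // Captured inside the callback, so the http.server span is still the
      // active span and gets attached to the error's trace context.
      Sentry.captureException(error);
      throw error;
    }
  });
}
```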
mydea force-pushed the fn/ignorePropagationContextOtelSpans branch from 4b096ed to 0362bd0 on November 28, 2024, 09:53
// Also ensure sampling decision is correctly inferred
// In core, we use `spanIsSampled`, which just looks at the trace flags
// but in OTEL, we use slightly more complex logic to be able to differentiate between unsampled and deferred sampling
if (hasTracingEnabled()) {
mydea (Member Author) commented:
This is not exactly what the name implies it does, but I think it's OK to add it here too. Otherwise, we'd have to export another, new method and use that in the Node SDK initOtel, which does not seem worth it here 🤔

Member:
I think we should rename hasTracingEnabled anyway, which we can do during v9
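For context, a minimal sketch of the distinction the code comment above refers to (an illustration, not the SDK's actual implementation):

```ts
import { TraceFlags, type SpanContext } from '@opentelemetry/api';

// What the core-level check boils down to: only the sampled bit of the
// trace flags is consulted.
function isSampledByFlags(ctx: SpanContext): boolean {
  return (ctx.traceFlags & TraceFlags.SAMPLED) === TraceFlags.SAMPLED;
}

// With flags alone, a context with TraceFlags.NONE is ambiguous: it could
// mean "upstream explicitly decided not to sample" or "no sampling decision
// has been made yet" (deferred). The OTEL layer needs additional state to
// tell these two cases apart, which is why the check is more involved there.
```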

mydea force-pushed the fn/ignorePropagationContextOtelSpans branch 4 times, most recently from fee464e to dae27c9, on November 29, 2024, 08:57
@@ -26,6 +26,7 @@ test('Sends exception to Sentry', async ({ baseURL }) => {
expect(errorEvent.contexts?.trace).toEqual({
trace_id: expect.stringMatching(/[a-f0-9]{32}/),
span_id: expect.stringMatching(/[a-f0-9]{16}/),
parent_span_id: expect.stringMatching(/[a-f0-9]{16}/),
mydea (Member Author) commented:
I do not really know why this was not here before, but IMHO it was incorrect? These errors should have a parent_span_id (same for the other Nest E2E tests), as you'll usually have the http.server span and inside it some route handler span or similar, so the active span at the time of the error should usually have a parent. So I'd say this fixes previously incorrect behavior 🤔
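As an illustration of the span structure this assertion relies on (hypothetical names, not the actual Nest test app):

```ts
import * as Sentry from '@sentry/node';

Sentry.startSpan({ name: 'GET /test-exception', op: 'http.server' }, () => {
  Sentry.startSpan({ name: 'route handler', op: 'function' }, () => {
    // The error is captured while the nested handler span is active, so its
    // contexts.trace contains span_id (the handler span) and parent_span_id
    // (pointing at the http.server span).
    Sentry.captureException(new Error('boom'));
  });
});
```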

// But our default node-fetch spans are not emitted
expect(scopeSpans.length).toEqual(2);
expect(scopeSpans.length).toEqual(3);
mydea (Member Author) commented:
This was also incorrect before: spans from startSpan() did not end up here, although they should have. This was probably because of incorrect propagation in this scenario.

@@ -26,6 +26,6 @@ describe('awsIntegration', () => {
});

test('should auto-instrument aws-sdk v2 package.', done => {
createRunner(__dirname, 'scenario.js').expect({ transaction: EXPECTED_TRANSCATION }).start(done);
createRunner(__dirname, 'scenario.js').ignore('event').expect({ transaction: EXPECTED_TRANSCATION }).start(done);
mydea (Member Author) commented:
Unrelated, but I saw this flaking every now and then, so I decided to just ignore events here.

mydea force-pushed the fn/ignorePropagationContextOtelSpans branch from b7f5861 to c0e8566 on November 29, 2024, 12:11
mydea marked this pull request as ready for review on November 29, 2024, 12:11
mydea force-pushed the fn/ignorePropagationContextOtelSpans branch from a2cbf94 to dee24ea on November 29, 2024, 13:18
mydea force-pushed the fn/ignorePropagationContextOtelSpans branch from dee24ea to f61a993 on December 2, 2024, 08:05
mydea merged commit c8e81d5 into develop on Dec 3, 2024
153 checks passed
mydea deleted the fn/ignorePropagationContextOtelSpans branch on December 3, 2024, 08:02