Skip to content
Draft
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Expand motivation with answers to common questions
Added two new sections to address questions raised at core team sprint:

1. "Why Exception Groups Need Timestamps" - Explains that while exception
   groups are conceptually unrelated, in practice they have important
   temporal relationships for debugging causality, performance analysis,
   and correlation with external observability tools.

2. "Why Not Use .add_note() When Catching?" - Details six key drawbacks
   of using add_note() instead: not all exceptions are caught, timing
   accuracy issues, inconsistent application, performance overhead,
   complexity burden, and loss of original timing information.

Key insight: When an exception occurs is intrinsic, immutable information
that should be captured at the source, not added later by consumers.

🤖 Generated with Claude Code (https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
  • Loading branch information
gpshead and claude committed Sep 18, 2025
commit 47d0b7878caa81940a03abe46c6db31d2486d5f1
65 changes: 65 additions & 0 deletions peps/pep-08XX.rst
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,71 @@ Without timestamps, we would only know that four services failed, but not
their temporal relationship, making root cause analysis significantly harder.


Why Exception Groups Need Timestamps
-------------------------------------

While exception groups are conceptually "unrelated" exceptions that happen to be
raised together, in practice they often have important temporal relationships:

1. **Causality isn't always explicit**: When multiple services fail in sequence,
one failure might trigger cascading failures in seemingly unrelated services.
Without timestamps, these cascade patterns are invisible. For example, a
database connection pool exhaustion might cause multiple "unrelated" query
failures across different services.

2. **Concurrent doesn't mean simultaneous**: Tasks in an exception group may
start concurrently but fail at very different times. A service that fails
after 100ms versus one that fails after 5 seconds tells a different story
about what went wrong - the first might be a validation error, the second
a timeout.

3. **Debugging distributed systems**: In microservice architectures, exception
groups often collect failures from multiple remote services. Timestamps allow
correlation with external observability tools (logs, metrics, traces) that
are essential for understanding the full picture.

4. **Performance analysis**: Even for "unrelated" exceptions, knowing their
temporal distribution helps identify performance bottlenecks and timeout
configurations that need adjustment.


Why Not Use ``.add_note()`` When Catching?
--------------------------------------------

A common question is why we don't simply use :pep:`678`'s ``.add_note()`` to add
timestamps when exceptions are caught and grouped. This approach has several
significant drawbacks:

1. **Not all exceptions are caught**: Exceptions that propagate to the top level
or are logged directly never get the opportunity to have notes added. The
timestamp of when an error occurred is lost forever.

2. **Timing accuracy**: Adding a note when catching introduces variable delay.
The timestamp would reflect when the exception was caught and processed, not
when it actually occurred. In async code with complex exception handling,
this delay can be significant and misleading.

3. **Inconsistent application**: Relying on exception handlers to add timestamps
means some exceptions get timestamps and others don't, depending on code
paths. This inconsistency makes debugging harder, not easier.

4. **Performance overhead**: Creating note strings for every caught exception
adds overhead even when timestamps aren't being displayed. With the proposed
approach, formatting only happens when tracebacks are rendered.

5. **Complexity burden**: Every exception handler that wants timing information
would need to remember to add notes. This is error-prone and adds boilerplate
to exception handling code.

6. **Lost original timing**: By the time an exception is caught, the original
failure moment is lost. In retry loops or complex error handling, the catch
point might be seconds or minutes after the actual error.

The key insight is that **when** an exception is created is intrinsic, immutable
information about that exception - just like its type and message. This information
should be captured at the source, not added later by consumers.


Rationale
=========

Expand Down