feat: Extend ErrorTracker with error grouping #1014
Merged
TODO: Message matching and replacing with wildcards; extra args (full stack and full message); tests
TODO: Add get_most_popular_errors; add extended configuration parameters
TODO: Add extended configuration parameters
Do not port the option for full trace, as it is not used in the JS version.
janbuchar approved these changes on Mar 4, 2025
janbuchar (Collaborator) left a comment:
Looks alright, just some minor nits.
# No similar message found. Create new group.
self._errors[error_group_stack_trace][error_group_name].update([error_group_message])

def _get_traceback_text(self, error: Exception) -> str | None:
Collaborator
This is just the file and line number where the exception was thrown, isn't it? The name would suggest a complete stacktrace. I'd try and come up with something more fitting.
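For context, a minimal sketch of what "just the file and line number where the exception was thrown" could look like. The helper name `get_throw_location` is hypothetical and is not the PR's actual implementation; it only uses the standard library `traceback.extract_tb`.

```python
from __future__ import annotations

import traceback


def get_throw_location(error: Exception) -> str | None:
    """Return 'file:line' of the innermost frame, or None if there is no traceback.

    Hypothetical helper for illustration, not the method from the PR.
    """
    frames = traceback.extract_tb(error.__traceback__)
    if not frames:
        # An exception that was never raised has no traceback.
        return None
    innermost = frames[-1]  # last entry is the frame where the exception was raised
    return f"{innermost.filename}:{innermost.lineno}"


def fail() -> None:
    raise ValueError("boom")


try:
    fail()
except ValueError as exc:
    location = get_throw_location(exc)
```

A name along the lines of "throw location" may describe this more accurately than "traceback text", since only one frame is reported.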
Comment on lines +11 to +12
ErrorMessageGroups = Counter[GroupName]
ErrorTypeGroups = dict[GroupName, ErrorMessageGroups]
Collaborator
I think that inlining these two definitions would actually make things more readable.
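As a rough illustration of the inlined form, assuming group names are plain strings and the nesting is stack trace group, then error type, then message counts (inferred from the `self._errors[...][...].update(...)` line quoted above). The variable name `errors` and the use of `defaultdict` are assumptions made for this sketch only.

```python
from __future__ import annotations

from collections import Counter, defaultdict

# Inlined annotation: stack trace group -> error type group -> message counts.
# The defaultdict nesting is an assumption that keeps the sketch short.
errors: defaultdict[str, defaultdict[str, Counter[str]]] = defaultdict(
    lambda: defaultdict(Counter)
)

# Mirrors the update call quoted in the review: count one more occurrence.
errors["crawler.py:42"]["ValueError"].update(["Invalid value ***"])
errors["crawler.py:42"]["ValueError"].update(["Invalid value ***"])
errors["crawler.py:42"]["ValueError"].update(["Unexpected token"])
```

With the annotation inlined, the full nesting is visible at the point of declaration without looking up two aliases.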
def _create_generic_message(message_1: str | None, message_2: str | None) -> str | None:
    """Create a generic error message from two messages, if they are similar enough.

    Different parts of similar messages are replaced by `_`.
Collaborator
I'm kinda surprised by the choice of `_` here - I think it might get mixed up with legit underscores in Python error messages.
Collaborator
Author
Maybe `***` will be better in the Python context.
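To make the placeholder discussion concrete, here is a minimal sketch of merging two similar messages with `***` as the wildcard. The word-by-word comparison, the `min_similarity` threshold, and the public function name are assumptions for illustration; the PR's actual matching logic may differ.

```python
from __future__ import annotations

PLACEHOLDER = "***"  # assumed placeholder, as suggested in the thread above


def create_generic_message(
    message_1: str | None,
    message_2: str | None,
    min_similarity: float = 0.5,  # hypothetical threshold, not from the PR
) -> str | None:
    """Merge two messages word by word, or return None if they differ too much."""
    if message_1 is None or message_2 is None:
        return None
    words_1 = message_1.split(" ")
    words_2 = message_2.split(" ")
    if len(words_1) != len(words_2):
        # This sketch only compares messages with the same number of words.
        return None
    matches = sum(w1 == w2 for w1, w2 in zip(words_1, words_2))
    if matches / len(words_1) < min_similarity:
        return None
    # Keep identical words, replace differing ones with the placeholder.
    return " ".join(
        w1 if w1 == w2 else PLACEHOLDER for w1, w2 in zip(words_1, words_2)
    )


generic = create_generic_message("Page 1 not found", "Page 2 not found")
# generic == "Page *** not found"
```

Unlike `_`, the `***` token is unlikely to collide with identifiers such as `my_variable` that commonly appear in Python error messages.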
janbuchar approved these changes on Mar 5, 2025
Description
Adds error grouping to ErrorTracker. ErrorTracker can be configured to use different grouping options.
Based on the JS implementation: https://github.com/apify/crawlee/blob/master/packages/core/src/crawlers/error_tracker.ts#L286
Differences from the JS implementation:
- showErrorCode option not migrated: it does not make sense in Python.
- showFullStack option not migrated: it is not used in JS, so there is no point migrating it.
- getMostPopularErrors migrated under a different name: get_most_common_errors.
Issues
Partially implements: #151 (ErrorSnapshotter will be in a separate PR)
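A rough sketch of how a `get_most_common_errors`-style lookup could work over nested groups, assuming the structure is stack trace group, then error type, then a `Counter` of messages. This standalone function, its signature, and the `"type: message"` label format are hypothetical and are not the method added by this PR; it relies only on the standard `Counter.most_common`.

```python
from __future__ import annotations

from collections import Counter


def get_most_common_errors(
    errors: dict[str, dict[str, Counter[str]]], n: int = 3
) -> list[tuple[str, int]]:
    """Return the n most frequent (label, count) pairs across all groups."""
    totals: Counter[str] = Counter()
    for type_groups in errors.values():
        for error_name, message_groups in type_groups.items():
            for message, count in message_groups.items():
                # Sum counts of the same error type and message across locations.
                totals[f"{error_name}: {message}"] += count
    return totals.most_common(n)


example = {
    "a.py:1": {
        "ValueError": Counter({"bad value ***": 3}),
        "KeyError": Counter({"missing key": 1}),
    },
    "b.py:9": {"ValueError": Counter({"bad value ***": 2})},
}
top = get_most_common_errors(example, n=2)
# top == [("ValueError: bad value ***", 5), ("KeyError: missing key", 1)]
```

The name follows `Counter.most_common`, which may be why the Python port prefers "common" over the JS "popular".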