Propagate warnings to all workers in joblib #30380
Conversation
LGTM once the following are addressed.
@@ -0,0 +1,2 @@
- Filter warnings from the main process are propagated to joblib workers.
Suggested change:
- Filter warnings from the main process are propagated to joblib workers.
+ Warning filters from the main process are propagated to joblib workers.
sklearn/utils/parallel.py
Outdated
previous_filters = warnings.filters
warnings.filters = warning_filters
yield
warnings.filters = previous_filters
For the threading backend, this should effectively be a no-op, but since the warnings module is famously not thread-safe, I would rather test this case explicitly as part of our test suite.
I added a more explicit check here: 06aa344 (#30380)
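For illustration, a hypothetical version of such an explicit threading-backend check could look like the sketch below (this is not the content of commit 06aa344; the test and helper names are invented):

import warnings

from sklearn.utils.parallel import Parallel, delayed


def _probe_user_warning_is_error():
    # Return True when the currently active filters turn UserWarning into an error.
    try:
        warnings.warn("probe", UserWarning)
        return False
    except UserWarning:
        return True


def test_warning_filters_visible_with_threading_backend():
    # Filters set in the main thread should be in effect inside the tasks,
    # since all threads share the state of the warnings module.
    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        results = Parallel(n_jobs=2, backend="threading")(
            delayed(_probe_user_warning_is_error)() for _ in range(4)
        )
    assert all(results)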
sklearn/utils/parallel.py
Outdated
with (
    config_context(**config),
    _warning_filter_context(warning_filters=warning_filters),
):
Rather than creating our own context manager, maybe it would be slightly simpler to use warnings.catch_warnings and set warnings.filters = warning_filters, i.e. something like this:
with config_context(**config), warnings.catch_warnings():
    warnings.filters = warning_filters
    return self.function(*args, **kwargs)
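As an aside, the suggestion relies on the fact that warnings.catch_warnings() snapshots warnings.filters on entry and restores it on exit, so filters installed inside the block cannot leak into later tasks. A small standalone illustration (not part of the PR):

import warnings

before = list(warnings.filters)

with warnings.catch_warnings():
    # Simulate installing the filters captured in the main process.
    warnings.simplefilter("error", UserWarning)
    assert warnings.filters != before

# The previous filters are restored as soon as the block exits.
assert list(warnings.filters) == before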
sklearn/utils/parallel.py
Outdated
@@ -21,10 +22,10 @@
 _threadpool_controller = None


-def _with_config(delayed_func, config):
+def _with_config(delayed_func, config, warning_filters):
Maybe rename the function to _with_config_and_warning_filters to be explicit?
Maybe it is worth adding a test that makes sure that, for the loky backend, the filters do not leak into the workers as a side effect, something like:

def test_filter_warning_propagates_no_side_effect_with_loky_backend():
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=ConvergenceWarning)
        Parallel(n_jobs=2, backend="loky")(
            delayed(time.sleep)(0) for i in range(10)
        )

    # Make sure that inside the loky workers the warning filters have been reset
    # to their original value: using joblib directly should not turn
    # ConvergenceWarning into an error.
    joblib.Parallel(n_jobs=2, backend="loky")(
        joblib.delayed(warnings.warn)("Convergence warning", ConvergenceWarning)
        for _ in range(10)
    )
Thanks, LGTM!
I pushed a small tweak and set this to automerge.
Reference Issues/PRs
Closes #29294
What does this implement/fix? Explain your changes.
This PR propagates the warning filters from the main process to each joblib worker. The filters are applied in the same place where the global scikit-learn configuration is already propagated.
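For reference, a simplified sketch of this mechanism (illustrative only: it mirrors the structure of sklearn/utils/parallel.py, but the method name and details below are not necessarily the merged code):

import warnings

from sklearn import config_context


class _FuncWrapper:
    """Carry the caller's scikit-learn config and warning filters into the worker."""

    def __init__(self, function):
        self.function = function

    def with_config_and_warning_filters(self, config, warning_filters):
        # Called in the main process at dispatch time to capture both states.
        self.config = config
        self.warning_filters = warning_filters
        return self

    def __call__(self, *args, **kwargs):
        # Executed in the worker: re-install the config and the warning filters,
        # letting catch_warnings restore the worker's previous filters afterwards.
        with config_context(**self.config), warnings.catch_warnings():
            warnings.filters = self.warning_filters
            return self.function(*args, **kwargs)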