ENH remove _xfail_checks, pass directly to check runners, return structured output from check_estimator #30149
Conversation
# This doesn't seem to fail anymore?
# "check_methods_subset_invariance": ("fails for the decision_function method"),
by writing a test, I noticed this test doesn't actually fail. We could, in principle, now easily do this same test for all our estimators, but it would mean running the common tests twice, or switching from parametrize_with_checks to check_estimator, which I don't think is a good idea.
by writing a test, I noticed this test doesn't actually fail.
Maybe we can remove it from the dict then.
I don't think we should run the common tests twice or switch to check_estimator, but we could have a maintainer tool to help run this kind of analysis locally from time to time, no?
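Not an actual tool yet, just a rough sketch of what such a local script could look like, assuming the check_estimator keyword arguments described in this PR (expected_failed_checks, and an on_fail=None mode returning structured results), assuming the result records expose "check_name" and "status" fields, and assuming a _get_expected_failed_checks helper as the place where the per-estimator dicts end up living:

from sklearn.utils import all_estimators
from sklearn.utils.estimator_checks import check_estimator

# assumed location of the per-estimator expected-failure dicts after this PR
from sklearn.utils._test_common.instance_generator import _get_expected_failed_checks


def find_stale_expected_failures():
    """Print expected-failure entries whose check does not actually fail."""
    for name, Estimator in all_estimators():
        try:
            estimator = Estimator()
        except TypeError:
            # skip estimators that require constructor arguments
            continue
        expected = _get_expected_failed_checks(estimator)
        if not expected:
            continue
        # on_fail=None: collect a structured report instead of raising
        results = check_estimator(estimator, on_fail=None)
        failed = {r["check_name"] for r in results if r["status"] == "failed"}
        for check_name, reason in expected.items():
            if check_name not in failed:
                print(f"{name}: {check_name} ({reason!r}) did not fail")


if __name__ == "__main__":
    find_stale_expected_failures()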
should_be_marked, reason = _should_be_skipped_or_marked(estimator, check)
if not should_be_marked:
def _maybe_mark(
we had two functions: one marked checks as xfail with pytest, and the other wrapped the check with something that would simply raise SkipTest. Those two are now merged into this function.
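For readers of the thread, a minimal sketch of the merged idea (names and details here are illustrative assumptions, not the actual scikit-learn code): a single helper that either attaches a pytest xfail mark when called from parametrize_with_checks, or wraps the check so it raises SkipTest when used outside pytest:

from functools import wraps
from unittest import SkipTest


def _maybe_mark_sketch(estimator, check, expected_failed_checks, mark, pytest=None):
    """Return (estimator, check), possibly xfail-marked or wrapped to skip."""
    check_name = getattr(check, "__name__", str(check))
    reason = (expected_failed_checks or {}).get(check_name)
    if reason is None or mark is None:
        return estimator, check

    if mark == "xfail":
        # pytest path: the check still runs, pytest reports it as an expected failure
        return pytest.param(estimator, check, marks=pytest.mark.xfail(reason=reason))

    # non-pytest path: replace the check with one that raises SkipTest
    @wraps(check)
    def wrapped(*args, **kwargs):
        raise SkipTest(
            f"Skipping {check_name} for {type(estimator).__name__}: {reason}"
        )

    return estimator, wrapped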
@@ -0,0 +1,8 @@
- :func:`~utils.estimator_checks.check_estimator` now accepts the list of tests that are
This really doesn't feel right as a single changelog entry @lesteve
😓 🤷 😓
If we think this is crucial, here are the work-arounds I can think of:
- do nested bullet points for each aspect
- use multiple bullet points and mention :pr:`30149` in each of the bullet points except the last one, since towncrier will add it
To double-check the generated rst file:
towncrier build --version 1.6.0 --draft
We also need to add this to our doc build, so that we can see the rendered version of the changelog on the CI
@adam2392 @Charlie-XIAO @ogrisel @glemaitre this is mostly moving _xfail_checks out of the estimator tags. @thomasjpfan @Micky774 I ended up diverging from the initial proposal a tiny bit, since I feel this information is best suited outside the estimator itself, as it's very test specific. It also has the benefit of giving us a nice overview of what's failing in a single spot.
Ping since this is a big PR and we kinda need it for the release.
Most of the code is removing the tags._xfail_checks, so it wasn't that bad to go through!
I had a few questions, but mostly it LGTM as a minor refactoring of the testing framework.
This is a partial review with a first pass of feedback. I need to pause for lunch :)
I added a script to check for that. I think in a separate PR we can try to find the ones that actually fail on the whole CI. Right now, on my system / env, this is what the script says:
More feedback:
@ogrisel @adam2392 @glemaitre this is now green with the new API design.
For my understanding of the changes, is the following what you would do in a unit-test suite?
@adam2392 not really, you pass a callable to parametrize_with_checks:

def get_expected_failed_checks(estimator):
    if estimator.__class__.__name__ == "A":
        return {
            "check_sample_weight_equivalence": "sample weights don't need to be checked"
        }
    return {}


@parametrize_with_checks(
    [
        A(random_state=12345, n_estimators=10),
        B(random_state=12345, n_estimators=10),
        C(random_state=12345, n_estimators=10),
    ],
    expected_failed_checks=get_expected_failed_checks,
)
def test_sklearn_compatible_estimator(estimator, check):
    check(estimator)
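In other words, the callable receives each estimator instance and returns a dict mapping check names to the reason those checks are expected to fail, so the expectations live in the test suite rather than on the estimator itself.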
It looks good to me apart from a couple of nitpicks.
I'm wondering where we discussed the need for a callback in the past? I can see that this is useful but I don't recall the discussion.
@@ -14,6 +14,7 @@
"UndefinedMetricWarning",
"PositiveSpectrumWarning",
"UnsetMetadataPassedError",
"EstimatorCheckFailedWarning",
Do we want to add this exception to the API doc?
To me it has more of a place in the developer API, whenever we have one, doesn't it?
Added to our API doc; I had forgotten to add it. I don't know how to add it to a separate section with the current setup, but I don't think that's necessary. We would need proper developer API documentation / examples / a user guide anyway.
I'm wondering where we discussed the need for a callback in the past? I can see that this is useful but I don't recall the discussion.
@glemaitre somehow during the development of this PR, since I ended up allowing check_estimator to not actually fail, it seemed nice to have the callback. However, the design changed after talking to @ogrisel, and he had the idea of taking the callback into account even further. But we settled on what you see here as a middle ground, and people who want more control can use the test generator and have full control over what happens anyway.
LGTM!
sklearn/utils/estimator_checks.py

This return value is only present when all tests pass, or the ones failing are expected to fail.
This is no longer true.
LGTM! It's a big PR, so I can let someone else take a look? Or we can merge if you're happy with it :p
The failing test is something that we merged this morning that was using the previous …
Let me backport now to not forget it
ENH remove _xfail_checks, pass directly to check runners, return structured output from check_estimator (#30149)
Fixes #29951, Fixes #30133
This moves _xfail_checks out of the estimator tags:
- _xfail_checks is removed from the tags entirely, since we haven't even released the new tags anyway.
- check_estimator now doesn't raise a SkipTestWarning for xfailed checks, since they're not skipped. It also doesn't skip them at all: it runs all the checks.
- check_estimator now returns a report of xfail and skipped tests if it passes, and if it fails, the thrown exception has that same report included in it (see the sketch right after this list).
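A rough usage sketch of that report (the on_fail=None value and the "check_name" / "status" field names are assumptions about the final API; check the docstrings for the exact shape):

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_estimator

# on_fail=None: don't raise on failing checks, just return the structured report
results = check_estimator(
    LogisticRegression(),
    expected_failed_checks={},  # mapping of check name -> reason for known failures
    on_fail=None,
)
for result in results:
    # each record describes one check and how it ended
    print(result["check_name"], result["status"])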