ENH remove _xfail_checks, pass directly to check runners, return structured output from check_estimator #30149

Merged: 25 commits merged into scikit-learn:main on Nov 8, 2024

Conversation

Member

@adrinjalali adrinjalali commented Oct 25, 2024

Fixes #29951, Fixes #30133

This moves _xfail_checks out of the estimator tags.

  • Deprecation: We don't need to deprecate _xfail_checks since we haven't even released the new tags anyway.
  • Behavior change: check_estimator no longer raises a SkipTestWarning for xfailed checks, since they're not skipped; it doesn't skip them at all and instead runs all the checks.
  • Structured output: check_estimator now returns a report of the xfailed and skipped checks if it passes; if it fails, the raised exception includes that same report (see the sketch after this list).
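
A rough sketch of how the reworked check_estimator could be used, based on the description above. The keyword name expected_failed_checks and the shape of the returned report (a list of dict entries with check_name / status keys) are taken from the discussion in this PR and should be treated as assumptions, not settled API:

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_estimator

# Checks listed in `expected_failed_checks` are still run, but their failures
# are reported as expected ("xfail") instead of raising.
report = check_estimator(
    LogisticRegression(),
    expected_failed_checks={
        # check name -> reason; the reason string is a placeholder for illustration
        "check_sample_weight_equivalence": "known limitation, tracked separately",
    },
)

# On success the structured report lists every check with its status; on
# failure, the raised exception carries the same report.
for entry in report:
    print(entry["check_name"], entry["status"])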


github-actions bot commented Oct 25, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 239abb6.

Comment on lines 929 to 930
# This doesn't seem to fail anymore?
# "check_methods_subset_invariance": ("fails for the decision_function method"),
Member Author

@adrinjalali adrinjalali Oct 25, 2024

By writing a test, I noticed this test doesn't actually fail. We could, in principle, now easily do this same test for all our estimators, but it would mean running the common tests twice, or switching from parametrize_with_checks to check_estimator, which I don't think is a good idea.

Member

by writing a test, I noticed this test doesn't actually fail.

Maybe we can remove it from the dict then.

I don't think we should run the common tests twice or switch to check_estimator but we could have a maintainer tool to help run this kind of analysis locally from time to time, no?


should_be_marked, reason = _should_be_skipped_or_marked(estimator, check)
if not should_be_marked:
def _maybe_mark(
Member Author

We had two functions: one marked checks as xfail with pytest, the other wrapped the check in something that would simply raise SkipTest. Those two are now merged into this function.
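
For illustration, here is a sketch of what such a merged helper could look like. This is not the PR's actual code: the names _maybe_mark and _should_be_skipped_or_marked come from the diff above, but the exact signatures and the stub below are assumptions:

from functools import wraps
from unittest import SkipTest

import pytest


def _should_be_skipped_or_marked(estimator, check, expected_failed_checks):
    # Illustrative stub: look up the check's name in the expected failures.
    check_name = getattr(check, "__name__", None) or check.func.__name__
    expected_failed_checks = expected_failed_checks or {}
    if check_name in expected_failed_checks:
        return True, expected_failed_checks[check_name]
    return False, ""


def _maybe_mark(estimator, check, expected_failed_checks=None, mark=None):
    """Return the (estimator, check) pair, marked as xfail or wrapped to skip.

    mark="xfail" covers the pytest path (parametrize_with_checks), mark="skip"
    covers the non-pytest path (check_estimator); mark=None leaves the check as is.
    """
    should_be_marked, reason = _should_be_skipped_or_marked(
        estimator, check, expected_failed_checks
    )
    if not should_be_marked or mark is None:
        return estimator, check

    if mark == "xfail":
        # pytest path: keep the check as-is but mark the parameter set as xfail
        return pytest.param(estimator, check, marks=pytest.mark.xfail(reason=reason))

    # non-pytest path: wrap the check so it raises SkipTest with the reason
    @wraps(check)
    def wrapped(*args, **kwargs):
        raise SkipTest(reason)

    return estimator, wrapped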

@@ -0,0 +1,8 @@
- :func:`~utils.estimator_checks.check_estimator` now accepts the list of tests that are
Member Author

This really doesn't feel right as a single changelog entry @lesteve

Member

@lesteve lesteve Oct 25, 2024

😓 🤷 😓

If we think this is crucial, here are the workarounds I can think of:

  • do nested bullet points for each aspect
  • use multiple bullet points and mention :pr:`30149` in each bullet point except the last one, since towncrier will add it there

To double-check the generated rst file:

towncrier build --version 1.6.0 --draft

Member Author

We also need to add this to our doc build, so that we can see the rendered version of the changelog on the CI

@adrinjalali
Member Author

@adam2392 @Charlie-XIAO @ogrisel @glemaitre this is mostly moving _xfail_checks around, and fixing the infrastructure for that.

@thomasjpfan @Micky774 this ended up diverging from the initial proposal a tiny bit: I feel this information is best suited outside the estimator itself, since it's very test specific. It also has the benefit of giving us a nice overview of what's failing in a single spot.

@adrinjalali adrinjalali added this to the 1.6 milestone Oct 25, 2024
@adrinjalali adrinjalali added the Developer API Third party developer API related label Oct 25, 2024
@glemaitre glemaitre self-requested a review October 29, 2024 09:42
@adrinjalali
Member Author

Ping since this is a big PR and we kinda need it for the release.

@adam2392 adam2392 self-requested a review October 29, 2024 21:36
Member

@adam2392 adam2392 left a comment

Most of the code is removing the tags._xfail_checks, so it wasn't that bad to go through!

I had a few questions, but mostly it LGTM as a minor refactoring of the testing framework.

Member

@ogrisel ogrisel left a comment

This is a partial review with a first pass of feedback. I need to pause for lunch :)

@adrinjalali
Member Author

@ogrisel (#30149 (comment))

I added a script to check for that. I think in a separate PR we can try to find the ones that are actual failures across the whole CI (a rough sketch of the idea follows the output below). Right now, on my system / env, this is what the script says:

/home/adrin/Projects/gh/me/scikit-learn/sklearn/utils/_test_common/instance_generator.py:657: SkipTestWarning: Can't instantiate estimator FrozenEstimator
  warnings.warn(msg, SkipTestWarning)
GridSearchCV did not fail expected failures:
  check_supervised_y_2d
GridSearchCV did not fail expected failures:
  check_requires_y_none
HalvingGridSearchCV did not fail expected failures:
  check_estimators_nan_inf
  check_fit2d_1feature
  check_classifiers_one_label_sample_weights
HalvingGridSearchCV did not fail expected failures:
  check_classifiers_one_label_sample_weights
  check_estimators_nan_inf
  check_requires_y_none
  check_supervised_y_2d
  check_fit2d_1feature
HalvingGridSearchCV did not fail expected failures:
  check_requires_y_none
  check_estimators_nan_inf
  check_fit2d_1feature
  check_classifiers_one_label_sample_weights
HalvingGridSearchCV did not fail expected failures:
  check_classifiers_one_label_sample_weights
  check_estimators_nan_inf
  check_requires_y_none
  check_supervised_y_2d
  check_fit2d_1feature
HalvingRandomSearchCV did not fail expected failures:
  check_estimators_nan_inf
  check_fit2d_1feature
  check_classifiers_one_label_sample_weights
HalvingRandomSearchCV did not fail expected failures:
  check_classifiers_one_label_sample_weights
  check_estimators_nan_inf
  check_requires_y_none
  check_supervised_y_2d
  check_fit2d_1feature
HalvingRandomSearchCV did not fail expected failures:
  check_requires_y_none
  check_estimators_nan_inf
  check_fit2d_1feature
  check_classifiers_one_label_sample_weights
HalvingRandomSearchCV did not fail expected failures:
  check_classifiers_one_label_sample_weights
  check_estimators_nan_inf
  check_requires_y_none
  check_supervised_y_2d
  check_fit2d_1feature
KNeighborsClassifier did not fail expected failures:
  check_dataframe_column_names_consistency
KNeighborsRegressor did not fail expected failures:
  check_dataframe_column_names_consistency
WARNING: class label 0 specified in weight is not found
LinearSVC did not fail expected failures:
  check_non_transformer_estimators_n_iter
RandomizedSearchCV did not fail expected failures:
  check_supervised_y_2d
RandomizedSearchCV did not fail expected failures:
  check_requires_y_none
/home/adrin/Projects/gh/me/scikit-learn/sklearn/utils/_test_common/instance_generator.py:657: SkipTestWarning: Can't instantiate estimator SparseCoder
  warnings.warn(msg, SkipTestWarning)
SpectralBiclustering did not fail expected failures:
  check_dont_overwrite_parameters
  check_fit2d_predict1d
  check_estimator_sparse_array
  check_methods_subset_invariance
  check_estimator_sparse_matrix
TunedThresholdClassifierCV did not fail expected failures:
  check_sample_weight_equivalence

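For reference, a rough sketch of the kind of script this is about, written against the reworked public API from this PR. It is not the script added in the PR; the on_fail keyword, the report keys (check_name, status, expected_to_fail) and the hand-picked expected failures below are assumptions for illustration:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.utils.estimator_checks import check_estimator

# Hand-picked instances and (assumed) expected failures, for illustration only.
expected = {
    "KNeighborsClassifier": {
        "check_dataframe_column_names_consistency": "placeholder reason",
    },
    "LinearSVC": {
        "check_non_transformer_estimators_n_iter": "placeholder reason",
    },
}

for estimator in [KNeighborsClassifier(), LinearSVC()]:
    name = estimator.__class__.__name__
    report = check_estimator(
        estimator,
        expected_failed_checks=expected.get(name, {}),
        on_fail=None,  # assumption: collect results instead of raising
    )
    # Expected failures that actually passed, mirroring the output above.
    not_failed = [
        entry["check_name"]
        for entry in report
        if entry["expected_to_fail"] and entry["status"] == "passed"
    ]
    if not_failed:
        print(f"{name} did not fail expected failures:")
        for check_name in not_failed:
            print(f"  {check_name}")
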
Member

@ogrisel ogrisel left a comment

More feedback:

@adrinjalali
Member Author

@ogrisel @adam2392 @glemaitre this is now green with the new API design.

@adam2392
Member

adam2392 commented Nov 7, 2024

For my understanding of the changes, is the following what you would do in a unit-test suite?

@parametrize_with_checks(
    [
        A(random_state=12345, n_estimators=10),
        B(random_state=12345, n_estimators=10),
        C(random_state=12345, n_estimators=10),
    ]
)
def test_sklearn_compatible_estimator(estimator, check):
    expected_failed_checks = {}
    if check.func.__name__ in [
        # sample weights do not necessarily imply a sample is not used in clustering
        "check_sample_weight_equivalence",
    ]:
        expected_failed_checks = {
            "check_sample_weight_equivalence": "sample weights do not need to be checked"
        }

    check(estimator, expected_failed_checks=expected_failed_checks)

@adrinjalali
Member Author

@adam2392 not really, you pass a callable to parametrize_with_checks, as this PR is doing in our test_common.py. Here is a modified version of your code:

def get_expected_failed_checks(estimator):
    if estimator.__class__.__name__ == "A":
        return {"check_sample_weight_equivalence": "sample weights not need to be checked"}
    return {}

@parametrize_with_checks(
    [
        A(random_state=12345, n_estimators=10),
        B(random_state=12345, n_estimators=10),
        C(random_state=12345, n_estimators=10),
    ],
    expected_failed_checks=get_expected_failed_checks,
)
def test_sklearn_compatible_estimator(estimator, check):
    check(estimator)

@adam2392 adam2392 self-requested a review November 7, 2024 18:09
Member

@glemaitre glemaitre left a comment

It looks good to me apart from a couple of nitpicks.

I'm wondering where we discussed the need for a callback in the past? I can see that this is useful but I don't recall the discussion.

@@ -14,6 +14,7 @@
"UndefinedMetricWarning",
"PositiveSpectrumWarning",
"UnsetMetadataPassedError",
"EstimatorCheckFailedWarning",
Member

Do we want to add this exception to the API doc?
To me it belongs more in the developer API, whenever we have one, doesn't it?

Member Author

Added to our API doc; I had forgotten to add it. I don't know how to add it to a separate section with the current setup, but I don't think that's necessary. We would need a proper developer API documentation / examples / user guide anyway.

Member Author

@adrinjalali adrinjalali left a comment

I'm wondering where we discussed the need for a callback in the past? I can see that this is useful but I don't recall the discussion.

@glemaitre it came up during the development of this PR: since I ended up allowing check_estimator to not actually fail, it seemed nice to have the callback. The design then changed after talking to @ogrisel, who had the idea of taking the callback even further. We settled on what you see here as a middle ground; people who want more control can use the test generator and have full control over what happens anyway.
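
For illustration, a minimal sketch of the callback idea discussed here, assuming check_estimator grows a callback keyword along these lines; the exact way the callback is invoked (argument names and order) is not pinned down here, so the sketch accepts anything it is given:

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_estimator

collected = []


def on_check_done(*args, **kwargs):
    # Record whatever information the runner passes for each finished check,
    # without assuming the exact signature.
    collected.append((args, kwargs))


# Assumption: a `callback` keyword on check_estimator, as discussed in this thread.
check_estimator(LogisticRegression(), callback=on_check_done)
print(f"{len(collected)} check results collected")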

Member

@ogrisel ogrisel left a comment

LGTM!

Comment on lines 787 to 789
This return value is only present when all tests pass, or the ones failing
are expected to fail.

Member

This is no longer true.

Suggested change
This return value is only present when all tests pass, or the ones failing
are expected to fail.

Member

@adam2392 adam2392 left a comment

LGTM! It's a big PR, so should I let someone else take a look? Or we can merge if you're happy with it :p

@glemaitre
Member

The failing test comes from something we merged this morning that was using the previous _xfail_checks. The test needs to be amended.

@adrinjalali adrinjalali enabled auto-merge (squash) November 8, 2024 15:29
@adrinjalali adrinjalali merged commit 9012b78 into scikit-learn:main Nov 8, 2024
30 checks passed
@glemaitre
Member

Let me backport it now so we don't forget.

Successfully merging this pull request may close these issues.

  • check_estimator to return structured info
  • RFC Expose xfail_checks with a more flexible API