DOC readability and clarity on permutation_test_score in userguide and example #30351

Merged

merged 13 commits into scikit-learn:main from StefanieSenger:permutation_test_scores on Jan 13, 2025

Conversation

Contributor

@StefanieSenger StefanieSenger commented Nov 26, 2024

Here, I suggest some clarity improvements in the documentation for permutation_test_score.

Changes include:

  • clarify in the User Guide that permutation_test_score can be used on more than just classifiers
  • mention n_jobs after the sentence on the brute-force procedure
  • improve readability of some difficult-to-understand sentences
  • in the example, clearly distinguish between the original data and the data without permutations (the term "original data" was used in two different contexts for different things)
  • add conclusions on the results from permutation_test_score in the example: what does the result imply for the null hypothesis?
  • call the p_value a proportion instead of a percentage
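
For context, here is a minimal sketch of the function this PR documents (the dataset and classifier choices are illustrative only, not taken from the docs):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import permutation_test_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear", random_state=0)

# Returns the score on the unpermuted labels, the scores on permuted labels,
# and the empirical p-value for the null hypothesis that features and labels
# are independent. n_jobs parallelizes the (brute-force) permutation fits.
score, permutation_scores, pvalue = permutation_test_score(
    clf, X, y, n_permutations=100, n_jobs=-1, random_state=0
)
print(f"score: {score:.3f}, p-value: {pvalue:.3f}")
```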

@lucyleeow, since you have worked on these docs before, would you like to have a look?


github-actions bot commented Nov 26, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: b9f9ba6.

@StefanieSenger
Contributor Author

This doesn't need a changelog entry.

Comment on lines 105 to 108
# because the permutation always destroys any feature-label dependency present.
# The score obtained on the randomized data without label permutation in this case
# though, is very poor. This results in a large p-value, confirming that there was no
# feature-label dependency in the randomized data before labels were permuted.
Contributor Author


Here I'm trying not to use the word "original", because it can be confused with the iris data that is called "original data" in other parts of the example.

Member


Yeah, good point. Maybe we could also amend the legends in the plots, which both say 'original'; we could change them to 'original iris data' and 'original random data'?
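
Something like this is what I mean for the first plot (a rough sketch with stand-in values, since I don't have the example's variables at hand):

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins for the example's outputs of permutation_test_score:
rng = np.random.default_rng(0)
permutation_scores = rng.normal(loc=0.33, scale=0.05, size=100)
score = 0.97

fig, ax = plt.subplots()
ax.hist(permutation_scores, bins=20, density=True)
# Legend label amended to name the dataset instead of just saying 'original':
ax.axvline(score, linestyle="--", color="r", label="Score on original iris data")
ax.legend()
ax.set_xlabel("Accuracy score")
plt.show()
```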

# distribution for the null hypothesis which states there is no dependency
# between the features and labels. An empirical p-value is then calculated as
# the percentage of permutations for which the score obtained is greater
# than the score obtained using the original data.
# the proportion of permutations for which the score obtained by the model trained on
Contributor Author

@StefanieSenger StefanieSenger Nov 26, 2024


Substituting "percentage" with "proportion", because the range of possible values is from 1/(n_permutations + 1) to 1.0, not from 0.0 to 1.0.
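
To spell that out: the empirical p-value counts the unpermuted score itself in both numerator and denominator. A sketch with made-up scores, following the formula used by permutation_test_score (Ojala & Garriga, 2010):

```python
import numpy as np

def empirical_pvalue(score, permutation_scores):
    # The +1 terms count the unpermuted score itself, so even if no
    # permutation reaches `score`, the result is 1 / (n_permutations + 1),
    # never 0.0.
    permutation_scores = np.asarray(permutation_scores)
    return (np.sum(permutation_scores >= score) + 1) / (len(permutation_scores) + 1)

print(empirical_pvalue(0.97, [0.30, 0.35, 0.33, 0.31]))  # 1/5 = 0.2, the minimum here
```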

@lucyleeow
Member

This is on my to-do, I'll try and get to it soon!

Member

@lucyleeow lucyleeow left a comment


Sorry this took me so long.

Good pick up on the user guide's use of 'classifier' vs 'estimator' (it does sound like permutation_test_score can be used with any estimator that you can score on, including regressors and clusterers).

Some nitpicks:

Comment on lines 967 to 968
between features and targets (there is no systematic relationship between these two and
any observed patterns are likely due to random chance) or because the estimator was not
Member


Suggested change
between features and targets (there is no systematic relationship between these two and
any observed patterns are likely due to random chance) or because the estimator was not
between features and targets (i.e., there is no systematic relationship between these two and
any observed patterns are likely due to random chance) or because the estimator was not

This sentence is really long but I am not sure how to improve it. I love bullet points (because I find walls of text difficult to read) but I'm not sure they would suit here, WDYT?

Contributor Author


Yes, let me do an experiment involving bullet points, and also highlighting the logic with "and both" vs. negation with "either one of".

What do you think?

However, I can't manage to render this without a greyish box. Relying on your expertise here! 🙏

Comment on lines 970 to 971
latter case, using a more appropriate estimator that is able to utilize the structure in
the data, would result in a lower p-value.
Member


I think the key thing users may want to know here is how to tell the difference between 'no structure' and 'the model was not able to use the structure'. I don't think there is a guaranteed way to tell, but it sounds like one way would be to try a more suitable model and see if you get a lower p-value. This is essentially what we say here, but maybe we could make it more explicit?
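
E.g. something like this (a sketch on made-up nonlinear data; the model choices are only illustrative):

```python
import numpy as np
from sklearn.model_selection import permutation_test_score
from sklearn.svm import SVC

# Made-up data whose labels depend nonlinearly on the features, so a linear
# model cannot use the structure while an RBF kernel can:
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(int)

for clf in (SVC(kernel="linear"), SVC(kernel="rbf")):
    _, _, pvalue = permutation_test_score(clf, X, y, n_permutations=100, random_state=0)
    print(clf.kernel, pvalue)
# A much lower p-value for the RBF model would suggest the data does have
# structure that the linear model simply could not exploit.
```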

the data, would result in a lower p-value.

Cross-validation provides information about how well an estimator generalizes
by estimating the range of its expected errors. However, an
Member


Not sure about the terminology: any 'metric' can be used with permutation_test_score, right? Does the term 'error' only cover a subset of all metrics, e.g., ones where lower is better?

Contributor Author


Oh true, cross_validation and also permutation_test_score should only take metrics where "higher is better". I'll change "errors" into "scores" then. I don't think it was previously wrong, though.
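
(For regressors this works through the negated error scorers, which are "higher is better" by convention; a hypothetical sketch:)

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import permutation_test_score

X, y = load_diabetes(return_X_y=True)
# "neg_mean_squared_error" negates the error so that larger values mean
# better fits, which is what permutation_test_score needs to compare scores.
score, _, pvalue = permutation_test_score(
    Ridge(), X, y, scoring="neg_mean_squared_error", n_permutations=100, random_state=0
)
print(score, pvalue)
```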


Contributor Author

@StefanieSenger StefanieSenger left a comment


Thanks for your review, @lucyleeow!

I have addressed your suggestions. Do you want to have another look?



Comment on lines 967 to 970
- a lack of dependency between features and targets (i.e., there is no systematic
relationship and any observed patterns are likely due to random chance)
- **or** because the estimator was not able to use the dependency in the data (for
instance because it underfit).
Member


I think a blank line at the top may fix it? If it's still not working, the bullets may not need to be indented?

Contributor Author


Doing both in combination worked. 🎆

Contributor Author

@StefanieSenger StefanieSenger left a comment


Thank you, @lucyleeow! I have implemented your suggestions.
Do you think this PR is now ready to be merged?


Member

@lucyleeow lucyleeow left a comment


Thank you! This looks great!

@lucyleeow lucyleeow added the Waiting for Second Reviewer ("First reviewer is done, need a second one!") label Dec 18, 2024
Contributor Author

@StefanieSenger StefanieSenger left a comment


Thanks, @adrinjalali!
I have applied your suggestions and answered one with a new idea on how to describe the pvalue param.

@adrinjalali adrinjalali removed the Waiting for Second Reviewer ("First reviewer is done, need a second one!") label Jan 9, 2025
@glemaitre glemaitre self-requested a review January 13, 2025 18:48
Member

@glemaitre glemaitre left a comment


LGTM. Thanks @StefanieSenger, everything looks good.

I just pushed two tiny nitpicks. I'll mark this PR as auto-merge.

@glemaitre glemaitre enabled auto-merge (squash) January 13, 2025 18:59
@StefanieSenger
Contributor Author

StefanieSenger commented Jan 13, 2025

Oh nice, @glemaitre, then let me quickly push another little thing that I had not seen, which Adrin had approved a few days before.

Edit: Done, can you enable auto-merge again?

auto-merge was automatically disabled January 13, 2025 19:26

Head branch was pushed to by a user without write access

@glemaitre glemaitre enabled auto-merge (squash) January 13, 2025 19:47
@glemaitre glemaitre merged commit 36ad7b3 into scikit-learn:main Jan 13, 2025
29 checks passed
@StefanieSenger StefanieSenger deleted the permutation_test_scores branch January 13, 2025 20:20