DOC readability and clarity on permutation_test_score in userguide and example #30351
Conversation
This doesn't need a changelog entry.
# because the permutation always destroys any feature-label dependency present.
# The score obtained on the randomized data without label permutation in this case
# though, is very poor. This results in a large p-value, confirming that there was no
# feature-label dependency in the randomized data before labels were permuted.
Here I'm trying not to use the word "original", because it can be confused with the iris data that is called "original data" in the other parts of the example.
Yeah, good point. Maybe we could also amend the legends in the plots, which both say 'original'; we could change them to 'original iris data' and 'original random data'?
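A tiny sketch of that legend tweak (hypothetical figure; the axvline positions and axis names are placeholders, not the example's real values):

```python
import matplotlib.pyplot as plt

fig, (ax_iris, ax_rand) = plt.subplots(1, 2, figsize=(8, 3))
# ...histograms of the permutation scores would be drawn on each axis here...
ax_iris.axvline(0.97, ls="--", c="r", label="Score on original iris data")
ax_rand.axvline(0.33, ls="--", c="r", label="Score on original random data")
for ax in (ax_iris, ax_rand):
    ax.legend()
plt.show()
```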
# distribution for the null hypothesis which states there is no dependency
# between the features and labels. An empirical p-value is then calculated as
# the percentage of permutations for which the score obtained is greater
# than the score obtained using the original data.

Suggested change:

# the proportion of permutations for which the score obtained by the model trained on
Substituting "percentage" with "proportion", because the range of possible values is from 1/(n_permutations + 1) to 1.0, not 0.0 to 1.0.
This is on my to-do, I'll try and get to it soon!
Sorry this took me so long.
Good pick-up on the user guide's use of 'classifier' vs 'estimator' (it does sound like permutation_test_score can be used with any estimator that you can score, including regressors and clusterers).
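For instance, a minimal sketch (synthetic data; Ridge chosen just for illustration) showing that permutation_test_score accepts any estimator with a score, not only classifiers:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import permutation_test_score

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

# Default scoring for a regressor is R^2; the permutation test works the same.
score, perm_scores, pvalue = permutation_test_score(
    Ridge(), X, y, n_permutations=100, random_state=0
)
print(f"R^2 on true targets: {score:.2f}, p-value: {pvalue:.3f}")
```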
Some nitpicks
doc/modules/cross_validation.rst
between features and targets (there is no systematic relationship between these two and
any observed patterns are likely due to random chance) or because the estimator was not
Suggested change:

between features and targets (i.e., there is no systematic relationship between these two and
any observed patterns are likely due to random chance) or because the estimator was not
This sentence is really long but I am not sure how to improve it. I love bullet points (because I find walls of text difficult), but I'm not sure they would suit here, WDYT?
Yes, let me do an experiment involving bullet points and also highlighting the logic with the "and both" vs. negation with "either one of".
What do you think?
However, I can't manage to render this without a greyish box. Relying on your expertise here! 🙏
doc/modules/cross_validation.rst
latter case, using a more appropriate estimator that is able to utilize the structure in
the data would result in a lower p-value.
I think the key thing users may want to know here is how to tell the difference between "no structure" and "the model was not able to use the structure". I don't think there is a guaranteed way to tell, but one way would be to try a more suitable model and see if you get a lower p-value. This is essentially what we say here, but maybe we could make it more explicit?
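A hedged sketch of that diagnostic (XOR-style synthetic data; models are illustrative): a linear kernel cannot use the real structure here and should give a high p-value, while an RBF kernel should give a low one.

```python
import numpy as np
from sklearn.model_selection import permutation_test_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # labels depend on features, but not linearly

for kernel in ("linear", "rbf"):
    score, _, pvalue = permutation_test_score(
        SVC(kernel=kernel), X, y, n_permutations=100, random_state=0
    )
    # Expect roughly chance-level score and a large p-value for "linear",
    # and a high score with a small p-value for "rbf".
    print(f"{kernel}: score={score:.2f}, p-value={pvalue:.3f}")
```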
doc/modules/cross_validation.rst
the data would result in a lower p-value.

Cross-validation provides information about how well an estimator generalizes
by estimating the range of its expected errors. However, an
Not sure about the terminology: any 'metric' can be used with permutation_test_score, right? Does the term 'error' only cover a subset of all metrics, e.g., ones where lower is better?
Oh true, cross_validation and also permutation_test_score should only take metrics where "higher is better". I'll change "errors" into "scores" then. I don't think it was previously wrong, though.
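A small sketch of that convention (synthetic regression data): error metrics enter as negated scorers, so "higher is better" holds and the test's comparison against permutation scores stays valid.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import permutation_test_score

X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=0)

score, _, pvalue = permutation_test_score(
    Ridge(), X, y, scoring="neg_mean_squared_error", n_permutations=100, random_state=0
)
# score is a negated MSE (closer to 0 is better); permuting y should make it
# more negative, which keeps the p-value computation consistent.
print(f"neg MSE: {score:.1f}, p-value: {pvalue:.3f}")
```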
Co-authored-by: Lucy Liu <[email protected]>
Thanks for your review, @lucyleeow!
I have addressed your suggestions. Do you want to have another look?
doc/modules/cross_validation.rst
- a lack of dependency between features and targets (i.e., there is no systematic
  relationship and any observed patterns are likely due to random chance)
- **or** because the estimator was not able to use the dependency in the data (for
  instance because it underfit).
I think a blank line at the top may fix it? If it's still not working, the bullets may not need to be indented?
Doing both in combination worked. 🎆
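For reference, a reST sketch of the structure that rendered cleanly: a blank line before the list and bullets starting at column 0 (the introductory sentence here is paraphrased; only the bullets are from the diff above):

```rst
This can be because of:

- a lack of dependency between features and targets (i.e., there is no
  systematic relationship and any observed patterns are likely due to
  random chance)
- **or** because the estimator was not able to use the dependency in the
  data (for instance because it underfit).
```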
Thank you, @lucyleeow! I have implemented your suggestions.
Do you think this PR is now ready to be merged?
Thank you! This looks great!
Co-authored-by: Lucy Liu <[email protected]>
Co-authored-by: Adrin Jalali <[email protected]>
Co-authored-by: Adrin Jalali <[email protected]>
Thanks, @adrinjalali!
I have submitted your suggestions and answered one with a new idea on how to describe the pvalue param.
LGTM. Thanks @StefanieSenger, everything looks good.
I just pushed two tiny nitpicks. I'll mark this PR as auto-merge.
Oh nice, @glemaitre! Then let me quickly push another little thing; I had not seen that Adrin had approved a few days before. Edit: Done, can you enable auto-merge again?
…cikit-learn into permutation_test_scores
Head branch was pushed to by a user without write access
Here, I suggest some improvements in clarity in the documentation for permutation_test_score. Changes include:

- permutation_test_score can be used on more than only classifiers
- n_jobs after sentence on brute force procedure
- permutation_test_score in example: what does the result infer for the null hypothesis? (see the sketch below)

@lucyleeow, since you have worked on these docs before, would you like to have a look?
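A hedged sketch of that last point (iris and a linear SVC, as in the example being edited): a small p-value is evidence against the null hypothesis that features and labels are independent.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import permutation_test_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
score, perm_scores, pvalue = permutation_test_score(
    SVC(kernel="linear"), X, y, n_permutations=100, random_state=0
)
# On iris one expects a high score and the smallest attainable p-value,
# 1 / (n_permutations + 1); on random data the p-value would be large.
print(f"score={score:.2f}, p-value={pvalue:.3f}")
```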