
DOC amend doc and examples to show how to switch classes in PR and ROC curves #30311

Open · ThorbenMaa wants to merge 5 commits into main
Conversation

ThorbenMaa

Reference Issues/PRs

Fixes issue raised in #26758

What does this implement/fix? Explain your changes.

Clarified the documentation and added an example for the case where it is necessary to swap the positive and negative classes when calculating roc_curve and precision_recall_curve.
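For illustration, the class-swapping pattern under discussion looks roughly like this (a minimal sketch with made-up scores, not taken from the PR diff):

import numpy as np
from sklearn.metrics import roc_curve

# Made-up scores where larger values indicate class 1 (the default positive class).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Default: class 1 is treated as the positive class.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# To treat class 0 as the positive class instead, flip the sign of the
# scores and pass pos_label=0.
fpr0, tpr0, thresholds0 = roc_curve(y_true, -y_score, pos_label=0)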


github-actions bot commented Nov 20, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: a6b5aa8.

@glemaitre
Member

@ThorbenMaa Can you make sure to respect the linter and provide a proper title for the PR? It will help in getting reviewer attention.

@ThorbenMaa ThorbenMaa changed the title update docu according to https://github.com/scikit-learn/scikit-learn/issues/26758#top Update docu and examples to show how to switch classes in precision recall and RO curves Nov 21, 2024
@ThorbenMaa
Author

Sorry, now it should be good to go.

cc: @lucyleeow

Member

@lucyleeow lucyleeow left a comment


Thanks for taking the time to make this valuable addition, and for your patience!

I think we could add a sentence about this in the user guide; in the precision recall section and ROC section. These docs are in the model_evaluation.rst file.

I think we could also (in the user guide, or in an example if we add one) spell out exactly what we mean by the prediction values being positively correlated (i.e., bigger values mean the 'positive' class is more likely) and negatively correlated - just to be friendly to newcomers. Maybe even give an example of the type of computational biology problem where this often occurs?
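To make the two correlation cases concrete (an illustrative sketch, not proposed doc text):

import numpy as np

y_true = np.array([0, 0, 1, 1])

# Positively correlated: a bigger score means the positive class (1) is more
# likely, e.g. the class-1 column of predict_proba.
scores_pos = np.array([0.1, 0.4, 0.35, 0.8])

# Negatively correlated: a bigger score means the negative class (0) is more
# likely, e.g. a p-value or E-value from a computational-biology screen,
# where small values indicate a "hit".
scores_neg = np.array([0.9, 0.6, 0.65, 0.2])

# Negating such scores restores the convention the curve functions expect.
scores_fixed = -scores_neg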

Comment on lines 947 to 949
Note that this means that the positive class must be positively
correlated with score. See examples for a case where the negative
class is positively correlated with score.
Member


I think this can go in the main docstring, maybe at the end ~line 904?

I think it's important information and the user shouldn't need to scroll to the 'Returns' section to see it.

Author


I added it to the "Binary classification" chapter, as this is where things are defined. Do you think this makes sense?

Comment on lines 974 to 975
>>> # positive class (y_true = 1) is positively correlated with y_score and negative
>>> # class is negatively correlated with y_score
Member


I don't think this is needed, if we've already said in the main docstring that the 'default' assumption is that the positive class is positively correlated with y_score

Comment on lines 989 to 997
>>> # negative class (y_true = 0) is positively correlated with y_score and positive
>>> # class is negatively correlated with y_score
>>> import numpy as np
>>> from sklearn.metrics import precision_recall_curve
>>> y_true = np.array([0, 0, 1, 1])
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> y_scores_adj = y_scores * (-1)
>>> precision, recall, thresholds = precision_recall_curve(
... y_true, y_scores_adj)
Member


I am wondering if it would be worthwhile adding an example about how to deal with predicted probabilities and decision function values that are negatively correlated (instead of adding it here). We would then add a link to this example in the main docstring.

I think it would be nice to spell everything out more completely, which I don't think you can do in these example sections as they are meant to be succinct.

WDYT @glemaitre ?

Author


I'll leave this one open for now. But I'm happy to come up with an example. Can you tell me the exact place to put it?

Member


Let's see what @glemaitre says. I am now wondering if explaining it in the user guide would be better...

@@ -1088,10 +1106,16 @@ def roc_curve(
fpr : ndarray of shape (>2,)
Increasing false positive rates such that element i is the false
positive rate of predictions with score >= `thresholds[i]`.
Note that this means that the positive class must be positively
correlated with score.
Member


Same as above, it could go in the main docstring.

@glemaitre glemaitre changed the title Update docu and examples to show how to switch classes in precision recall and RO curves DOC amend doc and examples to show how to switch classes in PR and ROC curves Nov 27, 2024
@glemaitre
Member

A couple of thoughts here:

  • PrecisionRecallCurveDisplay has the advantage of calling _get_response_values, which applies the expected transform and passes the pos_label. So we should encourage using this class (see the sketch after this list).
  • when it comes to precision_recall_curve:
    • we need to amend the y_score docstring: we need to emphasize the "positive class" part for the probabilities, and mention that when passing decision_function values we expect the sign of the decision function to be aligned with the positive class as well (so multiplying by -1 when flipping the positive class).
    • I don't think that we should make the docstring of the function more complex. However, we should improve the user guide. I'm almost thinking that we should have a small public utility function that could post-process the output of decision_function or predict_proba. But right now we could explain this in the documentation.
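A minimal sketch of the two routes (illustrative estimator and data, not from the PR):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import PrecisionRecallCurveDisplay, precision_recall_curve

X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# Route 1: the Display class resolves pos_label and the score transform itself.
disp = PrecisionRecallCurveDisplay.from_estimator(clf, X, y, pos_label=0)

# Route 2: with precision_recall_curve, flipping the positive class means
# flipping the sign of decision_function scores and passing pos_label=0.
scores = clf.decision_function(X)
precision, recall, thresholds = precision_recall_curve(y, -scores, pos_label=0)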

@glemaitre
Member

Somehow, I would expect the following function:

# NB: type_of_target is public; _process_predict_proba and
# _process_decision_function are private helpers in sklearn.utils._response.
from sklearn.utils._response import _process_decision_function, _process_predict_proba
from sklearn.utils.multiclass import type_of_target


def post_process_classification_scores(y_scores, response_method, classes, pos_label=None):
    """Post-process predictions from classifiers.

    This function converts classifier predictions so that they take the
    positive class into account and are compatible with the metrics.

    .. versionadded:: 1.7

    Parameters
    ----------
    y_scores : ndarray of shape (n_samples, n_classes) or (n_samples,)
        The predictions to post-process. It could be the output of
        :term:`predict_proba`, :term:`predict_log_proba`, or
        :term:`decision_function`.

    response_method : {"predict_proba", "predict_log_proba", "decision_function"}
        The response method to use to post-process the predictions.

    classes : ndarray of shape (n_classes,)
        The classes of the classifier.

    pos_label : int, float, bool or str, default=None
        The positive label.

    Returns
    -------
    y_scores : ndarray of shape (n_samples,), (n_samples, n_classes) or \
            (n_samples, n_output)
        The predictions, compressed into the format expected by the metrics.
    """
    target_type = type_of_target(classes)

    if target_type in ("binary", "multiclass"):
        if pos_label is not None and pos_label not in classes.tolist():
            raise ValueError(
                f"pos_label={pos_label} is not a valid label: It should be "
                f"one of {classes}"
            )
        elif pos_label is None and target_type == "binary":
            pos_label = classes[-1]

    if response_method in ("predict_proba", "predict_log_proba"):
        y_scores = _process_predict_proba(
            y_pred=y_scores,
            target_type=target_type,
            classes=classes,
            pos_label=pos_label,
        )
    elif response_method == "decision_function":
        y_scores = _process_decision_function(
            y_pred=y_scores,
            target_type=target_type,
            classes=classes,
            pos_label=pos_label,
        )

    return y_scores
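A hypothetical usage sketch (the helper above is only a proposal, not an existing scikit-learn API):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# Flip the positive class to 0: the helper would negate decision_function
# scores so that they align with pos_label=0.
adjusted = post_process_classification_scores(
    clf.decision_function(X), "decision_function", clf.classes_, pos_label=0
)
precision, recall, thresholds = precision_recall_curve(y, adjusted, pos_label=0)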

But I don't know if such a utility makes sense to live in the scikit-learn codebase. It could live in the metrics module, but it really feels like we did something wrong.

@ThorbenMaa ThorbenMaa reopened this Dec 9, 2024
@ThorbenMaa
Author

Thanks for your comment and your thoughts! I reset everything I did before to start over clean.

> PrecisionRecallCurveDisplay has the advantage of calling _get_response_values, which applies the expected transform and passes the pos_label. So we should encourage using this class

Where would you address this?

> we need to amend the y_score docstring: we need to emphasize the "positive class" part for the probabilities, and mention that when passing decision_function values we expect the sign of the decision function to be aligned with the positive class as well (so multiplying by -1 when flipping the positive class).

I updated the y_score field docs accordingly. I hope I understood this correctly? I'm very open to suggestions :-)

Member

@lucyleeow lucyleeow left a comment


> Where would you address this?

I think in an example, that can be separate from this PR. This is already an improvement, we can get this in and think about adding an example in a separate PR.

I would note that we can only encourage use of the display classes when the user wants to plot. If they are just calculating metrics, they would need to provide the right score.

> But I don't know if such a utility makes sense to live in the scikit-learn codebase. It could live in the metrics module, but it really feels like we did something wrong.

@glemaitre I have similar feelings on including this. For the multiclass case, would it be OvR that we are supporting in the function?

Comment on lines 148 to 149
For y_scores from "decision_function", positive scores are required to
correspond to the positive class.
Member


Let's link to decision_function in the glossary. Nitpick: what do you two think of something like:

Suggested change
For y_scores from "decision_function", positive scores are required to
correspond to the positive class.
For :term:`decision_function` scores, values strictly greater than zero should
indicate the positive class.

Author

@ThorbenMaa ThorbenMaa Dec 10, 2024


Sounds almost perfect to me :-). A decision function value of zero would correspond to a probability of 0.5 in the logistic regression setting. To my understanding, 0.5 would be assigned to the positive class. Please correct me if I'm wrong. So I changed it to:
For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class.

Author

@ThorbenMaa ThorbenMaa Dec 10, 2024


Really not important, just FYI: there is an ongoing discussion about decision hyperplanes of SVMs (https://www.researchgate.net/post/In_SVM_what_happens_when_test_data_has_a_point_on_the_hyperplane_How_does_it_categorise_it). Apparently it is not entirely clear what to do with zero in general.

If I understand the metrics correctly, I think data points lying exactly on the decision hyperplane would also be assigned to the positive class. So >= 0 would correspond to the positive class.
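One way to probe this empirically (a sketch, not an authoritative statement of scikit-learn's tie-breaking; with continuous scores, exact zeros essentially never occur, so >= 0 and > 0 are indistinguishable in practice):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)
scores = clf.decision_function(X)

# Does predict() agree with "score >= 0 => positive class" on this data?
print(np.array_equal(clf.predict(X) == clf.classes_[1], scores >= 0))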
