
Measure$score has weird NaN producing code #1110

Closed

berndbischl opened this issue Aug 23, 2024 · 6 comments

berndbischl (Member) commented Aug 23, 2024
  if (is.null(prediction) && length(measure$predict_sets)) {
    # no prediction available for the measure's predict sets: silently return NaN
    return(NaN)
  }

We have this snippet in Measure$score(). It is not documented and it is weird: when no prediction is available for the measure's predict sets, the method silently returns NaN.
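
A minimal way to trigger this branch (a sketch, not from the original report; it assumes the mlr3 sugar functions and the rpart learner shipped with mlr3 are available) is to score a measure whose predict_sets are not covered by the stored predictions:

library(mlr3)

# the learner predicts on the test set only (the default),
# so no predictions for the train set are stored
rr = resample(tsk("sonar"), lrn("classif.rpart"), rsmp("holdout"))

# a measure that requires predictions on the training set
m = msr("classif.ce", predict_sets = "train")

rr$score(m)  # classif.ce is NaN: the prediction for "train" is NULL

The score is NaN because the branch above is hit before any performance computation takes place.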

zhonghua723 commented Dec 8, 2024

I have encountered the same problem many times. Please help me address it, thanks.

tab_rsmp
#>       nr task_id     learner_id resampling_id iters classif.auc classif.acc classif.tpr classif.tnr classif.ppv classif.npv
#>    <int>  <char>         <char>        <char> <int>       <num>       <num>       <num>       <num>       <num>       <num>
#> 1:     1       1 classif.glmnet       holdout     1   0.9333333   0.9615385   0.9333333   1.0000000   1.0000000   0.9166667
#> 2:     2       1 classif.glmnet   repeated_cv   100   0.9901825   0.9498611   0.9226667   0.9662500         NaN   0.9461071
#> 3:     3       1 classif.glmnet   subsampling  1000   0.9899363   0.9455000   0.9289839   0.9604998   0.9529729   0.9413482
#> 4:     4       1 classif.glmnet     bootstrap  1000   0.9868122   0.9397969   0.9250875   0.9536506   0.9440450   0.9381008
#> Hidden columns: resample_result

Code:

library(mlr3)
library(mlr3learners)  # provides classif.glmnet

# task_model is the reporter's task; it is not provided in this issue
resamplings <- list(
  holdout_70 = rsmp("holdout", ratio = 0.7),
  repeated_cv_10_10 = rsmp("repeated_cv", folds = 10, repeats = 10),
  subsampling_70 = rsmp("subsampling", ratio = 0.7, repeats = 1000),
  bootstrap_1000 = rsmp("bootstrap", repeats = 1000, ratio = 0.7)
)

design_rsmp <- benchmark_grid(
  tasks = task_model,
  learners = lrns("classif.glmnet", predict_type = "prob"),
  resamplings = resamplings
)

bmr_rsmp <- benchmark(design_rsmp, store_models = TRUE)
measures <- msrs(c("classif.auc", "classif.acc", "classif.tpr", "classif.tnr", "classif.ppv", "classif.npv"))
tab_rsmp <- bmr_rsmp$aggregate(measures)

berndbischl (Member, Author) commented:

@zhonghua723 What you posted is likely a different issue. Could you please open a separate issue, and please (!) post a fully reproducible example?

mb706 (Collaborator) commented Dec 19, 2024

(In particular, we would need the task_model dataset for this example, and please make sure you call set.seed() as well so the results are reproducible.)

mllg self-assigned this Dec 19, 2024
berndbischl (Member, Author) commented:

OK, let's add a warning, docs, and tests.

mllg (Member) commented Dec 19, 2024

Warnings were already in place in assert_measure() if Measure$check_prerequisites is set to "warn" (the default). The warning is only triggered once per resample()/benchmark() call and measure (in contrast to what we agreed on here), which is IMHO the nicer behavior.
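
For illustration (a sketch, assuming the rpart learner shipped with mlr3): a probability-based measure scored on response-only predictions triggers exactly one such warning and yields NaN.

library(mlr3)

# predict_type defaults to "response", so no probabilities are stored
rr = resample(tsk("sonar"), lrn("classif.rpart"), rsmp("cv", folds = 3))

rr$aggregate(msr("classif.auc"))  # warns once via assert_measure(), returns NaN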

mllg closed this as completed Dec 19, 2024
mllg (Member) commented Dec 19, 2024

Note that this behavior is documented in the check_prerequisites field of Measure:

    #' @field check_prerequisites (`character(1)`)\cr
    #' How to proceed if one of the following prerequisites is not met:
    #'
    #' * wrong predict type (e.g., probabilities required, but only labels available).
    #' * wrong predict set (e.g., learner predicted on training set, but predictions of test set required).
    #' * task properties not satisfied (e.g., binary classification measure on multiclass task).
    #'
    #' Possible values are `"ignore"` (just return `NaN`) and `"warn"` (default, raise a warning before returning `NaN`).
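
Setting the field to "ignore" suppresses the warning (a sketch, assuming check_prerequisites can be set directly on the measure object, as the field documentation above suggests):

library(mlr3)
rr = resample(tsk("sonar"), lrn("classif.rpart"), rsmp("holdout"))

m = msr("classif.auc")            # needs "prob", but only "response" was stored
m$check_prerequisites = "ignore"  # just return NaN, no warning
rr$aggregate(m)                   # NaN, silently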
