ROC and PR curve #179

@ESDeutekom

Dear @pkrusche and team,

I ran an analysis on variants called with DeepVariant from a Genome in a Bottle (GIAB) sample. I compared them against the GIAB benchmark VCF using hap.py, run from a Singularity image pulled from docker://pkrusche/hap.py.

I used the following command (in a Snakemake rule):

shell: "export HGREF={input.ref_genome}; /opt/hap.py/bin/hap.py {input.truth_vcf} {input.query_vcf} --false-positives {input.confidence_bed} --target-regions {input.target_bed} -r {input.ref_genome} --roc QUAL --roc-filter RefCall -o {params.prefix} -V --engine=vcfeval --engine-vcfeval-template {input.ref_sdf} --threads {threads} --logfile {log}"

However, I am confused by the results. I added the --roc option because it is the only curve-related option I could find (there does not seem to be a PR curve option). The documentation says that precision and recall are calculated, and those are also the column names I see in the output, rather than ROC metrics (header and first two rows shown below):

Type | Subtype | Subset | Filter | Genotype | QQ.Field | QQ | METRIC.Recall | METRIC.Precision | ...
INDEL | * | TS_contained | ALL | * | QUAL | 65.300003 | 0.0 | 1.0 | ...
INDEL | * | TS_contained | SEL | * | QUAL | 65.300003 | 0.0 | 1.0 | ...

How is it possible to have a Recall of 0 and a Precision of 1? Or are these metrics simply mislabelled, so that they should be TPR and FPR and the output is meant for a ROC plot, as the flag name suggests? The plot I made also looks like a ROC curve.
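For comparison, a point with recall near 0 and precision 1 can occur on a precision-recall curve at a very strict score threshold: if only a handful of calls pass and all of them are correct, precision is 1 while almost no truth variants are recovered. Below is a minimal sketch with made-up calls. The scores, labels, the `pr_at_threshold` helper, and the convention of reporting precision as 1.0 when no calls pass are all assumptions for illustration, not hap.py's implementation.

```python
def pr_at_threshold(calls, n_truth, threshold):
    """Return (recall, precision) for calls with score >= threshold.

    calls: list of (score, is_true_positive) tuples (toy data).
    n_truth: total number of truth variants.
    Precision is reported as 1.0 when no calls pass the threshold;
    this convention is assumed here, not taken from hap.py.
    """
    passed = [is_tp for score, is_tp in calls if score >= threshold]
    tp = sum(passed)            # True counts as 1
    fp = len(passed) - tp
    recall = tp / n_truth
    precision = tp / (tp + fp) if passed else 1.0
    return recall, precision


# Hypothetical calls: (QUAL-like score, whether the call matches the truth set).
calls = [(65.3, True), (50.0, True), (40.0, False), (30.0, True), (10.0, False)]
n_truth = 1000

for threshold in (65.3, 30.0, 0.0):
    recall, precision = pr_at_threshold(calls, n_truth, threshold)
    print(threshold, round(recall, 3), round(precision, 3))
```

With this toy data the strictest threshold keeps a single correct call, so precision is 1 and recall is close to 0. Whether that is what happens in the rows above is exactly what I would like to confirm.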

Additionally, if I plot METRIC.Recall against METRIC.Precision from the roc files, I get a plot with a typical ROC shape. If I instead plot the values recomputed with the formulas given in happy.md, I get a different plot, one that looks more like a PR curve:
Recall = TRUTH.TP / (TRUTH.TP + TRUTH.FN)
Precision = QUERY.TP / (QUERY.TP + QUERY.FP)
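
The recomputation with the two formulas above can be sketched as follows. This is a minimal sketch, assuming each row of a roc file has been loaded as a dict with numeric `TRUTH.TP`, `TRUTH.FN`, `QUERY.TP` and `QUERY.FP` values. The `recompute_pr` helper and the example counts are hypothetical, and returning `None` for a zero denominator is a choice made here, not hap.py behaviour.

```python
def recompute_pr(row):
    """Recompute (recall, precision) from count columns of one row.

    Recall    = TRUTH.TP / (TRUTH.TP + TRUTH.FN)
    Precision = QUERY.TP / (QUERY.TP + QUERY.FP)
    A metric with a zero denominator is returned as None (assumption).
    """
    truth_total = row["TRUTH.TP"] + row["TRUTH.FN"]
    query_total = row["QUERY.TP"] + row["QUERY.FP"]
    recall = row["TRUTH.TP"] / truth_total if truth_total else None
    precision = row["QUERY.TP"] / query_total if query_total else None
    return recall, precision


# Hypothetical counts, not taken from any real run.
rows = [
    {"TRUTH.TP": 90, "TRUTH.FN": 10, "QUERY.TP": 88, "QUERY.FP": 22},
    {"TRUTH.TP": 0, "TRUTH.FN": 100, "QUERY.TP": 0, "QUERY.FP": 0},
]

for row in rows:
    print(recompute_pr(row))
```

Plotting the recomputed pairs next to the METRIC.Recall and METRIC.Precision columns, row by row, is one way to see where the two diverge.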

image

Thank you in advance,
Eva
