bcftools merge --info-rules doesn't seem to work properly #1394
Comments
Ah, okay, got it, it's probably getting 4 by actually counting the alleles as observed in
Test data
Those are two summary VCFs without genotypes (permitted by the spec).
tr '|' '\t' << EOF | bgzip > C.vcf.gz
##fileformat=VCFv4.3
##INFO=<ID=AN,Number=1,Type=Integer,Description="Some metric">
#CHROM|POS|ID|REF|ALT|QUAL|FILTER|INFO
chr1|123|.|A|C|.|.|AN=30
EOF
tr '|' '\t' << EOF | bgzip > D.vcf.gz
##fileformat=VCFv4.3
##INFO=<ID=AN,Number=1,Type=Integer,Description="Some metric">
#CHROM|POS|ID|REF|ALT|QUAL|FILTER|INFO
chr1|123|.|A|C|.|.|AN=12
EOF
parallel tabix ::: *.vcf.gz
Merge
./bcftools merge --info-rules 'AN:sum' C.vcf.gz D.vcf.gz
Output
Notice how this drops the AN field entirely. Is this intended behaviour?
Yes, this is deliberate, as AC and AN are computed from the sample genotypes. Is there a strong motivation for keeping them?
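To illustrate the point about genotypes, here is a rough sketch of how AN can be derived by counting called alleles in the GT fields, which would be consistent with a result of 4 for two diploid samples. This is not bcftools code: the function name count_an and the sample record are hypothetical, and the sketch assumes GT is the first FORMAT key.

```shell
# Hypothetical helper: count called alleles (AN) per record from GT fields.
# Reads uncompressed VCF text on stdin; assumes GT is the first FORMAT key.
count_an() {
    awk 'BEGIN { FS = "\t" }
    /^#/ { next }                               # skip header lines
    {
        an = 0
        for (c = 10; c <= NF; c++) {            # sample columns start at 10
            split($c, fields, ":")              # GT is the first subfield
            n = split(fields[1], alleles, "[/|]")
            for (i = 1; i <= n; i++)
                if (alleles[i] != "." && alleles[i] != "") an++
        }
        print $1 "\t" $2 "\t" an
    }'
}

# Made-up record with two diploid samples (0/1 and 0|0)
printf 'chr1\t123\t.\tA\tC\t.\t.\t.\tGT\t0/1\t0|0\n' | count_an
# prints (tab-separated): chr1 123 4
```

Missing alleles (".") are not counted, so a sample with ./. contributes nothing to the total.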
Sorry, pressed the wrong button. I will mark this as a feature request: in case merge rules are given explicitly for AN, the program could use that instead of its default behavior
Hi Petr! Sorry I didn't get back to you back in February. To give some context (better late than never), this report arose from the CINECA project. The workflow we were using went like this:
In CINECA we managed to get around this by temporarily renaming the fields, so it's definitely not urgent or anything. Treating this as a low priority feature request seems to be the most sensible option to me.
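The renaming workaround might look something like the sketch below. The exact commands used in CINECA are not given in this thread, so this is only an assumption of the general idea: the helper name rename_info_tag and the temporary tag name AN_SUMMED are hypothetical, and the helper works on uncompressed VCF text rather than calling bcftools.

```shell
# Hypothetical helper: rename an INFO tag in the header and in the INFO column.
# Usage: rename_info_tag OLD NEW < in.vcf > out.vcf   (uncompressed VCF text)
rename_info_tag() {
    awk -v src="$1" -v dst="$2" 'BEGIN { FS = OFS = "\t" }
    /^##INFO=<ID=/ {
        prefix = "##INFO=<ID=" src ","
        if (index($0, prefix) == 1)             # header line defining the tag
            $0 = "##INFO=<ID=" dst "," substr($0, length(prefix) + 1)
        print; next
    }
    /^#/ { print; next }                        # other header lines unchanged
    {
        n = split($8, kv, ";")                  # INFO is column 8
        for (i = 1; i <= n; i++) {
            if (kv[i] == src) kv[i] = dst       # flag-style tag without a value
            else if (index(kv[i], src "=") == 1)
                kv[i] = dst substr(kv[i], length(src) + 1)
        }
        info = kv[1]
        for (i = 2; i <= n; i++) info = info ";" kv[i]
        $8 = info
        print
    }'
}

# Example: rename AN to the made-up tag AN_SUMMED in one record
printf 'chr1\t123\t.\tA\tC\t.\t.\tAN=30\n' | rename_info_tag AN AN_SUMMED
```

After renaming, the files would need to be compressed and indexed again, merged with a rule on the temporary tag (for example 'AN_SUMMED:sum'), and the tag renamed back afterwards.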
I am running into the same issue when merging gnomAD exomes and genomes VCFs, which do not contain sample information. Our use case is to create a single VCF containing the summed AC and AN. I came up with the same merge command as the author of this issue, which didn't give the expected result. @pd3 an alternative to the feature request you mention could be to change the default behavior in case of no available samples such that
This is now supported. When no genotypes are present, AC,AN will be summed. When they are present, the values will be recalculated unless
or when bcftools merge -i AC:sum,AN:sum is given explicitly
Resolves #1394
Hi, maybe I'm doing something wrong, but I can't seem to make the bcftools merge --info-rules flag behave as I want it to. Here's what I'm doing:
Build bcftools v1.11
Prepare test data
Try to combine (summarise) a particular INFO field
./bcftools merge --info-rules 'AN:sum' A.vcf.gz B.vcf.gz
Output
I expect the AN to be 42 (30 + 12), but it's for some reason... 4. Am I missing something?
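For reference, the arithmetic expected from a 'sum' rule can be checked outside bcftools with a small sketch like the one below. It is only an illustration of the expected result, not of how bcftools merges records: the helper name sum_info_tag is hypothetical, the two records are reduced to the AN values mentioned above (30 and 12), and sites are matched naively on CHROM, POS, REF and ALT.

```shell
# Hypothetical helper: sum an INFO tag across records that share a site.
# Usage: sum_info_tag TAG < records.vcf   (uncompressed VCF text)
sum_info_tag() {
    awk -v tag="$1" 'BEGIN { FS = OFS = "\t" }
    /^#/ { next }                               # skip header lines
    {
        key = $1 OFS $2 OFS $4 OFS $5           # CHROM, POS, REF, ALT
        n = split($8, kv, ";")
        for (i = 1; i <= n; i++)
            if (index(kv[i], tag "=") == 1) {
                if (!(key in total)) order[++sites] = key
                total[key] += substr(kv[i], length(tag) + 2)
            }
    }
    END { for (s = 1; s <= sites; s++) print order[s], total[order[s]] }'
}

# The two AN values from this report, at the same site
printf 'chr1\t123\t.\tA\tC\t.\t.\tAN=30\nchr1\t123\t.\tA\tC\t.\t.\tAN=12\n' | sum_info_tag AN
# prints (tab-separated): chr1 123 A C 42
```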