This repository provides a Nextflow pipeline for calling somatic point mutations from tumor/normal pairs using whole-genome sequencing (WGS) or exome data.
To run the pipeline, use the following command:
nextflow run digenoma-lab/somatic_point_mutations -r v1.1 --tn test.csv -params-file strelka-params.yml -profile kutral
Prepare a CSV file indicating the paths to CRAM or BAM files, including index and optional manta_indel VCF files. The CSV file should follow this format:
The somatic_point_mutations pipeline has several required and optional arguments.
--tn: CSV file with tumor/normal pairs.
--fasta: Reference genome file in FASTA format.
--fai: Reference genome index file in FAI format.
--outdir: Directory for Nextflow results. Default: ./results
--exome: Set if the data is exome rather than WGS. Default: false
--target_bed: Target regions for Strelka in BED format for hg38. Default: /somatic_point_mutations/auxfiles/hg38.bed.gz
--target_bed_index: Index for the target BED regions. Default: /somatic_point_mutations/auxfiles/hg38.bed.gz.tbi
--annovar_bin: Path to the Annovar executable. Default: /annovar/annovar/
--annovar_bd: Path to the Annovar database for hg38. Default: /databases/annovar/hg38
--annovar_protocol: Databases included in the Annovar analysis. Default: ensGene,clinvar_20220320,revel,dbnsfp42c,gnomad30_genome,avsnp150,icgc28
--annovar_operation: Operation for each selected Annovar database, in the same order. Default: g,f,f,f,f,f,f
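The values given to --annovar_protocol and --annovar_operation are paired by position, so the two lists need the same number of entries (Annovar's table_annovar.pl rejects mismatched lists). A minimal pre-flight sketch, assuming the values are forwarded to Annovar unchanged; `count_fields` is a hypothetical helper written for this example and is not part of the pipeline:

```shell
#!/bin/sh
# Values you plan to pass to --annovar_protocol and --annovar_operation
# (the documented defaults are used here).
PROTOCOL="ensGene,clinvar_20220320,revel,dbnsfp42c,gnomad30_genome,avsnp150,icgc28"
OPERATION="g,f,f,f,f,f,f"

# count_fields (hypothetical helper): print the number of
# comma-separated entries in the first argument.
count_fields() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

n_protocol=$(count_fields "$PROTOCOL")
n_operation=$(count_fields "$OPERATION")

# One operation is expected per database, so the counts must match.
if [ "$n_protocol" -eq "$n_operation" ]; then
    echo "OK: $n_protocol databases, $n_operation operations"
else
    echo "Mismatch: $n_protocol databases but $n_operation operations" >&2
    exit 1
fi
```

Running this before launching the pipeline catches a forgotten operation code early, which is cheaper than a failed annotation step late in the run.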
Example run with every argument set explicitly:
nextflow run digenoma-lab/somatic_point_mutations -r v1.1 \
--tn test.csv \
--fasta /path/to/reference.fasta \
--fai /path/to/reference.fasta.fai \
--outdir ./results \
--exome true \
--target_bed /path/to/target.bed.gz \
--target_bed_index /path/to/target.bed.gz.tbi \
--annovar_bin /path/to/annovar/ \
--annovar_bd /path/to/annovar/hg38 \
--annovar_protocol ensGene,clinvar_20220320,revel,dbnsfp42c,gnomad30_genome,avsnp150,icgc28 \
--annovar_operation g,f,f,f,f,f,f \
-profile kutral
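The same settings can be kept in a YAML file and passed with Nextflow's -params-file option instead of a long command line. The sketch below writes a hypothetical file named my-params.yml. The keys mirror the argument names in the command above, the paths are the same placeholders, and the contents of the repository's own strelka-params.yml are not shown here, so treat this as an illustration rather than a copy of that file:

```shell
#!/bin/sh
# Write a hypothetical params file; each key is an argument name
# without the leading "--". Paths are placeholders to replace.
cat > my-params.yml <<'EOF'
fasta: "/path/to/reference.fasta"
fai: "/path/to/reference.fasta.fai"
outdir: "./results"
exome: true
target_bed: "/path/to/target.bed.gz"
target_bed_index: "/path/to/target.bed.gz.tbi"
annovar_bin: "/path/to/annovar/"
annovar_bd: "/path/to/annovar/hg38"
annovar_protocol: "ensGene,clinvar_20220320,revel,dbnsfp42c,gnomad30_genome,avsnp150,icgc28"
annovar_operation: "g,f,f,f,f,f,f"
EOF

# Launch with the file (shown as a comment, not executed here):
# nextflow run digenoma-lab/somatic_point_mutations -r v1.1 \
#     --tn test.csv -params-file my-params.yml -profile kutral
```

Keeping the settings in a file makes a run easier to repeat and to put under version control. Values given on the command line still take precedence over those in the params file.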
If you encounter any issues or have questions, please open an issue on the GitHub repository.