PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format(VCF) files

PopLDdecay

The PopLDdecay article has been published in the journal Bioinformatics; please cite it if possible.

PMID: 30321304    DOI: 10.1093/bioinformatics/bty875

1) Install


New versions are released and maintained at hewm2008/PopLDdecay; please follow the link below to download the latest version.

hewm2008/PopLDdecay

Download


Method 1: For Linux/Unix and macOS

        git clone https://github.com/hewm2008/PopLDdecay.git
        cd PopLDdecay; chmod 755 configure; ./configure;
        make;
        mv PopLDdecay  bin/;    #     [rm *.o]

Note: If linking fails, try reinstalling the zlib library.

Method 2: For Linux/Unix and macOS

        tar -zxvf  PopLDdecayXXX.tar.gz
        cd PopLDdecayXXX;
        cd src;
        make ; make clean                            # or [sh make.sh]
        ../bin/PopLDdecay

Note: If linking fails, try reinstalling the zlib library.
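
Before reinstalling zlib, it can help to confirm whether the compiler can actually find it. The sketch below is not part of PopLDdecay: the function name `have_zlib` is hypothetical, and it assumes a C compiler is reachable as `cc` (or through the `CC` environment variable). It compiles and links a one-line program against zlib and reports success or failure.

```shell
# Hypothetical helper, not shipped with PopLDdecay.
# Returns 0 if a tiny program that includes <zlib.h> compiles and links with -lz.
have_zlib() {
    zl_cc=${CC:-cc}
    # No compiler found: report failure rather than guessing.
    command -v "$zl_cc" >/dev/null 2>&1 || return 1
    zl_out=$(mktemp) || return 1
    printf '#include <zlib.h>\nint main(void) { return zlibVersion() == 0; }\n' |
        "$zl_cc" -x c - -lz -o "$zl_out" >/dev/null 2>&1
    zl_status=$?
    rm -f "$zl_out"
    [ "$zl_status" -eq 0 ]
}

# Example use:
#   if have_zlib; then echo "zlib found"; else echo "zlib headers or library missing"; fi
```

If the check fails, the zlib development package is usually what is missing (for example `zlib1g-dev` on Debian/Ubuntu or `zlib-devel` on Fedora).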

2) Example


See the Documentation for more detailed usage.

    1. Calculate LD decay
      # 1)  For a GATK VCF file, run PopLDdecay directly
            ./bin/PopLDdecay    -InVCF  SNP.vcf.gz  -OutStat LDdecay
      # 2)  For PLINK [.ped .map] files, first convert PLINK to genotype format, then run PopLDdecay
            perl bin/mis/plink2genotype.pl    -inPED in.ped -inMAP in.map  -outGenotype out.genotype
            ./bin/PopLDdecay        -InGenotype out.genotype -OutStat LDdecay
      # 3)  To calculate LD decay for subgroup GroupA in a VCF file   # put the GroupA sample names into GroupA_sample.list
            ./bin/PopLDdecay   -InVCF  <in.vcf.gz>  -OutStat  <out.stat>  -SubPop    GroupA_sample.list
    2. Draw the figure
        #    2.1  For one population
        perl  bin/Plot_OnePop.pl  -inFile   LDdecay.stat.gz  -output  Fig
        #    2.2  For one population, multiple chromosomes          # List format: [chrResultPathWay]
        perl  bin/Plot_OnePop.pl  -inList   Chr.ResultPath.List  -output Fig
        #    2.3  For multiple populations                   # List format: [Pop.ResultPath  PopID ]
        perl  bin/Plot_MutiPop.pl  -inList  Pop.ResultPath.list  -output Fig
    3. See the results [LDdecay.stat.gz] and [Fig.png Fig.pdf]
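
The subgroup example above needs a sample list file. One way to start one is to print the sample names stored in the VCF header and then keep only the wanted ones. The sketch below is an illustration, not part of PopLDdecay: the function name `list_vcf_samples` is hypothetical, and it assumes the list file holds one sample name per line, which this README does not state explicitly.

```shell
# Hypothetical helper, not shipped with PopLDdecay.
# Print the sample names of a VCF (plain or gzip-compressed), one per line.
# In a VCF, the '#CHROM' header line carries the sample names from column 10 on.
list_vcf_samples() {
    gzip -cdf "$1" |
        awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Example use (file names taken from the examples above; the "GroupA_" prefix
# is only a placeholder for however your samples are named):
#   list_vcf_samples SNP.vcf.gz                      # show every sample
#   list_vcf_samples SNP.vcf.gz | grep '^GroupA_' > GroupA_sample.list
```

Names written by hand into the list should match the header spelling exactly, so copying them from this output avoids typos.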

3) Introduction


Linkage disequilibrium (LD) decay[1] is among the most important and most common analyses in population resequencing[2]. Especially in self-pollinated crops, LD decay can reveal much about domestication and breeding history[3], as well as gene flow and regions under selection[1]. However, measuring LD decay with existing software and tools takes too much time and too many resources. LD decay studies also write extraordinarily large amounts of data to temporary storage when run with the mainstream software "Haploview"[4], the classical LD processing tool. Obtaining LD decay results efficiently therefore remains a difficult task for individual researchers. Here, we introduce PopLDdecay, a simple and efficient tool for LD decay analysis, which processes Variant Call Format (VCF)[5] files to produce LD decay statistics and plot LD decay graphs. PopLDdecay is designed to use compressed data files as input and output to save storage space, and it is faster and more computationally efficient than existing software. This software makes the LD decay pipeline significantly

  • Parameter description
	Usage: PopLDdecay -InVCF  <in.vcf.gz>  -OutStat <out.stat>

		-InVCF       <str>    Input SNP VCF Format
		-InGenotype  <str>    Input SNP Genotype Format
		-OutStat     <str>    OutPut Stat Dist ~ r^2 File

		-SubPop      <str>    SubGroup SampleList of VCFFile [ALLsample]
		-MaxDist     <int>    Max Distance (kb) between two SNP [300]
		-MAF         <float>  Min minor allele frequency filter [0.005]
		-Het         <float>  Max ratio of het allele filter [0.88]
		-Miss        <float>  Max ratio of miss allele filter [0.25]
		-EHH         <str>    To Run EHH Region decay set StartSite [NA]
		-OutFilterSNP         OutPut the final SNP to calculate
		-OutType     <int>    1: R^2 result 2: R^2 & D' result 3:PairWise LD Out[1]
		                      See the Help for more OutType [1-8] details
		
		-help                 Show more help [hewm2008 v3.43]
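
To get a feel for the `-MAF`, `-Het` and `-Miss` thresholds before running the tool, per-site values can be previewed straight from the VCF. The sketch below is an illustration only and is not how PopLDdecay itself computes its filters: the function name `site_stats` is hypothetical, it assumes biallelic diploid genotypes with `GT` as the first FORMAT field, and it counts the heterozygous ratio among called genotypes and the missing ratio among all samples, which may differ from the tool's own definitions.

```shell
# Hypothetical helper, not shipped with PopLDdecay.
# For each site print: chrom, pos, minor allele frequency,
# heterozygous ratio (among called genotypes), missing ratio (among all samples).
site_stats() {
    gzip -cdf "$1" | awk -F'\t' '
        /^#/ { next }                       # skip header lines
        {
            total = NF - 9                  # number of sample columns
            if (total < 1) next
            ref = 0; alt = 0; het = 0; called = 0; miss = 0
            for (i = 10; i <= NF; i++) {
                split($i, f, ":")           # GT is assumed to be the first subfield
                n = split(f[1], a, "[/|]")  # unphased "/" or phased "|"
                if (n != 2 || a[1] == "." || a[2] == ".") { miss++; continue }
                called++
                if (a[1] != a[2]) het++
                for (k = 1; k <= 2; k++) { if (a[k] == "0") ref++; else alt++ }
            }
            maf  = (ref + alt > 0) ? (ref < alt ? ref : alt) / (ref + alt) : 0
            hetr = (called > 0) ? het / called : 0
            printf "%s\t%s\t%.4f\t%.4f\t%.4f\n", $1, $2, maf, hetr, miss / total
        }'
}

# Example use: count the sites that would pass thresholds equal to the listed defaults
#   site_stats SNP.vcf.gz | awk '$3 >= 0.005 && $4 <= 0.88 && $5 <= 0.25' | wc -l
```

Comparing that count with the number of sites written by `-OutFilterSNP` is one way to see whether the assumptions above match the tool's behaviour on your data.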

4) Results


Some LD decay figures that I drew for earlier papers.

5) Discussion


######################swimming in the sky and flying in the sea #############################