*Last updated: Jan 7, 2019*
FunGAP is freely available for academic use. For commercial use or licensing of FunGAP, please contact In-Geol Choi (email: igchoi (at) korea.ac.kr). Please cite the following reference.
Reference: Byoungnam Min, Igor V Grigoriev, In-Geol Choi, FunGAP: Fungal Genome Annotation Pipeline using evidence-based gene model evaluation (2017), Bioinformatics, Volume 33, Issue 18, Pages 2936–2937, https://doi.org/10.1093/bioinformatics/btx353
Because FunGAP depends on many external programs, you may encounter issues during installation. Please don't hesitate to post on *Issues* or contact me ([email protected]) for help.
These steps were tested and confirmed on a freshly installed Ubuntu 18.04 LTS.
- Hisat2 v2.1.0
- Trinity v2.6.6
- RepeatModeler v1.0.11
- Maker v2.31.10
- GeneMark-ES/ET v4.38
- Augustus v3.3
- Braker v1.9
- BUSCO v3.0.2
- Pfam_scan v1.6
- BLAST v2.6.0+
- Samtools v1.9
- Bamtools v2.4.1
For a robust installation, we recommend using an Anaconda environment and installing as many of the dependent programs and libraries as possible inside it.
Download and install Anaconda2 (we assume that you install it in $HOME/anaconda2).
cd $HOME
wget https://repo.continuum.io/archive/Anaconda2-2018.12-Linux-x86_64.sh
bash Anaconda2-2018.12-Linux-x86_64.sh
echo ". ~/anaconda2/etc/profile.d/conda.sh" >> ~/.bashrc
source ~/.bashrc
conda update conda
conda create -n fungap
conda activate fungap
This step is important; otherwise, Maker will stop.
conda config --remove channels bioconda
conda config --remove channels conda-forge
conda config --add channels bioconda/label/cf201901
conda config --add channels conda-forge/label/cf201901
conda install -c bioconda augustus rmblast maker trinity hisat2 braker busco blast pfam_scan
pip install biopython bcbio-gff networkx markdown2 matplotlib
cpanm Hash::Merge Logger::Simple Parallel::ForkManager YAML
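Before moving on, it can help to confirm that the conda step actually put the main executables on your PATH. The sketch below is not part of FunGAP: missing_tools is a hypothetical helper, and the executable names passed to it are the ones I would expect these packages to provide, so adjust the list if your installation differs.

```bash
# Hypothetical helper (not part of FunGAP): print every command name given as
# an argument that cannot be found on PATH. Returns 1 if anything is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# The executable names below are assumptions; edit the list as needed.
if missing_tools hisat2 augustus blastp hmmpress maker Trinity; then
    echo "All listed tools were found."
else
    echo "The tools listed above are missing from the fungap environment."
fi
```

Run it with the fungap environment activated, since that is where conda installs the programs.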
Download FunGAP by cloning it from GitHub. Here we install FunGAP in your $HOME directory, but you are free to change the location. $FUNGAP_DIR is going to be your FunGAP installation directory.
cd $HOME
git clone https://github.com/CompSynBioLab-KoreaUniv/FunGAP.git
cd FunGAP/
export FUNGAP_DIR=$(pwd)
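Note that export only affects the current shell session, so $FUNGAP_DIR will be empty again in a new terminal. One way to keep it is to append the export line to a startup file such as ~/.bashrc, which the Anaconda step already edits. The sketch below uses a hypothetical helper, persist_export, and the example path assumes FunGAP was cloned into $HOME/FunGAP.

```bash
# Hypothetical helper (not part of FunGAP): append `export NAME="VALUE"` to a
# shell startup file unless that exact line is already present.
persist_export() {
    name=$1
    value=$2
    rcfile=$3
    line="export $name=\"$value\""
    touch "$rcfile"
    # -x: whole-line match, -F: fixed string, so paths are not read as regexes.
    grep -qxF -- "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
}

# Example (commented out so that pasting this block does not edit your files):
# persist_export FUNGAP_DIR "$HOME/FunGAP" "$HOME/.bashrc"
```

Because the helper checks for the line first, running it twice does not add a duplicate entry.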
Download the Pfam and BUSCO databases into your $FUNGAP_DIR/db directory.
ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release
cd $FUNGAP_DIR # Change directory to FunGAP installation directory
mkdir -p db/pfam
cd db/pfam
wget ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.dat.gz
gunzip Pfam-A.hmm.gz
gunzip Pfam-A.hmm.dat.gz
hmmpress Pfam-A.hmm # HMMER package (would be automatically installed in the above Anaconda step)
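hmmpress writes four binary index files next to Pfam-A.hmm (with the suffixes .h3f, .h3i, .h3m and .h3p), so an interrupted download or a skipped hmmpress run can be spotted by looking for them. A minimal sketch, using a hypothetical helper named check_pfam_db:

```bash
# Hypothetical helper (not part of FunGAP): report Pfam files that are missing
# or empty in the given directory. Returns 1 if any file is reported.
check_pfam_db() {
    dir=$1
    status=0
    for f in Pfam-A.hmm Pfam-A.hmm.dat \
             Pfam-A.hmm.h3f Pfam-A.hmm.h3i Pfam-A.hmm.h3m Pfam-A.hmm.h3p; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            status=1
        fi
    done
    return "$status"
}

# Example:
# check_pfam_db "$FUNGAP_DIR/db/pfam" && echo "Pfam database looks complete."
```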
BUSCO provides various databases, so download only the one(s) that fit your target genome. Here are example commands.
cd $FUNGAP_DIR
mkdir -p db/busco
cd db/busco
wget https://busco.ezlab.org/datasets/fungi_odb9.tar.gz
wget https://busco.ezlab.org/datasets/ascomycota_odb9.tar.gz
wget https://busco.ezlab.org/datasets/basidiomycota_odb9.tar.gz
tar -zxvf fungi_odb9.tar.gz
tar -zxvf ascomycota_odb9.tar.gz
tar -zxvf basidiomycota_odb9.tar.gz
Go to this site and download GeneMark-ES/ET. http://topaz.gatech.edu/GeneMark/license_download.cgi Don't forget to download the key, too.
mkdir $FUNGAP_DIR/external/
mv gm_et_linux_64.tar.gz gm_key_64.gz $FUNGAP_DIR/external/ # Move your downloaded files to this directory
cd $FUNGAP_DIR/external/
tar -zxvf gm_et_linux_64.tar.gz
gunzip gm_key_64.gz
cp gm_key_64 ~/.gm_key
(If required) You may need to install certain Perl modules. Because GeneMark is hard-coded to use /usr/bin/perl instead of the conda-installed perl, you should install the modules for /usr/bin/perl (i.e., not in the conda environment). Alternatively, you can change the first line of the GeneMark Perl scripts from #!/usr/bin/perl to #!/usr/bin/env perl.
conda deactivate
sudo apt-get update
sudo apt-get install build-essential
sudo cpan App::cpanminus # Install cpanm if you do not have it
sudo cpanm Hash::Merge Logger::Simple Parallel::ForkManager YAML
conda activate fungap
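If you prefer editing the shebang lines over installing modules for the system Perl, the change can be scripted. This is a minimal sketch, assuming GNU sed (as on Ubuntu) and that the scripts start with exactly #!/usr/bin/perl; use_env_perl is a hypothetical helper name, and first lines that carry extra options such as -w are deliberately left untouched.

```bash
# Hypothetical helper (not part of GeneMark or FunGAP): for every *.pl file
# under the given directory, rewrite a first line of exactly "#!/usr/bin/perl"
# to "#!/usr/bin/env perl". Files are edited in place (GNU sed -i).
use_env_perl() {
    dir=$1
    find "$dir" -type f -name '*.pl' -exec \
        sed -i '1s|^#!/usr/bin/perl$|#!/usr/bin/env perl|' {} +
}

# Example (commented out because it modifies files in place):
# use_env_perl "$FUNGAP_DIR/external/gm_et_linux_64/gmes_petap"
```

With this change the scripts pick up whichever perl comes first on PATH, which is the conda-installed one while the fungap environment is active.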
cd $FUNGAP_DIR/external/gm_et_linux_64/gmes_petap
./gmes_petap.pl
Note: RepeatModeler is available in Anaconda2 (https://anaconda.org/bioconda/repeatmodeler), but the conda-installed program did not work at the time of writing. Installation seemed okay, but no result was produced. I will update this whenever a working RepeatModeler package is available.
perl -v
The Perl version should be greater than 5.8.8.
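Reading the version off perl -v by eye works, but the comparison can also be scripted. A minimal sketch, assuming GNU sort with its -V (version sort) option; version_gt is a hypothetical helper, not something shipped with FunGAP or RepeatModeler.

```bash
# Hypothetical helper: succeed if version $1 is strictly greater than $2.
# Relies on GNU sort -V to order dotted version strings.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Compare the installed Perl (if any) against the 5.8.8 threshold.
if command -v perl >/dev/null 2>&1; then
    perl_version=$(perl -e 'printf "%vd", $^V')
    if version_gt "$perl_version" 5.8.8; then
        echo "Perl $perl_version is recent enough."
    else
        echo "Perl $perl_version is too old."
    fi
fi
```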
cd $FUNGAP_DIR/external/
wget http://www.repeatmasker.org/RepeatModeler/RECON-1.08.tar.gz
tar -zxvf RECON-1.08.tar.gz
cd RECON-1.08/src/
make
make install
cd $FUNGAP_DIR/external/
wget http://www.repeatmasker.org/RepeatScout-1.0.5.tar.gz
tar -zxvf RepeatScout-1.0.5.tar.gz
cd RepeatScout-1
make
cd $FUNGAP_DIR/external/
mkdir nseg
cd nseg
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/genwin.c
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/genwin.h
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/lnfac.h
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/makefile
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/nmerge.c
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/nseg.c
wget ftp://ftp.ncbi.nih.gov/pub/seg/nseg/runnseg
make
I could not use the conda-installed RepeatMasker for the RepeatModeler installation, so I installed it manually.
cd $FUNGAP_DIR/external/
wget http://www.repeatmasker.org/RepeatMasker-open-4-0-8.tar.gz
tar -zxvf RepeatMasker-open-4-0-8.tar.gz
cd RepeatMasker
perl ./configure
- Note: trf and rmblastn are located at ~/anaconda2/envs/fungap/bin.
cd $FUNGAP_DIR/external/
wget http://www.repeatmasker.org/RepeatModeler/RepeatModeler-open-1.0.11.tar.gz
tar -zxvf RepeatModeler-open-1.0.11.tar.gz
cd RepeatModeler-open-1.0.11/
perl ./configure
- Note: trf and rmblastn are located at ~/anaconda2/envs/fungap/bin.
cd $FUNGAP_DIR/external/RepeatModeler-open-1.0.11/
./BuildDatabase --help
./RepeatModeler --help
The set_dependencies.py script lets you set and test (by running each program with --help) all the dependencies. If this script runs without any issue, you are ready to run FunGAP!
cd $FUNGAP_DIR
python set_dependencies.py \
--pfam_db_dir db/pfam \
--busco_db_dir db/busco/basidiomycota_odb9/ \
--genemark_dir external/gm_et_linux_64/gmes_petap/ \
--repeat_modeler_dir external/RepeatModeler-open-1.0.11