Genome Res. 2008 Jan;18(1):188-96.
doi: 10.1101/gr.6743907. Epub 2007 Nov 19.

MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes

Brandi L Cantarel et al. Genome Res. 2008 Jan.

Abstract

We have developed a portable and easily configurable genome annotation pipeline called MAKER. Its purpose is to allow investigators to independently annotate eukaryotic genomes and create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab initio gene predictions, and automatically synthesizes these data into gene annotations having evidence-based quality indices. MAKER is also easily trainable: Outputs of preliminary runs are used to automatically retrain its gene-prediction algorithm, producing higher-quality gene-models on subsequent runs. MAKER's inputs are minimal, and its outputs can be used to create a GMOD database. Its outputs can also be viewed in the Apollo Genome browser; this feature of MAKER provides an easy means to annotate, view, and edit individual contigs and BACs without the overhead of a database. As proof of principle, we have used MAKER to annotate the genome of the planarian Schmidtea mediterranea and to create a new genome database, SmedGD. We have also compared MAKER's performance to other published annotation pipelines. Our results demonstrate that MAKER provides a simple and effective means to convert a genome sequence into a community-accessible genome database. MAKER should prove especially useful for emerging model organism genome projects for which extensive bioinformatics resources may not be readily available.
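The idea of an evidence-based quality index can be illustrated with a toy calculation. The sketch below is not MAKER's actual metric, which the abstract does not specify; it assumes a hypothetical index defined as the fraction of a gene model's exon bases that overlap at least one EST or protein alignment. The function names and the half-open coordinate convention are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical convention: half-open [start, end) coordinates on one contig.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def supported_fraction(exons: List[Interval], evidence: List[Interval]) -> float:
    """Toy quality index: fraction of exon bases covered by any evidence.

    This is an illustrative assumption, not the index MAKER computes.
    """
    exons = merge_intervals(exons)
    evidence = merge_intervals(evidence)
    total = sum(end - start for start, end in exons)
    if total == 0:
        return 0.0
    covered = 0
    for ex_start, ex_end in exons:
        for ev_start, ev_end in evidence:
            # Both lists are disjoint after merging, so overlaps never double count.
            overlap = min(ex_end, ev_end) - max(ex_start, ev_start)
            if overlap > 0:
                covered += overlap
    return covered / total


if __name__ == "__main__":
    gene_exons = [(100, 200), (300, 400)]
    alignments = [(150, 200), (300, 350), (340, 400)]
    print(supported_fraction(gene_exons, alignments))
```

In this example the second exon is fully supported and the first is half supported, so three quarters of the exon bases have evidence. A real pipeline would compute such a score per annotation and could weight EST and protein evidence differently.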

Figures

Figure 1. SmedGD, the GMOD-based S. mediterranea genome database constructed directly from MAKER’s outputs (http://smedgd.neuro.utah.edu).

Figure 2. MAKER Overview. MAKER uses four external executables: RepeatMasker, BLAST, SNAP, and Exonerate. Actions corresponding to the five basic steps of automatic annotation are shown in red.

Figure 3. Apollo view of a MAKER gene annotation and its associated evidence. Evidence gathered by MAKER’s compute pipeline (upper panel) is synthesized into the resulting MAKER annotation (lower panel).
