MaizeGDB hosts 67 genomes.
To learn how to contribute your genome to MaizeGDB, click
here.
This search form allows you to enter basic information (e.g., cultivar,
genome accession), including partial names, to search for a genome assembly.
Genome sets
+ B73 representative reference genome. More information is available here.
+ Pan-Andropogoneae genome sequence. More information is available here.
Round 1 release: November 1, 2022
+ NAM founders, developed by the Whole-genome assembly of the NAM founders project. More information is available here.
Zm-A632-REFERENCE-CAAS_FIL-1.0
Zm-Chang-7_2-REFERENCE-CAAS_FIL-1.0
Zm-Dan340-REFERENCE-CAAS_FIL-1.0
Zm-Huangzaosi-REFERENCE-CAAS_FIL-1.0
Zm-Jing724-REFERENCE-CAAS_FIL-1.0
Zm-Jing92-REFERENCE-CAAS_FIL-1.0
Zm-Oh43-REFERENCE-CAAS_FIL-1.0
Zm-PH207-REFERENCE-CAAS_FIL-1.0
Zm-S37-REFERENCE-CAAS_FIL-1.0
Zm-Xu178-REFERENCE-CAAS_FIL-1.0
Zm-Ye478-REFERENCE-CAAS_FIL-1.0
Zm-Zheng58-REFERENCE-CAAS_FIL-1.0
+ European flint lines, developed as part of the MAZE project. More information is available here.
All genome assemblies hosted at MaizeGDB
Genome sequencing projects known to be in progress
Genome assembly nomenclature
Genome assemblies are named according to the following formula:
[species abbreviation]-[cultivar or accession]-[quality]-[provider]-[version]
Examples:
Zm-B73-REFERENCE-GRAMENE-4.0 (B73 reference genome version 4, from Gramene)
Zx-PI566673-REFERENCE-YAN-1.0 (Zea mays ssp. mexicana, PI 566673, reference genome version 1, from the Yan lab)
Genome identifiers are shorter versions used as prefixes to gene model identifiers. They take the form:
[species abbreviation][5 digits]
Examples:
Zm0004d (Zm-B73-REFERENCE-GRAMENE-4.0)
Zx0001a (Zx-PI566673-REFERENCE-YAN-1.0)
Annotation identifiers are built from genome identifiers:
[genome identifier].[version]
Examples:
Zm0004d.2 (Zm-B73-REFERENCE-GRAMENE-4.0)
Zx0001a.1 (Zx-PI566673-REFERENCE-YAN-1.0)
See the nomenclature summary and the Genome Nomenclature Guidelines for more information.
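As an illustration of the naming scheme, the following Python sketch splits an assembly name into its fields. The regular expression is inferred from the example names on this page (hyphen-separated fields, with a cultivar or accession that may itself contain a hyphen, as in Chang-7_2); it is an assumption for illustration, not an official MaizeGDB validation rule. The function names `parse_assembly_name` and `annotation_id` are hypothetical.

```python
import re

# Pattern inferred from the example names on this page (an assumption, not an
# official specification). The accession may contain hyphens, so the last
# three hyphen-separated fields are anchored to the end of the name.
ASSEMBLY_NAME = re.compile(
    r"^(?P<species>[A-Za-z]+)-"
    r"(?P<accession>.+)-"
    r"(?P<quality>[A-Z]+)-"
    r"(?P<provider>[A-Za-z0-9_]+)-"
    r"(?P<version>\d+(?:\.\d+)*)$"
)


def parse_assembly_name(name):
    """Split an assembly name into species, accession, quality, provider, version."""
    match = ASSEMBLY_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognizable assembly name: {name!r}")
    return match.groupdict()


def annotation_id(genome_id, annotation_version):
    """Build an annotation identifier: [genome identifier].[version]."""
    return f"{genome_id}.{annotation_version}"


if __name__ == "__main__":
    for example in (
        "Zm-B73-REFERENCE-GRAMENE-4.0",
        "Zx-PI566673-REFERENCE-YAN-1.0",
        "Zm-Chang-7_2-REFERENCE-CAAS_FIL-1.0",
    ):
        print(example, parse_assembly_name(example))
    print(annotation_id("Zm0004d", 2))
```

Anchoring the quality, provider, and version fields to the end of the name keeps accessions such as Chang-7_2 intact, which a plain split on every hyphen would break apart.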
Genome assemblies at MaizeGDB
MaizeGDB hosts all Zea mays genome assemblies that meet a minimum set of requirements. See more information about submitting a genome assembly to MaizeGDB here. MaizeGDB provides three levels of support:
Genome downloads
Flow cytometry for selected lines.
Definition of terms
See the right-hand side bar for more terms and explanations.
About genome assemblies
Contig: a set of overlapping DNA segments that together represent a consensus
region of DNA. In bottom-up sequencing projects, a contig refers to overlapping
sequence data (reads); in top-down sequencing projects, contig refers to the
overlapping BAC (Bacterial Artificial Chromosome) clones that form a physical map of
the genome that is used to guide sequencing and assembly.
Gaps: Regions of unknown sequence, usually represented as a run of Ns. Gaps can be of known or unknown length. Gaps of known length are often filled with the number of Ns representing the missing length of sequence: for instance, if the gap is known to be 150 bp in size, then 150 Ns will span the gap. These often represent repeat elements that could not be sequenced through. Gaps of unknown size are usually represented by a fixed number of Ns; 100 is what GenBank prefers. Gaps can be found in scaffolds between contig joins, and in pseudomolecules between scaffold joins. More information can be found here.
Genome assembly: The process of assembling contigs into larger pieces called scaffolds. One efficient way of assembling scaffolds is to use an optical map, which maps restriction sites to the genome and is used to anchor scaffolds based on unique restriction site location patterns.
Genome sequencing: Short reads are sequenced and then their overlapping sequence is assembled into longer pieces called contigs. Companies like PacBio can now make reads that are >10 kb long.
Pseudomolecule: Chromosome sequence that is made up of assembled scaffolds.
Pseudomolecule assembly: The process of orienting scaffolds to create whole chromosomes. This orientation can be done using physical map data that corresponds to genetic map data for your genome of interest (e.g., SNPs that are associated with genetic markers). Alternatively, orientation can be done using syntenic data between your assembly and a reference genome (also called reference-guided assembly). This can be complicated, since any inversions or assembly errors unique to the reference genome can be incorrectly propagated into your assembly. However, it is sometimes the best method if no genetic map for your genome is available.
Scaffold: Sequence that is made up of assembled contigs. Scaffolds can span many megabases.
Download this helpful list of terms from Pacific Biosciences.
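The relationship between gaps, contigs, and scaffolds described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a scaffold is available as a plain string in which gaps are runs of N (upper or lower case); the function names `find_gaps` and `split_into_contigs` are hypothetical and not part of any MaizeGDB tool.

```python
import re

# A gap is assumed to be any run of one or more N characters.
GAP = re.compile(r"[Nn]+")


def find_gaps(sequence):
    """Return (start, end, length) for each run of Ns.

    Coordinates are 0-based with an exclusive end.
    """
    return [(m.start(), m.end(), m.end() - m.start()) for m in GAP.finditer(sequence)]


def split_into_contigs(scaffold):
    """Split a scaffold at its gaps, returning the contig sequences in order."""
    return [piece for piece in GAP.split(scaffold) if piece]


if __name__ == "__main__":
    # Toy scaffold: three contigs joined by a 100-N gap and a 150-N gap.
    scaffold = "ACGT" + "N" * 100 + "GGCC" + "N" * 150 + "TT"
    for start, end, length in find_gaps(scaffold):
        print(f"gap at {start}-{end}, {length} Ns")
    print(split_into_contigs(scaffold))
```

A run of exactly 100 Ns may be a placeholder for a gap of unknown size rather than a measured 100 bp gap, so the length alone does not say which kind of gap it is.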