
Genome survey sequencing and genetic background characterization of yellow horn based on next-generation sequencing

  • Original Article
  • Molecular Biology Reports

Abstract

Yellowhorn (Xanthoceras sorbifolium Bunge) is an important woody oil tree species with high ornamental and medicinal value. Nevertheless, genomic information for yellowhorn is currently unavailable. Here, for the first time, we conducted a genome survey of two yellowhorn cultivars, Zhongshi 4 and Zhongshi 9, which differ distinctly in phenotype and drought resistance, using next-generation sequencing (NGS). Genome size was also estimated by flow cytometry. The whole-genome survey of Zhongshi 4 and Zhongshi 9 generated 34.40 and 39.55 GB of sequence data, respectively. The estimated genome sizes of Zhongshi 4 and Zhongshi 9 were about 536.58 Mb and 569.52 Mb, close to the flow cytometry results. The heterozygosity rates were calculated to be 0.75% and 0.89%, and the repeat rates were 60.08% and 62.00%. The reads were assembled into 1,024,373 and 885,404 contigs with N50 lengths of 1005 bp and 1219 bp, respectively, which were further assembled into 714,369 and 686,128 scaffolds with scaffold N50 lengths of ~1963 bp and ~1938 bp and total lengths of 386,915 Kb and 391,904 Kb. These results indicate little difference in genome size and complexity between the two cultivars. In addition, 63,137 and 65,271 high-quality genomic simple sequence repeat (SSR) markers were generated for Zhongshi 4 and Zhongshi 9, respectively. We suggest that a combination of Illumina and PacBio sequencing, assisted by Hi-C and matching assembly software, be used to sequence the genome of one of the two yellowhorn cultivars. These results will help in designing whole-genome sequencing strategies for yellowhorn and provide a large amount of gene resources for its further exploration and utilization.
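For readers unfamiliar with the assembly statistics quoted in the abstract, the following minimal Python sketch illustrates how an N50 length is conventionally computed and how the reported scaffold totals compare with the estimated genome sizes. It is an illustration only, not the authors' pipeline: the function names `n50` and `assembled_fraction` and the toy length list are hypothetical, and the unit conversion assumes 1 Mb = 1000 Kb.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembled_fraction(assembly_kb, genome_mb):
    """Fraction of the estimated genome covered by the assembly.

    Assumption (not stated in the article): 1 Mb = 1000 Kb.
    """
    return assembly_kb / (genome_mb * 1000)


# Toy contig lengths (hypothetical, for illustration only).
toy_lengths = [5000, 3000, 2000, 1000, 500, 500]
print(n50(toy_lengths))  # 3000

# Figures taken from the abstract: scaffold total (Kb), genome size (Mb).
for name, total_kb, size_mb in [
    ("Zhongshi 4", 386_915, 536.58),
    ("Zhongshi 9", 391_904, 569.52),
]:
    print(f"{name}: {assembled_fraction(total_kb, size_mb):.1%}")
```

Under these assumptions, the scaffolds account for roughly 72% (Zhongshi 4) and 69% (Zhongshi 9) of the estimated genome sizes, which is in line with a short-read survey assembly of a genome with a repeat rate above 60%.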

Data availability

The genome sequence reads obtained by Illumina HiSeq are available at NCBI-SRA. The BioProject accession number is PRJNA483857 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA483857), and the BioSample accession numbers are SAMN10980100 (Zhongshi 9) (https://www.ncbi.nlm.nih.gov/biosample/SAMN10980100) and SAMN09748200 (Zhongshi 4) (https://www.ncbi.nlm.nih.gov/biosample/SAMN10980200). The Experiment number is SRX5401612, and the Run numbers are SRR8601604 (Zhongshi 9) and SRR7768199 (Zhongshi 4).

Funding

This work was financially supported by the Central Public-Interest Scientific Institution Basal Research Fund (CAFYBB2019QD001, CAFYBB2017QB001) and the National Natural Science Foundation of China (31800571, 31870594).

Author information

Corresponding author

Correspondence to Libing Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLSX 8973 kb)

Supplementary material 2 (XLSX 9454 kb)

About this article

Cite this article

Bi, Q., Zhao, Y., Cui, Y. et al. Genome survey sequencing and genetic background characterization of yellow horn based on next-generation sequencing. Mol Biol Rep 46, 4303–4312 (2019). https://doi.org/10.1007/s11033-019-04884-7

  • DOI: https://doi.org/10.1007/s11033-019-04884-7
