Tag: Comparative Genomics Resource (CGR)

New! Introducing the Multiple Comparative Genome Viewer (MCGV) Beta Release

NLM’s NCBI is excited to introduce the Multiple Comparative Genome Viewer (MCGV), a new tool in active development that allows you to visualize an alignment of multiple eukaryotic genomes. While our existing Comparative Genome Viewer (CGV) allows you to compare pairs of eukaryotic assemblies, the new MCGV tool can help you analyze multiple assemblies in a single view.  

MCGV displays are based on multiple whole-genome sequence alignments. You can navigate these alignments in the viewer to track evolutionarily related regions across strains or species. Zoom in on a particular genome region to investigate how differences in genome structure may have contributed to differences in gene sequence and function. You can access MCGV by clicking the “Visualize gene across species” link on the gene search results page.

NCBI Resources Highlighted in 2025 Nucleic Acids Research Database Issue

The 2025 Nucleic Acids Research Database Issue features papers from NCBI staff on ClinVar, PubChem, GenBank, RefSeq, and more. The citations are available in PubMed, with full text available in PubMed Central (PMC). To read an article, click on the PMCID listed below.

Database resources of the National Center for Biotechnology Information in 2025

PMCID: PMC11701734

NCBI provides online information resources for biology, including the GenBank® nucleic acid sequence repository and the PubMed® repository of citations and abstracts published in life science journals. NCBI is currently developing the NIH Comparative Genomics Resource (CGR) to facilitate reliable comparative genomics analyses with an NCBI Toolkit and community collaboration.

RefSeq Release 228 is Available!

Check out RefSeq release 228, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release?

As of January 3, 2025, this full release incorporates genomic, transcript, and protein data containing:

  • 513,096,240 records, including
  • 391,903,900 proteins
  • 67,997,702 RNAs
  • Sequences from 162,138 organisms 

Try Out a Development Version of NCBI’s Publicly Available Annotation Tool, EGAPx

Latest release now available 

Are you generating genomes for vertebrates, arthropods, or plants, and looking for a way to produce high-quality genome annotation? NCBI is working on a public version of the NCBI Eukaryotic Genome Annotation Pipeline (EGAPx), and the latest development release is now available for testing and feedback.

RefSeq Release 227 is Available!

Check out RefSeq release 227, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release?

As of November 4, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 497,549,107 records, including
  • 377,783,847 proteins
  • 66,987,567 RNAs
  • Sequences from 159,324 organisms 

Expansion of Ortholog Data for RefSeq Arthropods

250K+ new Hymenoptera orthologs added 

NCBI is excited to announce the expansion of ortholog data for RefSeq arthropods. This update expands the breadth of arthropod orthology information, offering new insights into evolutionary biology, gene function, and shared pathways. Whether you’re studying insect genetics, developmental biology, or comparative genomics, the expanded ortholog data opens up new possibilities for research. Check out our previous blog post to learn how to access the orthologs using NCBI Datasets.

Access Public Reports of Foreign Contamination Screen (FCS) Tool Results

Do you use genomes from NCBI and worry that they may contain contaminant sequences? Now you can view reports generated for all prokaryotic and eukaryotic genomes with NCBI’s quality assurance tool, Foreign Contamination Screen (FCS), to better understand possible issues that may affect your studies.

What reports are available? 
  • Summary reports to select better assemblies at thresholds of your choosing. 
  • Detailed reports to remove or mask contaminant sequences so they don’t adversely affect analyses. This is particularly useful for building k-mer databases. 
  • Individual assembly reports available through the FTP link located on NCBI Datasets genome pages.
  • Reports are available for all eukaryotic and prokaryotic GenBank and RefSeq assemblies, currently covering over 2.7 million assemblies. 
  • A README to understand how to interpret and use contamination reports. 

RefSeq Release 226 is Available!

Check out RefSeq release 226, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release?

As of September 13, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 472,512,852 records
  • 355,355,673 proteins
  • 65,576,846 RNAs
  • Sequences from 155,792 organisms
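
The three release summaries on this page (226, 227, and 228) give enough numbers to work out how much RefSeq grew from one release to the next. The sketch below is only an illustration of that arithmetic, not an NCBI tool: the counts are copied from the posts above, and the names `RELEASES` and `growth` are made up for this example.

```python
# Counts copied from the RefSeq release 226, 227, and 228 summaries on this page.
RELEASES = {
    226: {"records": 472_512_852, "proteins": 355_355_673, "rnas": 65_576_846, "organisms": 155_792},
    227: {"records": 497_549_107, "proteins": 377_783_847, "rnas": 66_987_567, "organisms": 159_324},
    228: {"records": 513_096_240, "proteins": 391_903_900, "rnas": 67_997_702, "organisms": 162_138},
}


def growth(prev: dict, curr: dict) -> dict:
    """Return {field: (absolute change, percent change)} between two releases."""
    return {
        field: (curr[field] - prev[field], 100.0 * (curr[field] - prev[field]) / prev[field])
        for field in curr
    }


if __name__ == "__main__":
    numbers = sorted(RELEASES)
    # Compare each release with the one immediately before it.
    for older, newer in zip(numbers, numbers[1:]):
        print(f"Release {older} -> {newer}")
        for field, (delta, pct) in growth(RELEASES[older], RELEASES[newer]).items():
            print(f"  {field:<10} {delta:>+12,} ({pct:+.2f}%)")
```

Running the script prints the absolute and percent change in records, proteins, RNAs, and organisms for each pair of consecutive releases.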

NCBI’s First-Ever BioEd Summit Was a Success!

NCBI hosted its first-ever BioEd Summit: Crafting Student-Centric Curricula with NCBI resources. This week-long, in-person event for science educators across the U.S. was held on the National Institutes of Health (NIH) campus in Bethesda, MD, from August 5 to 9, 2024.

Event Details 

During the week, educators participated in morning sessions that included interactive workshops on NCBI educational curricular design, the use of various NCBI resources in teaching, and detailed hands-on discussions and practice with NCBI tools. A panel discussion on employing novel, data-driven, active learning exercises in science classes featured leaders from several institutions.

Access and Download Sequence Data and Metadata Using NCBI Datasets

Goodbye Assembly and Genome, hello NCBI Datasets!

Exciting news! NCBI has streamlined and modernized how you access and download genome, taxonomy, and gene information with NCBI Datasets. As previously announced, NCBI Datasets is replacing the legacy Genome and Assembly resources, providing you with a single entry point to genome datasets. Effective today, the legacy pages are retired and no longer available.

Please note that there will be no changes to how you programmatically access the databases using E-Utilities or EDirect.
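
For readers who have not used E-Utilities before, here is a minimal sketch of what programmatic access looks like. It only builds an ESearch request URL and does not contact NCBI. The helper name `esearch_url`, the choice of the `assembly` database, and the organism query are assumptions made for illustration; they do not come from the post.

```python
from urllib.parse import urlencode

# Base address of the E-Utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL (hypothetical helper; it does not send the request)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query: assembly records for the honey bee.
    print(esearch_url("assembly", "Apis mellifera[Organism]"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching record identifiers as JSON.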