Tag: Datasets

Comparing Yeast Species Used in Beer Brewing and Bread Making

Using the NIH Comparative Genomics Resource (CGR) to gain knowledge about less-researched organisms 

The scientific community relies heavily on model organism research to gain knowledge and make discoveries. However, focusing solely on these species misses valuable variation. Comparative genomics allows us to use knowledge from a model species, such as Saccharomyces cerevisiae, to understand traits in other, related organisms, such as Saccharomyces pastorianus or Saccharomyces eubayanus. Applying this information may provide valuable insight for other less-researched organisms. The National Institutes of Health (NIH) Comparative Genomics Resource (CGR) offers a cutting-edge NCBI toolkit of high-quality genomics data and tools to help you do just that.

RefSeq Release 220

RefSeq release 220 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of September 5, 2023, this full release incorporates genomic, transcript, and protein data containing:

  • 391,350,361 records
  • 289,333,423 proteins
  • 56,423,426 RNAs
  • sequences from 141,099 organisms 

Which animals can catch and transmit human viral infections?

Using the NIH Comparative Genomics Resource (CGR) to understand susceptibility to SARS-CoV-2 and other infections 

Are you conducting research on animal-mediated transmission of human viral infections, such as COVID-19? The National Institutes of Health (NIH) Comparative Genomics Resource (CGR) offers a cutting-edge NCBI toolkit of high-quality genomics data and tools to help with comparative genomics analysis of eukaryotic genes, such as angiotensin-converting enzyme 2 (ACE2), which is targeted by SARS-CoV-2.

NCBI resources have helped the scientific community understand viral infections associated with public health crises, such as COVID-19 and influenza, and can be used to study emerging viruses that may represent new threats.

NCBI at the Biodiversity Genomics Academy 2023 (BGA23)

Virtual Talks, September 14, 2023

NCBI will be presenting virtually at the Biodiversity Genomics Academy 2023 (BGA23) on September 14, 2023. Our short, interactive talks will focus on NCBI Datasets and the Comparative Genome Viewer (CGV). Both resources are part of the NIH Comparative Genomics Resource (CGR), which facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.

Recordings will be made available post-event!

New Annotations in RefSeq!

In April, May, and June, the NCBI Eukaryotic Genome Annotation Pipeline released eighty-two new annotations in RefSeq!

Highlights:

  • Homo sapiens (human) T2T-CHM13v2.0 now includes many more alternative splice variants
  • Homo sapiens (human) GRCh38.p14 includes all transcripts from MANE v1.2, and includes over 78,000 new RefSeq Functional Element (RefSeqFE) features added since our last annotation in 2022
  • Mus musculus (house mouse) GRCm39 integrates curation for over 3,000 genes and 14,000 transcripts since September 2020
  • Rattus norvegicus (Norway rat) mRatBN7.2 includes curation of over 5,000 genes since our last annotation in 2021

New & Improved NCBI Datasets Genome and Assembly Pages 

Legacy pages now redirect 

Effective July 10, 2023, NCBI’s Assembly and Genome record pages now redirect to new NCBI Datasets pages. As previously announced, these updates are part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.   

The following pages have been updated:
  • The NCBI Assembly record pages now redirect to the new NCBI Datasets Genome record pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
  • The NCBI Genome record pages now redirect to the NCBI Datasets Taxonomy record pages that provide a taxonomy-focused portal to genes, genomes, and additional NCBI resources.

During this transition, you will have the option to return to the legacy Genome and Assembly record pages. We will remove the legacy pages in early 2024.

Now Available! Access Data from the Human Pangenome Research Consortium (HPRC) at NCBI

Have you ever wondered how your genetic make-up is different from your neighbor’s? The National Human Genome Research Institute (NHGRI)-funded Human Pangenome Research Consortium (HPRC) has built an initial version of a pangenome reference – a collection of new human reference genome sequences representing 47 individuals from across the globe. Pangenome graphs relate the sequences from the different genomes to one another. The pangenome allows researchers to compare these DNA sequences and get a more detailed view of the range of human genetic variation. This is the first step toward the HPRC’s goal of building a pangenome reference comprising the genomes of 350 individuals from diverse genetic backgrounds.

Important Update! Changes to ASSEMBLY_REPORTS and GENOME_REPORTS on FTP

Do you currently access genome assembly data through the FTP site? We are consolidating information provided in the ASSEMBLY_REPORTS and GENOME_REPORTS directories on the genomes FTP site to simplify access and ensure that you have the most accurate, up-to-date, and consistently reported data.

The assembly_summary files in the ASSEMBLY_REPORTS directory are gaining information in newly added columns 24-38, including statistics about the assembly (size, GC content, genome size, and number of sequences) as well as details about the provided annotation (number of genes, annotation name and date). See example below (Table 1). Check out the README for more details about the contents of the summary files.
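
As a rough illustration of working with these summary files, here is a minimal Python sketch that reads a tab-delimited assembly_summary-style file and pulls out a few of the assembly statistics. It assumes that comment lines begin with "#", that the last such line before the data holds the column names, and that missing values are written as "na". The column names and rows in the sample are illustrative placeholders, not real records, so please check the README for the actual column definitions.

```python
import io

# Placeholder sample in an assembly_summary-like layout. The column names and
# the values are assumptions made for illustration; they are not real records.
SAMPLE = (
    "## See the README for a description of the columns\n"
    "#assembly_accession\torganism_name\tgenome_size\tgc_percent\ttotal_gene_count\n"
    "GCF_000000000.1\tExample organism A\t12000000\t38.5\t6000\n"
    "GCF_000000001.1\tExample organism B\t3000000000\t41.0\tna\n"
)


def read_assembly_summary(handle):
    """Return one dict per data row, keyed by the column names.

    The last line starting with '#' that precedes the data is treated as
    the header (an assumption about the file layout).
    """
    header = None
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Later comment lines overwrite earlier ones, so the header
            # ends up being the comment line closest to the data.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the first data row")
        rows.append(dict(zip(header, line.split("\t"))))
    return rows


def to_number(value):
    """Convert a field to float, returning None for empty or 'na' values."""
    if value is None or value.strip().lower() in ("", "na"):
        return None
    return float(value)


rows = read_assembly_summary(io.StringIO(SAMPLE))
for row in rows:
    print(
        row["assembly_accession"],
        to_number(row["genome_size"]),
        to_number(row["gc_percent"]),
        to_number(row["total_gene_count"]),
    )
```

To use this on a downloaded file, pass an open file handle instead of the in-memory sample, and swap in the column names listed in the README.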

Download Assembled Genome Data Programmatically with NCBI Datasets

As previously announced, NCBI’s Assembly and Genome record pages will be redirected to new NCBI Datasets pages in June 2023. The NCBI Datasets Command Line Interface (CLI) tools provide easy, straightforward programmatic downloads of assembled genome sequence data. We invite you to check them out and let us know what you think! 

Features & Benefits of NCBI Datasets
  • Get assembled genome sequence, annotation, and metadata, including transcripts and proteins, in one easy step. 
  • Querying is easy and flexible! Retrieve data using organism name, assembly accession, or BioProject accession. 
  • Request data for multiple assemblies in one request – it is now simpler and faster to download large amounts of data. 
  • Metadata is derived from multiple databases and metadata schemas are documented. 
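
The features listed above can be sketched as a small Python helper that assembles a `datasets download genome` command line without running it. This is a sketch under stated assumptions, not official usage: the helper name `build_download_command` is hypothetical, and the subcommands and flags shown (`accession`, `taxon`, `--include`, `--filename`) reflect our understanding of recent CLI versions, so please confirm them with `datasets download genome --help` for your installed version.

```python
import shlex


def build_download_command(
    query,
    by="accession",
    include=("genome", "gff3", "protein"),
    filename="ncbi_dataset.zip",
):
    """Build (but do not run) an argv list for `datasets download genome`.

    `query` is one identifier or a list of identifiers. Assembly and
    BioProject accessions both go through the `accession` subcommand;
    an organism name goes through `taxon`.
    """
    if by not in ("accession", "taxon"):
        raise ValueError("by must be 'accession' or 'taxon'")
    queries = [query] if isinstance(query, str) else list(query)
    if not queries:
        raise ValueError("at least one query is required")
    if by == "taxon" and len(queries) != 1:
        # Assumption: one taxon per request.
        raise ValueError("provide exactly one organism name for a taxon query")
    return [
        "datasets", "download", "genome", by, *queries,
        "--include", ",".join(include),
        "--filename", filename,
    ]


# Several assemblies in one request (yeast S288C and human GRCh38.p14).
multi = build_download_command(["GCF_000146045.2", "GCF_000001405.40"])
print(shlex.join(multi))

# Query by organism name instead of accession.
by_name = build_download_command(
    "Saccharomyces cerevisiae", by="taxon", filename="yeast.zip"
)
print(shlex.join(by_name))
```

Building the argument list separately from running it keeps the request easy to inspect. To execute it, the list can be handed to `subprocess.run(multi, check=True)` on a machine where the CLI is installed.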

Revolutionize your research with the NIH Comparative Genomics Resource (CGR)

Unlock the full potential of eukaryotic research organisms and their genomic data with the National Institutes of Health (NIH) Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses through community collaboration as well as an NCBI toolkit of interconnected, interoperable data and tools.   

Comparative genomics is a field of study that uses the genomes of many different organisms to help us understand basic biological processes and human disease. NCBI is developing CGR to help researchers take full advantage of the rapidly growing number of eukaryotic organisms that, due to recent technological advances, now have sequenced genomes and associated data that can be used in these types of studies. Its NCBI toolkit offers new and modern resources for such analyses, and its emphasis on community collaboration brings new opportunities to share and connect data.