Tag: GenBank

GenBank Release 261.0 is Available!

GenBank release 261.0 (6/18/2024) is now available on the NCBI FTP site. This release has 32.04 trillion bases and 4.51 billion records.

The current release has:

  • 251,094,334 traditional records containing 3,387,240,663,231 base pairs of sequence data
  • 3,380,877,515 WGS records containing 27,900,199,328,333 base pairs of sequence data
  • 746,753,803 bulk-oriented TSA records containing 695,405,769,319 base pairs of sequence data
  • 135,446,337 bulk-oriented TLS records containing 54,512,778,803 base pairs of sequence data 
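
The headline totals can be checked against the four categories listed above. The short Python sketch below sums the record and base-pair counts; the dictionary name and its keys are illustrative labels, not NCBI identifiers, and the WGS base-pair figure is read as 27,900,199,328,333.

```python
# Record and base-pair counts copied from the Release 261.0 list above.
# The dictionary name and keys are illustrative labels only.
release_261 = {
    "traditional": (251_094_334, 3_387_240_663_231),
    "WGS": (3_380_877_515, 27_900_199_328_333),  # base pairs read as 27,900,199,328,333
    "TSA": (746_753_803, 695_405_769_319),
    "TLS": (135_446_337, 54_512_778_803),
}

# Sum each column across the four categories.
total_records = sum(records for records, _ in release_261.values())
total_bases = sum(bases for _, bases in release_261.values())

print(f"{total_records / 1e9:.2f} billion records")   # 4.51 billion records
print(f"{total_bases / 1e12:.2f} trillion bases")     # 32.04 trillion bases
```

Both sums round to the figures quoted in the announcement.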

Continue reading “GenBank Release 261.0 is Available!”

New Data Available! Access Avian Influenza A (H5N1) Virus Sequences at NCBI

Sequence data from the ongoing avian influenza A (H5N1) virus outbreak in cattle are now available through NLM’s NCBI resources NCBI Virus and NCBI Datasets.

These data were submitted by the U.S. Department of Agriculture (USDA), U.S. Centers for Disease Control and Prevention (CDC), the World Health Organization (WHO), Iowa State University, and St. Jude Children’s Research Hospital

Continue reading “New Data Available! Access Avian Influenza A (H5N1) Virus Sequences at NCBI”

International Nucleotide Sequence Database Collaboration (INSDC) Introduces Enhanced Website

Aims to broaden INSDC membership and attract diverse new members

The National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM) and the other Founding Members of the International Nucleotide Sequence Database Collaboration (INSDC) have enhanced the INSDC website, www.insdc.org, to provide comprehensive information on how interested parties from around the world can evaluate their readiness to participate in the INSDC. This effort supports INSDC’s aim to broaden membership and attract qualified nucleotide sequence databases. Web content now includes a formalized Founders Arrangement and a Membership Arrangement, along with other updated information about the INSDC mission, vision, governance, and technical documentation. INSDC encourages interested parties to visit the website to learn more.

Continue reading “International Nucleotide Sequence Database Collaboration (INSDC) Introduces Enhanced Website”

Automated Lineage Definitions Now Available in NCBI Virus SARS-CoV-2 Variants Overview

Recently, the NCBI Virus SARS-CoV-2 Variants Overview moved from a manual to an automated process for selecting the mutations required to define a lineage (e.g., Omicron, BA.2, JN.1). With this update, the SARS-CoV-2 Variants Overview covers all SARS-CoV-2 lineages and is no longer limited to lineages with CDC status. The SARS-CoV-2 Variants Overview website reports results from analyzing both GenBank and unassembled Sequence Read Archive (SRA) sequence data. It allows you to view geographic and frequency trends of records assigned to Pango lineages and to search for sequence records using lineage-defining or other mutations (example shown in Figure 1).

Continue reading “Automated Lineage Definitions Now Available in NCBI Virus SARS-CoV-2 Variants Overview”

GenBank Release 260.0 is Available!

GenBank release 260.0 (4/19/2024) is now available on the NCBI FTP site. This release has 31.18 trillion bases and 4.46 billion records.

The current release has:

  • 250,803,006 traditional records containing 3,213,818,003,787 base pairs of sequence data
  • 3,333,621,823 WGS records containing 27,225,116,587,937 base pairs of sequence data
  • 741,066,498 bulk-oriented TSA records containing 689,648,317,082 base pairs of sequence data
  • 135,115,766 bulk-oriented TLS records containing 53,492,243,256 base pairs of sequence data

Continue reading “GenBank Release 260.0 is Available!”

Foreign Contamination Screen Tool: Now Available in Galaxy!

Check out our latest enhancements 

Do you submit genome assembly data to GenBank? If so, try out NCBI’s Foreign Contamination Screen (FCS) tool, a quality assurance process that you can run yourself. We will screen all prokaryotic and eukaryotic genome submissions to GenBank with this tool, but we encourage you to screen your data before submitting to save time. FCS offers sensitive contaminant detection to increase the quality of your genome submissions to GenBank. As part of our ongoing effort to improve your experience, we recently made several enhancements.

Continue reading “Foreign Contamination Screen Tool: Now Available in Galaxy!”

GenBank Release 259.0 is Available!

GenBank release 259.0 (12/22/2023) is now available on the NCBI FTP site. This release has 27.94 trillion bases and 3.96 billion records.

The current release has:

  • 247,777,761 traditional records containing 2,433,391,164,875 base pairs of sequence data
  • 2,775,205,599 WGS records containing 23,600,199,887,231 base pairs of sequence data
  • 701,336,089 bulk-oriented TSA records containing 659,924,904,311 base pairs of sequence data
  • 130,654,568 bulk-oriented TLS records containing 50,868,407,906 base pairs of sequence data

Continue reading “GenBank Release 259.0 is Available!”

Update to GenBank Qualifier

‘Country’ will transition to ‘Geographic Location’ effective June 2024

As announced earlier this year, we will begin to systematically gather ‘location of collection’ and ‘date and time of collection’ for sequence data submitted to GenBank and the Sequence Read Archive (SRA).

As part of this effort and to make location data more accurate and informative, we are also changing the way this information is represented on GenBank records, consistent with the relevant field in BioSample.

Continue reading “Update to GenBank Qualifier”

GenBank Release 258.0 is Available!

GenBank release 258.0 (11/2/2023) is now available on the NCBI FTP site. This release has 26.74 trillion bases and 3.85 billion records.

The current release has:

  • 247,777,761 traditional records containing 2,433,391,164,875 base pairs of sequence data
  • 2,775,205,599 WGS records containing 23,600,199,887,231 base pairs of sequence data
  • 701,336,089 bulk-oriented TSA records containing 659,924,904,311 base pairs of sequence data
  • 130,654,568 bulk-oriented TLS records containing 50,868,407,906 base pairs of sequence data 

Continue reading “GenBank Release 258.0 is Available!”

Upcoming Changes to Virus Data Resources at NCBI

Effective June 2024, NCBI Virus will replace legacy virus web resources 

Coming soon! As part of our ongoing effort to enhance your experience and modernize our services, several of our legacy virus-related web resources will be replaced by NCBI Virus – our community portal for viral sequence data. NCBI Virus is more comprehensive and more modern than our legacy resources, and it offers more powerful features and analysis tools.

What will change?

Below is a list of the legacy virus resources that will be replaced by NCBI Virus. The list includes a description of features that will continue to be supported through NCBI Virus:

Continue reading “Upcoming Changes to Virus Data Resources at NCBI”