Tag: Gene

Ortholog Groups Added for ~2 Million Insect Genes

Find evolutionarily related genes across insects and other arthropods on our new Ortholog webpages

NCBI recently released a set of orthologs for approximately 2 million insect genes. You can now find and access the orthologous genes, transcripts, and proteins by searching a species and gene name in NCBI All Databases, NCBI Gene, or NCBI Datasets. As previously described, these orthologs are based on comparisons to the Drosophila melanogaster annotated genome. Using Drosophila gene nomenclature for orthologs should lead to more informative gene symbols for insects and other arthropods.

HomoloGene Now Redirects to NCBI Datasets Gene

A new way to view and download related genes 

As previously announced, HomoloGene now automatically redirects to the NCBI Datasets Gene page, giving you easy access to up-to-date sequence and homology data. The NCBI Datasets Gene Table provides a link to NCBI Orthologs, which offers expanded gene and protein information and links to tools. NCBI Orthologs includes more genes and sequences for a growing range of taxa. See an example below. Legacy HomoloGene data remains available on the FTP site.

Using NCBI Data and Tools for Your Research Project

Are you a biology student working on a research project? NCBI offers free access to a wide variety of resources and tools to help you find and download data for your project. 

How and why do you use our resources? Check out the example below:

Your professor has assigned you a research project looking at the sequence and structure of the TP53 gene in the domestic cat (Felis catus). In addition, you were asked to find information on this gene and its genomic region in other members of the cat family (Felidae).
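For readers who prefer to script this kind of lookup, here is a minimal sketch that builds a request URL for the TP53 gene in Felis catus. It assumes the NCBI Datasets v2 REST API exposes gene metadata at a `/gene/symbol/{symbol}/taxon/{taxon}` path. The base URL, the path, and the helper name `gene_by_symbol_url` are assumptions for illustration and are not taken from the post. The code only builds the URL and sends no request.

```python
from urllib.parse import quote

# Assumed base URL of the NCBI Datasets v2 REST API (not stated in the post).
BASE_URL = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def gene_by_symbol_url(symbol, taxon):
    """Build a gene-metadata URL for a gene symbol in a given taxon.

    The path layout is an assumption; percent-encoding keeps spaces in
    names such as "Felis catus" valid inside the URL path.
    """
    return (
        f"{BASE_URL}/gene/symbol/{quote(symbol, safe='')}"
        f"/taxon/{quote(taxon, safe='')}"
    )


url = gene_by_symbol_url("TP53", "Felis catus")
print(url)
```

Any HTTP client could then fetch the resulting URL. Check the current Datasets documentation before relying on the path.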

Now Available! Compare NCBI RefSeq and UniProt Datasets

Do you need to compare and combine data based on NCBI RefSeq and UniProt datasets, and aren’t sure which proteins are comparable? For many years, NCBI Gene has provided information about the relationships between RefSeq and UniProt accessions courtesy of data imported from UniProt, but the tremendous growth of both datasets has led to large gaps in the data. We have developed a new process to compare the two datasets, first looking for 100% identical proteins and then checking the remaining sequences for similar matches in related taxa. The result is mapping information now covering over 170 million RefSeq proteins across the tree of life. 

You can find links to related UniProt accessions on individual NCBI Gene records. The entire dataset is available on our FTP site.
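The two-step comparison described above (identical proteins first, then similar matches for whatever remains) can be illustrated with a toy sketch. This is not NCBI's actual process. `difflib.SequenceMatcher` stands in for a real sequence-similarity search, the 0.9 threshold is arbitrary, the restriction to related taxa is left out, and the accessions and sequences are invented.

```python
from difflib import SequenceMatcher


def map_proteins(refseq, uniprot, min_similarity=0.9):
    """Map RefSeq accessions to UniProt accessions in two passes.

    Both arguments are dicts of accession -> protein sequence.
    Returns a dict of RefSeq accession -> (UniProt accession, match kind).
    """
    # Index UniProt sequences so exact matches are a dictionary lookup.
    by_sequence = {}
    for accession, sequence in uniprot.items():
        by_sequence.setdefault(sequence, accession)

    mapping = {}
    unmatched = []

    # Pass 1: 100% identical sequences.
    for accession, sequence in refseq.items():
        if sequence in by_sequence:
            mapping[accession] = (by_sequence[sequence], "identical")
        else:
            unmatched.append(accession)

    # Pass 2: best similar match above the threshold for the remainder.
    for accession in unmatched:
        best_accession, best_score = None, 0.0
        for candidate, sequence in uniprot.items():
            score = SequenceMatcher(
                None, refseq[accession], sequence, autojunk=False
            ).ratio()
            if score > best_score:
                best_accession, best_score = candidate, score
        if best_accession is not None and best_score >= min_similarity:
            mapping[accession] = (best_accession, "similar")

    return mapping


# Invented toy data: R1 has an identical partner, R2 a near match, R3 none.
toy_refseq = {"R1": "MKTAYIAKQR", "R2": "MKTAYIAKQK", "R3": "GGGGGGGGGG"}
toy_uniprot = {"U1": "MKTAYIAKQR", "U2": "MKTAYIAKQKL"}
print(map_proteins(toy_refseq, toy_uniprot))
```

Running the exact pass first keeps the slower similarity comparison to the sequences that really need it.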

New Way to View and Download Related Genes

Effective June 2023, the HomoloGene records will redirect to the Datasets Gene Table

Important Note: The redirect from HomoloGene to NCBI Datasets is delayed until the beginning of 2024.

Do you use HomoloGene to view and download data? You can now access updated homology data from NCBI Datasets through the Datasets Gene Table with connections to NCBI Orthologs. Go directly from a HomoloGene record to the Datasets Gene Table that will give you access to up-to-date sequence data and metadata. NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.

The Datasets Gene Table links to the NCBI Ortholog interface (Figure 1), which provides the following data:

  • Orthology data based on an updated algorithm that identifies orthologs spanning more than 500 vertebrate species
  • Similar-gene data, based on protein architectures, that spans all eukaryotes

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We are looking forward to seeing you in person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research. If you’re interested in providing feedback that will help drive CGR forward, consider joining our round table discussion.

Check out NCBI’s schedule of activities and events: 

Continue reading “Connect with NCBI at ASHG 2022”

New Gene Information from the Alliance of Genome Resources

NCBI Gene now has descriptive information about genes from the Alliance of Genome Resources for organisms including Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus, Rattus norvegicus, and Saccharomyces cerevisiae.

Figure 1. The gene summary section of the Drosophila melanogaster slmb Gene Full Report showing the link to the corresponding record at the Alliance of Genome Resources.

The Summary section of the Gene Full Report page has links to gene pages at the Alliance of Genome Resources (Figure 1). These links are also in the Links to other resources section in the right-hand sidebar. For genes that don’t have a RefSeq summary, we use the textual gene descriptions from the Alliance of Genome Resources.

The Drosophila slmb gene record shows the enhancements provided by the Alliance of Genome Resources. The gene_info.gz files on the Gene FTP site also include AllianceGenome references in the dbXrefs column.
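As a rough illustration of how the AllianceGenome references could be pulled out of a dbXrefs value, the sketch below assumes the field is a pipe-separated list of `database:identifier` entries, with `-` marking an empty field. The helper names and the sample string are illustrative only. Check the sample against a current gene_info.gz file before relying on it.

```python
def parse_dbxrefs(field):
    """Split a dbXrefs field into (database, identifier) pairs.

    Assumes entries are separated by "|" and that the database name is
    everything before the first ":" (identifiers may contain colons).
    """
    if field in ("", "-"):
        return []
    pairs = []
    for entry in field.split("|"):
        database, _, identifier = entry.partition(":")
        pairs.append((database, identifier))
    return pairs


def alliance_ids(field):
    """Return only the identifiers tagged with the AllianceGenome database."""
    return [
        identifier
        for database, identifier in parse_dbxrefs(field)
        if database == "AllianceGenome"
    ]


# Illustrative dbXrefs value; not copied from a specific file release.
sample = "MIM:138670|HGNC:HGNC:5|Ensembl:ENSG00000121410|AllianceGenome:HGNC:5"
print(alliance_ids(sample))  # prints ['HGNC:5']
```

Splitting on the first colon only matters because identifiers such as `HGNC:5` contain a colon themselves.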

New NCBI Gene Ensembl Comparison Expansion

NCBI Gene has added Ensembl Rapid Releases to the calculation of matching annotations between NCBI RefSeq and Ensembl. This has resulted in the inclusion of over 60 additional assemblies for a total of 241 organisms represented in the set. Matches are made based on transcript and CDS comparisons, and Ensembl gene, transcript, and protein identifiers for annotations similar to the NCBI RefSeq annotations are reported in NCBI Gene and in the gene2ensembl file on the Gene FTP site. The Ensembl annotation is also available in the graphical view and in NCBI’s Genome Data Viewer to give you a side-by-side view of how the annotations compare. Check out blue whale E2F1 for an example.

Figure 1. Balaenoptera musculus E2F transcription factor 1 in Genome Data Viewer
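If you want to work with the gene2ensembl matches programmatically, a minimal sketch along these lines may help. It assumes the file is tab-delimited with seven columns in the order listed in `COLUMNS` and a header line starting with `#`. The sample rows use placeholder identifiers, not real records, and `read_gene2ensembl` is a hypothetical helper name.

```python
import csv
import io

# Assumed column order of the gene2ensembl file.
COLUMNS = [
    "tax_id",
    "GeneID",
    "Ensembl_gene_identifier",
    "RNA_nucleotide_accession.version",
    "Ensembl_rna_identifier",
    "protein_accession.version",
    "Ensembl_protein_identifier",
]


def read_gene2ensembl(handle):
    """Group gene2ensembl rows by NCBI GeneID, skipping header lines."""
    by_gene = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(COLUMNS, row))
        by_gene.setdefault(record["GeneID"], []).append(record)
    return by_gene


# Placeholder rows for illustration only; every identifier is made up.
SAMPLE = (
    "#tax_id\tGeneID\tEnsembl_gene_identifier\t"
    "RNA_nucleotide_accession.version\tEnsembl_rna_identifier\t"
    "protein_accession.version\tEnsembl_protein_identifier\n"
    "0\t1\tENSG_EXAMPLE\tXM_EXAMPLE1.1\tENST_EXAMPLE1\tXP_EXAMPLE1.1\tENSP_EXAMPLE1\n"
    "0\t1\tENSG_EXAMPLE\tXM_EXAMPLE2.1\tENST_EXAMPLE2\tXP_EXAMPLE2.1\tENSP_EXAMPLE2\n"
)

matches = read_gene2ensembl(io.StringIO(SAMPLE))
for record in matches["1"]:
    print(record["RNA_nucleotide_accession.version"], record["Ensembl_rna_identifier"])
```

For a downloaded copy of the file, `gzip.open(path, "rt")` from the standard library can supply the handle in place of `io.StringIO`.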

New NCBI Datasets home and documentation pages provide easier access

NCBI Datasets, the new set of services for downloading genome assembly and annotation data (previous Datasets posts), has redesigned and reorganized web pages to make it easier to find and access the services and documentation you need.

NCBI Datasets has a fresh new homepage (Figure 1) highlighting the types of data available through our tools. Available data include genome assemblies, genes, and SARS-CoV-2 genomic and protein data. You can easily access these from the new page or learn more with our new documentation pages.

Figure 1. Features of the new Datasets homepage with quick access to help documentation including the Quickstart and How-to guides as well as access to Genome, Gene, and Coronavirus Data, and the Datasets and Dataformat command-line tools.