RefSeq: an update on prokaryotic genome annotation and curation
- PMID: 29112715
- PMCID: PMC5753331
- DOI: 10.1093/nar/gkx1068
Abstract
The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) provides annotation for over 95 000 prokaryotic genomes that meet standards for sequence quality, completeness, and freedom from contamination. Genomes are annotated by a single Prokaryotic Genome Annotation Pipeline (PGAP) to provide users with a resource that is as consistent and accurate as possible. Notable recent changes include the development of a hierarchical evidence scheme, a new focus on curating annotation evidence sources, the addition and curation of protein profile hidden Markov models (HMMs), release of an updated pipeline (PGAP-4), and comprehensive re-annotation of RefSeq prokaryotic genomes. Antimicrobial resistance proteins have been reannotated comprehensively, improved structural annotation of insertion sequence transposases and selenoproteins is provided, curated complex domain architectures have given upgraded names to millions of multidomain proteins, and we introduce a new kind of annotation rule, BlastRules. Continual curation of supporting evidence and propagation of improved names onto RefSeq proteins ensure that the functional annotation of genomes is kept current. An increasing share of our annotation now derives from HMMs and other sets of annotation rules that are portable by nature and available for download and reuse by other investigators. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.
Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
Figures
![Figure 1.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/5753331/bin/gkx1068fig1.gif)
![Figure 2.](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/5753331/bin/gkx1068fig2.gif)