The National Center for Biotechnology Information's Protein Clusters Database
- PMID: 18940865
- PMCID: PMC2686591
- DOI: 10.1093/nar/gkn734
Abstract
Rapid increases in DNA sequencing capabilities have led to a vast increase in the data generated from prokaryotic genomic studies, which has been a boon to scientists studying microorganism evolution and to those who wish to understand the biological underpinnings of microbial systems. The NCBI Protein Clusters Database (ProtClustDB) was created to maintain this deluge of data efficiently and keep it up to date. ProtClustDB contains both curated and uncurated clusters of proteins grouped by sequence similarity. The May 2008 release contains a total of 285 386 clusters derived from over 1.7 million proteins encoded by 3806 nucleotide sequences from the RefSeq collection of complete chromosomes and plasmids from four major groups: prokaryotes, bacteriophages, and the mitochondrial and chloroplast organelles. Of these, 7180 clusters containing 376 513 proteins have curated gene and protein functional annotation. PubMed identifiers and external cross-references are collected for all clusters and provide additional information resources. A suite of web tools is available to explore more detailed information, such as multiple alignments, phylogenetic trees and genomic neighborhoods. ProtClustDB provides an efficient method to aggregate gene and protein annotation for researchers and is available at http://www.ncbi.nlm.nih.gov/sites/entrez?db=proteinclusters.
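To put the May 2008 release figures in proportion, the short Python sketch below derives a few ratios from the numbers quoted in the abstract. It is illustrative arithmetic only, not part of ProtClustDB or any NCBI tool. The function and key names are hypothetical, and the total protein count is taken as exactly 1.7 million even though the abstract says "over 1.7 million", so the curated-protein share should be read as an upper bound.

```python
# Figures quoted in the abstract for the May 2008 release.
TOTAL_CLUSTERS = 285_386
CURATED_CLUSTERS = 7_180
CURATED_PROTEINS = 376_513
# Assumption: the abstract says "over 1.7 million", so this is a lower bound.
TOTAL_PROTEINS_LOWER_BOUND = 1_700_000


def release_summary() -> dict[str, float]:
    """Return simple ratios derived from the abstract's release figures."""
    return {
        # Fraction of all clusters that carry curated annotation.
        "curated_cluster_fraction": CURATED_CLUSTERS / TOTAL_CLUSTERS,
        # Upper bound on the share of proteins that sit in curated clusters
        # (the true total is larger than the lower bound used here).
        "curated_protein_share_max": CURATED_PROTEINS / TOTAL_PROTEINS_LOWER_BOUND,
        # Average number of proteins per curated cluster.
        "mean_curated_cluster_size": CURATED_PROTEINS / CURATED_CLUSTERS,
    }


if __name__ == "__main__":
    for name, value in release_summary().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, curated clusters make up roughly 2.5% of all clusters yet hold at most about 22% of the proteins, averaging about 52 proteins per curated cluster.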