Nucleic Acids Res. 2009 Jan;37(Database issue):D205-10.
doi: 10.1093/nar/gkn845. Epub 2008 Nov 4.

CDD: specific functional annotation with the Conserved Domain Database


Aron Marchler-Bauer et al. Nucleic Acids Res. 2009 Jan.

Abstract

NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, and is also part of NCBI's Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system, and CDD's collection of models can be queried with novel protein sequences via the CD-Search service at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with the latest version of CDD, v2.14, information from redundant and homologous domain models is summarized at a superfamily level, and domain annotation on proteins is flagged as either 'specific' (identifying molecular function with high confidence) or as 'non-specific' (identifying superfamily membership only).


Figures

Figure 1.
CDD-based annotation on a recently predicted protein sequence. This summary is the default concise version of the annotation view as generated by CD-Search, using precalculated alignment information. The view is divided into two panels, a graphical summary (items a through d) and a table detailing the matches (items e and f). The query sequence is represented as a gray bar in the top portion of the graphical summary, with a ruler indicating sequence length and coordinates. (a) ‘Specific hits’ to NCBI-curated domain models are indicated in a separate area below the query sequence, and the corresponding balloons are rendered in bright colors. The extent of the hits also defines annotations with conserved domain ‘Superfamilies’, which are indicated in the area below the ‘Specific hits’, and enclosed in boxes to indicate superfamily relationships. If the full display is selected, an area summarizing ‘Non-specific hits’ will be shown as well, and the boxes will be drawn to resolve superfamily relationships, where the highest ranked match for each superfamily defines the extent of the corresponding box. ‘Non-specific hits’ and ‘Superfamilies’ balloons are rendered in pastel colors, with each homologous superfamily being assigned a separate color. (b) If a region of the query has no ‘Specific hits’, only the ‘Superfamilies’ annotation is shown in the concise default display. If a match to a conserved domain model is incomplete, as in this case, the balloon is rendered with a jagged edge to indicate a missing region. (c) In the default concise display, matches to multi-domain models are rendered as gray balloons in a separate area of the summary graph. Only the best-ranked nonoverlapping multi-domain models are shown. (d) Functional sites as annotated on NCBI-curated domain models are mapped to the query sequence. Sites are mapped from the highest ranked model only, and they are colored to correspond to their source model.
When no ‘Specific hits’ are available, such as in (b), sites may still be mapped if they have been annotated on the parent model of a hierarchy that gave a ‘Non-specific hit’. Both conserved domain balloons and site annotations are hot-linked so that moving the mouse over the objects displays pop-ups with additional information, and so that clicking on the objects generates summary pages for the particular domain model, embedding the user query sequence in the alignment for further analysis, if applicable. (e) A table view summarizes what the graphical view indicates as well, listing E-values, multi-domain status and various identifiers for the conserved domain models identified as matches. The table rows can be expanded (f) to display detailed sequence alignment information between the query and the domain model's consensus sequence. An alignment of all sequences comprising a domain model, with or without the query sequence embedded, is accessible by clicking on the domain's balloon representation in the graphical summary or its unique numerical identifier (PSSM-Id) in the tabular summary, respectively.
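
The rule in (c), that only the best-ranked nonoverlapping multi-domain models are shown, can be illustrated with a greedy selection. The following is a minimal sketch, not CD-Search's actual implementation: it assumes that hits are ranked by E-value (lower is better) and that any shared residue counts as an overlap. The `Hit` class, its fields, the model names, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one match of a domain model to the query."""
    model: str     # illustrative label, not a real CDD accession
    start: int     # first matched query residue (1-based, inclusive)
    end: int       # last matched query residue (inclusive)
    evalue: float  # assumed ranking criterion: lower is better


def overlaps(a: Hit, b: Hit) -> bool:
    """True if the two hits share at least one query residue."""
    return a.start <= b.end and b.start <= a.end


def best_nonoverlapping(hits: list[Hit]) -> list[Hit]:
    """Greedily keep the best-ranked hits that do not overlap a kept hit."""
    kept: list[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if not any(overlaps(hit, other) for other in kept):
            kept.append(hit)
    # Return the kept hits in query order, as they would be drawn.
    return sorted(kept, key=lambda h: h.start)


hits = [
    Hit("model_A", 10, 120, 1e-40),
    Hit("model_B", 15, 110, 1e-20),   # overlaps model_A, ranked lower
    Hit("model_C", 150, 260, 1e-30),
    Hit("model_D", 100, 200, 1e-10),  # overlaps model_A and model_C
]

shown = best_nonoverlapping(hits)
print([h.model for h in shown])
```

With these made-up hits, `model_B` and `model_D` are dropped because each overlaps a better-ranked hit that was already kept. The real service may rank hits and define overlap differently.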



