Mol Cell Proteomics. 2013 Mar;12(3):813-24.
doi: 10.1074/mcp.O112.024372. Epub 2012 Dec 20.

The DegraBase: a database of proteolysis in healthy and apoptotic human cells

Emily D Crawford et al. Mol Cell Proteomics. 2013 Mar.

Abstract

Proteolysis is a critical post-translational modification for regulation of cellular processes. Our lab has previously developed a technique for specifically labeling unmodified protein N termini, the α-aminome, using the engineered enzyme, subtiligase. Here we present a database, called the DegraBase (http://wellslab.ucsf.edu/degrabase/), which compiles 8090 unique N termini from 3206 proteins directly identified in subtiligase-based positive enrichment mass spectrometry experiments in healthy and apoptotic human cell lines. We include both previously published and unpublished data in our analysis, resulting in a total of 2144 unique α-amines identified in healthy cells, and 6990 in cells undergoing apoptosis. The N termini derive from three general categories of proteolysis with respect to cleavage location and functional role: translational N-terminal methionine processing (∼10% of total proteolysis), sites close to the translational N terminus that likely represent removal of transit or signal peptides (∼25% of total), and finally, other endoproteolytic cuts (∼65% of total). Induction of apoptosis causes relatively little change in the first two proteolytic categories, but dramatic changes are seen in endoproteolysis. For example, we observed 1706 putative apoptotic caspase cuts, more than double the total annotated sites in the CASBAH and MEROPS databases. In the endoproteolysis category, there are a total of nearly 3000 noncaspase nontryptic cleavages that are not currently reported in the MEROPS database. These studies significantly increase the annotation for all categories of proteolysis in human cells and allow public access for investigators to explore interesting proteolytic events in healthy and apoptotic human cells.

Figures

Fig. 1.
Experimental schema, database design, and database summary. (A) For all experiments, human cells were grown under standard conditions, either with or without treatment with apoptosis-inducing agents. Cells were lysed and proteins were biotinylated on their free α-amines using subtiligase, followed by purification and identification by LC-MS/MS. N-terminus identifications from every experiment were entered into the database to create the untreated and apoptotic datasets, plus an apoptotic caspase-cleaved dataset, a subset of the apoptotic dataset containing N termini generated by cleavage after aspartic acid. (B) The DegraBase database is structured around four main tables linking the experimental data to the MS identifications and external database information at both the N terminus and protein level (for more details see Supplemental File S1). (C) Summary statistics for all experiments in the DegraBase and for both the untreated and apoptotic datasets (more details in Supplemental Table S1A). The blue box highlights the apoptotic caspase-cleaved dataset within the apoptotic dataset.
Fig. 2.
Dataset overlap. Venn diagrams show a larger overlap of unique protein (A) than peptide (B) identifications between untreated, apoptotic and apoptotic caspase-cleaved datasets. The apoptotic caspase-cleaved dataset is defined as a subset of the apoptotic dataset, and is therefore contained wholly within the apoptotic set.
Fig. 3.
iceLogo diagrams show amino acid frequencies. The untreated (A) and apoptotic (B) datasets show distinct patterns of amino acid frequency for the eight positions surrounding the labeled α-amine (P4-P4′) from all unique N termini. Enrichment (above the line) or depletion (below the line) of amino acid frequency is determined using the human SwissProt library release 2012_03 as background. The apoptotic caspase-cleaved iceLogo (C) represents all cleavages following aspartic acid (P1 = D) in the apoptotic dataset.
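The P4-P4′ window and the P1 = D filter described in this caption can be sketched in a few lines. This is only an illustration, not the DegraBase code: the function names `cleavage_window` and `is_after_aspartate`, the padding character, and the example sequence are all hypothetical, and the sketch assumes the labeled residue is P1′ with positions counted from 1.

```python
def cleavage_window(protein_seq, n_term_pos, flank=4):
    """Return (nonprime, prime) residues around a labeled N terminus.

    n_term_pos is the 1-based position of the labeled residue, taken
    here to be P1'. Positions that fall off either end of the sequence
    are padded with '-' (an arbitrary choice for this sketch).
    """
    if not 1 <= n_term_pos <= len(protein_seq):
        raise ValueError("n_term_pos must lie within the protein sequence")
    i = n_term_pos - 1  # 0-based index of P1'
    nonprime = protein_seq[max(0, i - flank):i].rjust(flank, "-")  # P4..P1
    prime = protein_seq[i:i + flank].ljust(flank, "-")             # P1'..P4'
    return nonprime, prime


def is_after_aspartate(protein_seq, n_term_pos):
    """True when the residue immediately before the N terminus (P1) is D."""
    nonprime, _ = cleavage_window(protein_seq, n_term_pos)
    return nonprime[-1] == "D"


# Made-up sequence for illustration only.
example_seq = "MSTDEVDGAKLR"
window = cleavage_window(example_seq, 8)
```

With the made-up sequence above, an N terminus at residue 8 gives the window `("DEVD", "GAKL")`, so it would pass the P1 = D filter, while an N terminus at residue 1 has no P1 residue and would not.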
Fig. 4.
There are three distinct groups of proteolytic processing within the data: (i) processing around the methionine at the translational N terminus (N termini observed at residues 1 and 2), (ii) cleavage of possible secretory or transit peptides during organelle trafficking (observed residues 3–65), and (iii) other endoproteolytic events (considered as cuts after residue 65). The untreated and apoptotic datasets had similar levels of translational N terminus labeling (∼10%), but differed for the latter categories, with the apoptotic datasets having more cleavage events past residue 65. The apoptotic caspase-cleaved set is shifted even more toward endoproteolytic cleavages than the apoptotic set.
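The three position-based groups in this caption amount to a simple binning rule on the residue number of the observed N terminus. The sketch below uses the boundaries stated in the caption (residues 1 and 2, residues 3 to 65, and beyond residue 65); the function names, category labels, and example positions are hypothetical and are not taken from the DegraBase itself.

```python
from collections import Counter


def classify_n_terminus(position):
    """Assign a 1-based N-terminus position to one of three groups."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    if position <= 2:
        return "translational N terminus"       # residues 1 and 2
    if position <= 65:
        return "signal/transit peptide region"  # residues 3-65
    return "endoproteolysis"                    # after residue 65


def category_fractions(positions):
    """Fraction of N termini that falls into each group."""
    counts = Counter(classify_n_terminus(p) for p in positions)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Made-up positions for illustration only.
fractions = category_fractions([1, 2, 30, 66, 150, 400, 912, 1200])
```

Applying `category_fractions` separately to the untreated and apoptotic position lists would give the kind of side-by-side comparison the caption describes.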
Fig. 5.
Initiator methionine processing. The iceLogos for the untreated (A, B) and apoptotic (C, D) sets are very similar for processing around the methionine at the translational N terminus. Enrichment or depletion of amino acid frequency is determined using human SwissProt library release 2012_03 as background. (A) and (C) represent retention (but not acetylation) of the initiator methionine (Met[1]); (B) and (D) represent removal of the initiator methionine without acetylation of the second residue (Xaa[2]). Two proteins labeled at residue 1 but annotated in UniProt as not containing an initiator methionine (Ig lambda chain V-IV region HII (P01717, serine) and 40S ribosomal protein S30 (P62861, lysine)) were removed from the datasets for the iceLogo creation.
Fig. 6.
Signal/Transit peptide removal sites. (A) Annotated mitochondrial proteins were greatly enriched for labeling between residues 10–65 compared with all other proteins in the untreated dataset, reflecting labeling at mitochondrial transit peptide removal sites. (B–D), iceLogos for subsets of the set of cleavages at positions 3 to 65 in the untreated dataset: (B), all unique N termini from proteins thought to be mitochondrial; (C), all N termini in the untreated dataset labeled between 10–65; and (D), all unique N termini thought to contain signal peptides.
Fig. 7.
Filled logos for endoproteolysis occurring at residue 66 or above for the untreated (A) and apoptotic (B) datasets with all aspartic acid cleavages removed (9% of the untreated and 28% of the apoptotic N termini). The size of each letter represents its relative frequency within the dataset. Distributions are very similar for the P1′ position, but show differences at the P1 position.
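The letter sizes in a filled logo reflect relative residue frequencies at one position. As a rough sketch of the filtering this caption describes (N termini at residue 66 or above, with cleavages after aspartic acid removed), the snippet below computes relative P1 frequencies. The function name `p1_frequencies`, the `(position, P1 residue)` input format, and the example data are hypothetical; this is not the software used to draw the published logos.

```python
from collections import Counter


def p1_frequencies(sites, min_position=66, exclude_p1="D"):
    """Relative frequency of each P1 residue among filtered sites.

    sites is an iterable of (position, p1_residue) pairs, where position
    is the 1-based residue number of the observed N terminus.
    """
    counts = Counter(
        p1
        for position, p1 in sites
        if position >= min_position and p1 != exclude_p1
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


# Made-up sites for illustration only.
example_sites = [(70, "D"), (80, "R"), (90, "R"), (100, "K"), (10, "A")]
frequencies = p1_frequencies(example_sites)
```

In this toy input the aspartate site and the site at residue 10 are both filtered out, leaving R at two thirds and K at one third of the remaining P1 residues.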
