Review
J Proteome Res. 2023 Dec 1;22(12):3663-3675.
doi: 10.1021/acs.jproteome.3c00416. Epub 2023 Nov 8.

Top-Down Proteomics and the Challenges of True Proteoform Characterization

Allen Po et al. J Proteome Res. 2023.

Abstract

Top-down proteomics (TDP) aims to identify and profile intact protein forms (proteoforms) extracted from biological samples. True proteoform characterization requires that both the base protein sequence be defined and any mass shifts identified, ideally localizing their positions within the protein sequence. Being able to fully elucidate proteoform profiles lends insight into proteoform-unique roles, and is a crucial aspect of defining protein structure-function relationships and the specific roles of different (combinations of) protein modifications. However, defining and pinpointing protein post-translational modifications (PTMs) on intact proteins remains a challenge. Characterization of (heavily) modified proteins (>∼30 kDa) remains problematic, especially when they exist in a population of similarly modified, or kindred, proteoforms. This issue is compounded as the number of modifications, and thus the number of theoretical combinations, increases. Here, we present our perspective on the challenges of analyzing kindred proteoform populations, focusing on annotation of protein modifications on an "average" protein. Furthermore, we discuss the technical requirements to obtain high-quality fragmentation spectral data to robustly define site-specific PTMs, and the fact that this is tempered by the time needed to separate proteoforms in advance of mass spectrometry analysis.

Keywords: phosphorylation; post-translational modification; proteoform; top-down proteomics.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Top-down (left, blue) vs bottom-up (right, red) proteomics: differences in spectral characteristics and complexity. Top-down investigations allow for the analysis of intact protein sequences and any mass shifts in the form of PTMs, truncations, etc. Bottom-up analyses typically provide more detail on the localization of mass shifts (modifications, SNPs) and are comparatively simpler in terms of analytical and computational complexity, but can miss regions of the protein that may be modified. Potential sites and types of modification are represented (Ub, ubiquitin; Me, methylation; P, phosphorylation; Ac, acetylation). CE, capillary electrophoresis; LC, liquid chromatography. Note that some regions of the protein are not detected following digestion in a bottom-up pipeline.
Figure 2
Challenges of kindred proteoform analysis. Kindred proteoforms (proteins shown in red, green, or blue) that have the same base sequence and the same modification type (P, phosphorylation) but in different numbers and at different sites (represented by a different color P) add to analytical difficulty. Identical fragment ions (boxed) can be produced from different kindred forms. A specific product ion could then be matched following data processing to different, possibly incorrect, proteoform species (as exemplified by the product ion in the green box), a problem that is exacerbated in lower-resolution data.
Figure 3
Combinations of theoretical isobaric proteoforms. The number of potential proteoforms for a given base protein sequence of a defined precursor mass (C) is dependent on the number of potential action sites (in this case phosphorylation; n = 4) and the number of observed events (r = either 1 (top) or 2 (bottom)) as defined by the difference in mass between the base protein (theoretical) and the experimentally observed proteoform.
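The count described in this caption is the binomial coefficient C(n, r) = n! / (r! (n - r)!). The following is a minimal Python sketch of that arithmetic, assuming each candidate site carries at most one modification and that all modifications are of the same type. The site labels S1 to S4 and the function name are hypothetical and used only for illustration; the caption specifies only n = 4 and r = 1 or 2.

```python
from itertools import combinations
from math import comb

def isobaric_proteoforms(sites, r):
    """Enumerate every way to place r identical modifications on the given sites."""
    return list(combinations(sites, r))

# Hypothetical site labels; the caption only states that n = 4.
sites = ["S1", "S2", "S3", "S4"]

for r in (1, 2):
    forms = isobaric_proteoforms(sites, r)
    # The enumeration must agree with the closed-form binomial coefficient.
    assert len(forms) == comb(len(sites), r)
    print(f"r = {r}: {len(forms)} theoretical isobaric proteoforms")
```

Under these assumptions, four candidate sites give 4 positional isomers for one phosphorylation event and 6 for two, all sharing the same precursor mass.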
Figure 4
Effect of isolation width on the generation of chimeric MS spectra. Kindred species (left) that contain the same base sequence and the same types of modification (P, phosphorylation) but in different quantities increase analytical complexity. (A) A narrow isolation window for MS2 prevents coisolation of similar species, thus generating isoform-specific MS2 spectra, but does not overcome the issue of isobaric proteoforms. (B) Too wide an isolation window can lead to ambiguity for precursor assignment as kindred species yield identical (and similar) fragment ions. Red ions in the MS2 spectrum represent fragment ions that could derive from several kindred proteoforms.
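The dependence on isolation width can be illustrated with a short calculation of precursor m/z values. This is a rough sketch only: it considers monoisotopic m/z values and ignores isotope envelopes, and the base mass (30,000 Da), charge state (30+), window widths, and function names are assumptions chosen for illustration rather than values from the article.

```python
PROTON = 1.007276    # Da, mass of a proton
PHOSPHO = 79.96633   # Da, monoisotopic mass added per phosphorylation (HPO3)

def mz(neutral_mass, z):
    """m/z of a protonated ion with neutral mass `neutral_mass` and charge z."""
    return (neutral_mass + z * PROTON) / z

def coisolated(base_mass, phospho_counts, z, center_count, width):
    """Return the phosphorylation counts whose m/z lies inside an isolation
    window of total width `width`, centred on the species carrying
    `center_count` phosphates."""
    center = mz(base_mass + center_count * PHOSPHO, z)
    half = width / 2
    return [k for k in phospho_counts
            if abs(mz(base_mass + k * PHOSPHO, z) - center) <= half]

# Hypothetical 30 kDa protein at charge 30+, carrying 0 to 4 phosphates.
base_mass, z = 30000.0, 30
narrow = coisolated(base_mass, range(5), z, center_count=2, width=1.0)
wide = coisolated(base_mass, range(5), z, center_count=2, width=6.0)
print("narrow window isolates phospho counts:", narrow)
print("wide window isolates phospho counts:", wide)
```

At 30+, neighbouring kindred species are separated by about 2.67 m/z, so in this example the 1.0 m/z window isolates only the doubly phosphorylated species while the 6.0 m/z window also admits the singly and triply phosphorylated forms. Positional isomers with the same phosphate count have identical m/z, so no window width separates them.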
Figure 5
Proteoform selection for MS2 with data-dependent acquisition (DDA). Where precursor signal intensity is used to select ions for fragmentation, the presence of multiple charge states (z, z – 1, z – 2, z – 3) of a single proteoform (represented here as red, blue, or green) can result in isolation and fragmentation of different charge states of the same species, reducing diversity of information and the number of proteoforms that can be identified. In this example, representative of a Top10 DDA experiment of kindred proteoforms, two charge states (z – 1; z) of the species containing three phosphates (blue) are selected before the most intense charge state (z) of the doubly phosphorylated proteoform (green). The singly phosphorylated proteoform is the 9th most abundant ion in this example and thus may not be selected depending on the duty cycle. MS/MS spectra of different charge states of the same proteoform (e.g., DDA1 and DDA2) may differ marginally due to slight differences in protonation density and gas-phase conformation.
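The selection bias described in this caption can be sketched as a simple intensity-ranked TopN pick. The intensities below are invented purely for illustration, arranged so that the singly phosphorylated species is the 9th most abundant ion as in the caption; the labels 1P, 2P, and 3P and the function names are hypothetical, and real instrument software applies additional rules (for example dynamic exclusion) that are not modelled here.

```python
def top_n_precursors(peaks, n):
    """Select the n most intense precursor ions.
    `peaks` is a list of (proteoform, charge_label, intensity) tuples."""
    return sorted(peaks, key=lambda p: p[2], reverse=True)[:n]

def distinct_proteoforms(selected):
    """Set of proteoforms represented among the selected precursors."""
    return {proteoform for proteoform, _, _ in selected}

# Invented intensities for three kindred proteoforms across charge states.
peaks = [
    ("3P", "z", 100.0), ("3P", "z-1", 95.0), ("2P", "z", 80.0),
    ("2P", "z-1", 70.0), ("3P", "z-2", 60.0), ("2P", "z-2", 50.0),
    ("3P", "z-3", 40.0), ("2P", "z-3", 30.0), ("1P", "z", 20.0),
    ("1P", "z-1", 10.0),
]

for n in (8, 9):
    chosen = top_n_precursors(peaks, n)
    print(f"Top{n}: {len(chosen)} MS2 events, "
          f"proteoforms sampled: {sorted(distinct_proteoforms(chosen))}")
```

With these made-up values, eight MS2 events are spent on only two proteoforms, and the singly phosphorylated species is fragmented only if the cycle allows a ninth selection.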
