BMC Genomics. 2015 Aug 19;16(1):617.
doi: 10.1186/s12864-015-1820-x.

PhosphOrtholog: a web-based tool for cross-species mapping of orthologous protein post-translational modifications

Rima Chaudhuri et al. BMC Genomics. 2015.

Abstract

Background: Most biological processes are influenced by protein post-translational modifications (PTMs). Identifying novel PTM sites in different organisms, including humans and model organisms, has expedited our understanding of key signal transduction mechanisms. However, with the increasing availability of deep, quantitative datasets in diverse species, there is a growing need for tools to facilitate cross-species comparison of PTM data. This is particularly important because functionally important modification sites are more likely to be evolutionarily conserved; yet cross-species comparison of PTMs is difficult since they often lie in structurally disordered protein domains. Current tools that address this can only map known PTMs between species based on known orthologous phosphosites, and do not enable the cross-species mapping of newly identified modification sites. Here, we addressed this by developing a web-based software tool, PhosphOrtholog (www.phosphortholog.com), that accurately maps protein modification sites between different species. This facilitates the comparison of datasets derived from multiple species, and should be a valuable tool for the proteomics community.

Results: Here we describe PhosphOrtholog, a web-based application for mapping known and novel orthologous PTM sites from experimental data obtained from different species. PhosphOrtholog is the only generic and automated tool that enables cross-species comparison of large-scale PTM datasets without relying on existing PTM databases. This is achieved through pairwise sequence alignment of orthologous protein residues. To demonstrate its utility, we apply it to two sets of human and rat muscle phosphoproteomes generated following insulin and exercise stimulation, respectively, and to one publicly available mouse phosphoproteome following cellular stress, revealing high mapping and coverage efficiency. Although coverage statistics are dataset dependent, PhosphOrtholog more than doubled the number of cross-species mapped sites in all our example data sets compared to those recovered using existing resources such as PhosphoSitePlus.
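
The idea of mapping a site through a pairwise alignment of orthologous proteins can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external aligner and are supplied as equal-length strings with "-" for gaps; the function name `map_site` and the toy sequences are hypothetical, and the abstract does not specify PhosphOrtholog's actual aligner, scoring scheme or acceptance criteria.

```python
from typing import Optional, Tuple


def map_site(aligned_a: str, aligned_b: str, site_a: int) -> Optional[Tuple[int, str]]:
    """Map a 1-based residue position in sequence A onto sequence B.

    Both inputs are rows of one pairwise alignment (same length, '-' = gap).
    Returns (position_in_B, residue_in_B), or None if the site falls
    opposite a gap or lies outside sequence A.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    pos_a = 0  # residues of A consumed so far (ungapped coordinate)
    pos_b = 0  # residues of B consumed so far (ungapped coordinate)
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != "-":
            pos_a += 1
        if res_b != "-":
            pos_b += 1
        if res_a != "-" and pos_a == site_a:
            # The alignment column holding the query site has been reached.
            if res_b == "-":
                return None  # no orthologous residue in B
            return pos_b, res_b
    return None


# Toy alignment (not real protein sequences).
species_1 = "MKS-TPSA"
species_2 = "M-SATPSA"
print(map_site(species_1, species_2, 3))  # serine 3 in species 1
print(map_site(species_1, species_2, 2))  # lysine 2 sits opposite a gap
```

A caller would typically also compare the returned residue with the modified residue in the query species before accepting the match; that check is left out here because the source does not state the rule used.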

Conclusions: PhosphOrtholog is the first tool that enables mapping of thousands of novel and known protein phosphorylation sites across species, accessible through an easy-to-use web interface. Identification of conserved PTMs across species from large-scale experimental data increases our knowledge base of functional PTM sites. Moreover, PhosphOrtholog is generic and applicable to other PTM datasets, such as acetylation, ubiquitination and methylation.

Figures

Fig. 1
User interface snapshots. a Input interface: The instructions for generating the input data format, including a description of each column, are given in “Step #1” on the PhosphOrtholog main page. The input interface also shows an example of the required data format in the table below the text “For example”. The data in the example table can be used as input by clicking the “Use above example” button, and mapped by clicking “Map”. Input data can also be copy-pasted, edited or deleted directly in the “Preview for input data set” table, which behaves like an Excel spreadsheet. Three separate example input files can be downloaded through the “download” links immediately below the example data table and uploaded to the user interface (UI) through the “Upload” button. User-provided datasets (in comma-delimited format) can be uploaded for mapping via the “Upload” button, copy-pasted into the preview input table, or typed in. b Output interface: Once mapping is started with the “Map” button in “Step #2”, the progress bar above the output table in “Step #3” tracks the progress of the mapping function, giving a rough estimate of how long the job will take to finish for large data sets. The first two columns in the mapped output table give the species 1 record identifier and PTM site details, which are mapped to the orthologous species 2 site information shown in the third and fourth columns. The last column gives the E-value significance score from the pairwise sequence alignment of the orthologous proteins. If the PTM site is a known mapped site from the PhosphoSitePlus database, this column reports “From PhosphoSitePlus” instead of an E-value. Once mapping is complete, the progress bar also reports the number of novel sites mapped by PhosphOrtholog, the percentage of novel sites that could be mapped in the data set, and the percentage of known sites from PhosphoSitePlus that could be recovered by PhosphOrtholog
Fig. 2
Software architecture. The four layers of the software implementation and the communications between the layers are illustrated. The storage layer shows the six reference ortholog mapping databases, where the species are abbreviated by their first letter, e.g. human by H, rat by R, mouse by M and fly by F. The database of annotated PTM sites obtained from PhosphoSitePlus is represented as PSP
Fig. 3
The algorithmic workflow. The schematic representation of the algorithm is depicted through the flowchart. The four stages through which the input data is analyzed to return the mapped sites are displayed
Fig. 4
Role of PhosphOrtholog in the MS-based PTM data analysis pipeline. The four broad stages of an MS-based PTM experiment are illustrated. In Stage 1, samples are extracted and prepared from human and rat muscle tissues for the MS-based phosphoproteomics experiment. In Stage 2, the raw spectral data are analyzed to generate peptide and protein annotations along with intensity measures for the PTMs induced by the experimental design in each species. In Stage 3, output from Stage 2 is parsed to extract information such as the leading Uniprot ID (‘Uniprot_ACC’), modified amino acid type (‘AminoAcid’) and modification site number (‘Site#’) from each species, and concatenated in the desired input format for PhosphOrtholog mapping. PTM examples are shown for the proteins ULK1 (2 sites in human and rat) and ACACA (3 sites in human and rat); the column ‘ModificationSite’ gives the peptide sequence with the identified PTM site, with the number in parentheses indicating the probability that the particular amino acid is phosphorylated. In Stage 4, the sites mapped by PhosphOrtholog are obtained and annotated either as newly mapped, with a calculated E-value (4 out of 5 input sites were not mapped before and were identified with an E-value of 0), or with “From PhosphoSitePlus” if the mapping was previously known (the mapping between human ACACA site S80 and rat ACACA site S79 is annotated in the PhosphoSitePlus database)
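
Stage 3 of this caption (parsing search-engine output into accession, amino acid and site number) can be sketched as follows. This is only an illustration under stated assumptions: the column names ‘Uniprot_ACC’, ‘AminoAcid’ and ‘Site#’ come from the caption, but the exact column order and file layout expected by PhosphOrtholog are not given here, the helper names `parse_site` and `to_csv` are hypothetical, and the accession strings are placeholders rather than real UniProt identifiers.

```python
import csv
import io
import re
from typing import List, Tuple

# A site label such as "S80": one-letter amino acid followed by a position.
SITE_PATTERN = re.compile(r"^([A-Z])(\d+)$")


def parse_site(site: str) -> Tuple[str, int]:
    """Split a label like 'S80' into ('S', 80)."""
    match = SITE_PATTERN.match(site.strip())
    if match is None:
        raise ValueError(f"unrecognised site label: {site!r}")
    return match.group(1), int(match.group(2))


def to_csv(records: List[Tuple[str, str]]) -> str:
    """Write (accession, site label) pairs as comma-delimited text.

    The header layout is an assumption based on the column names
    quoted in the caption, not a documented PhosphOrtholog format.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Uniprot_ACC", "AminoAcid", "Site#"])
    for accession, site in records:
        amino_acid, position = parse_site(site)
        writer.writerow([accession, amino_acid, position])
    return buffer.getvalue()


# Placeholder accessions; the site labels S80 and S79 are taken from the caption.
print(to_csv([("HUMAN_ACACA_ACC", "S80")]))
print(to_csv([("RAT_ACACA_ACC", "S79")]))
```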
Fig. 5
Increased coverage of common sites. The utility and efficiency of PhosphOrtholog compared to PhosphoSitePlus are shown for three example datasets comprising human, rat and mouse phosphoproteomes. Relative to PhosphoSitePlus, the coverage of conserved sites identified by PhosphOrtholog increased by 136 % in dataset 1 (from 83 annotated sites in PhosphoSitePlus to 196 mapped sites, an additional 113 novel orthologous PTM site matches), by 148 % in dataset 2 (from 473 to 1174 mapped sites, an additional 701 novel site matches) and by 177 % in dataset 3 (from 475 to 1315 sites, an additional 840 novel site matches)
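
The percentages in this caption follow from the site counts as a relative increase, (after − before) / before. A short check, using only the counts quoted in the caption (the dictionary and function names are illustrative):

```python
# (sites annotated in PhosphoSitePlus, sites mapped by PhosphOrtholog)
counts = {
    "dataset 1": (83, 196),
    "dataset 2": (473, 1174),
    "dataset 3": (475, 1315),
}


def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


for name, (before, after) in counts.items():
    added = after - before
    print(f"{name}: +{added} sites, {percent_increase(before, after):.0f} % increase")
```

Rounded to whole numbers, this reproduces the 136 %, 148 % and 177 % figures and the 113, 701 and 840 additional site matches.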
