Computational Approaches to Predict Protein-Protein Interactions in Crowded Cellular Environments

doi:10.1021/acs.chemrev.3c00550

Review

Chem Rev. 2024 Apr 10;124(7):3932-3977.

doi: 10.1021/acs.chemrev.3c00550. Epub 2024 Mar 27.

Greta Grassmann^{1,2}, Mattia Miotto², Fausta Desantis^{2,3}, Lorenzo Di Rienzo², Gian Gaetano Tartaglia^{2,4,5}, Annalisa Pastore⁶, Giancarlo Ruocco^{2,7}, Michele Monti⁸, Edoardo Milanetti^{2,7}

Affiliations

¹ Department of Biochemical Sciences "Alessandro Rossi Fanelli", Sapienza University of Rome, Rome 00185, Italy.
² Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy.
³ The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Genoa 16163, Italy.
⁴ Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy.
⁵ Center for Human Technologies, Genoa 16152, Italy.
⁶ Experiment Division, European Synchrotron Radiation Facility, Grenoble 38043, France.
⁷ Department of Physics, Sapienza University, Rome 00185, Italy.
⁸ RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy.

PMID: 38535831
PMCID: PMC11009965
DOI: 10.1021/acs.chemrev.3c00550

Greta Grassmann et al. Chem Rev. 2024.

Abstract

Investigating protein-protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein-protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein-protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

**Figure 1**
Schematic diagram of crowding inside a cell and of its effects on protein binding. (a) Scientific illustration of the macromolecules inside a cell of *Escherichia coli*, inspired by the figure presented by Goodsell, with the cytoplasm in blue and purple and the cell membrane in yellow. The magnified portion shows the cytoplasm constituents. (b) Outline of the effects of crowding on protein folding, diffusion, and binding (from top to bottom) inside the cell. Crowding effects can be divided into volume exclusion and soft interactions. The former has a repulsive nature, which tends to enhance the stability of folding and binding and decrease the translational diffusion, as summarized in the first column. Soft interactions can instead be classified as attractive or repulsive (second and third column, respectively). Attractive soft interactions tend to counteract volume exclusion and thus have a destabilizing effect on both folding and binding. Conversely, repulsive soft interactions have a stabilizing effect. Both types of soft interactions hinder rotational and translational diffusion.

**Figure 2**
Coexistence and cooperation between experimental and numerical studies. (a) Number of papers whose abstract or title includes the terms “crowding” and “simulation” (“experiment”) as a function of time since 1981 (the year in which Minton defined the concept of cellular crowding), colored in orange (yellow). The papers were extracted from the Dimensions database. (b) On top, EIN (orange) and HPr (blue) in complex (PDB ID: 3EZA). In the gray box, the HPr residues interacting with EIN are highlighted in orange. The same residues were experimentally shown by Dong et al. to interact with BSA crowders. On the bottom is a cartoon representation of the complex formed by HPr (blue) and BSA (green). The structure was obtained by docking the isolated structures of HPr (PDB ID: 1POH) and BSA (PDB ID: 4F5S) with HADDOCK. (c) The same search as in (a) was performed, but in this case, in addition to “crowding,” the terms “molecular dynamics” (MD), “Brownian dynamics” (BD), “Monte Carlo” (MC), and “scaled particle theory” (SPT) were searched for. The number of published papers is shown in light blue, blue, dark blue, and violet, respectively.

**Figure 3**
Sketch of the four main questions of protein–protein interaction predictions. Methods can be developed to assess (i) whether and/or (ii) where two proteins functionally interact. Other techniques can assess (iii) the dynamics and outcome of the binding process and (iv) the binding stability of the resulting protein–protein complex. An overview of the methods that have been developed to address these different tasks is provided in Table 1.

**Figure 4**
Diagram of the most common techniques used to predict binding interfaces and poses by the methods reported in Table 1. The available servers and codes for binding interface and pose prediction are often based on the evaluation of shape complementarity, the minimization of an energy score, sequence- and/or structure-based ML, and/or homology modeling. Shape complementarity (top left box) can be searched for with geometric hashing or orthogonal polynomial decomposition. The former defines geometric patches (concave, flat, convex) with discrete points of the protein surface and uses those points to match the partner’s patches stored in a hash table. Partners’ surfaces can also be compared by expanding the surface patches in terms of orthogonal polynomials (e.g., Zernike polynomials) and computing the distance between the corresponding vectors. Both 2D and 3D expansions have been used. Many methods aim at minimizing an energy expression (including, for example, van der Waals energy, electrostatic interaction energy, and a statistical pairwise potential representing other solvation effects). This minimization can be achieved by testing different orientations and spatial positions of the binding partners and computing the energy term for each step. Such exploration is often performed in real space or through an FFT correlation approach. With the latter technique, the interaction matrix is approximated by its dominant eigenvectors so that the energy expression is written as the sum of a few correlation functions, and the minimization is solved by repeated FFT calculations. A quick solution to the minimization problem can be achieved with particle swarm optimization (PSO). If homologous structures are available, possible binding interfaces can be obtained from either sequence or structural similarity between the surfaces of the studied proteins and known binding sites of homologous complexes. Finally, the problem of scoring binding sites is often addressed with ML methods. ML techniques can be divided into sequence-based, structure-based, and combined. All categories are based on representing the protein features with a vector that is then passed to a network. Many networks and learning algorithms have been used.
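
The FFT correlation search mentioned in the Figure 4 caption can be illustrated with a toy example. The sketch below is a minimal one-dimensional illustration under stated assumptions, not the scoring scheme of any server listed in Table 1: the grids `receptor` and `ligand`, their numerical values, and the helper names `fft`, `correlate_fft`, and `correlate_direct` are all hypothetical, and a single correlation function stands in for the sum of several that a realistic energy expression would require.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 Cooley-Tukey transform; len(a) must be a power of two.
    The inverse transform is returned without the 1/n normalisation."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def correlate_fft(receptor, ligand):
    """score[s] = sum_x receptor[x] * ligand[(x - s) mod n], computed via FFT."""
    n = len(receptor)
    fr = fft(receptor)
    fl = fft(ligand)
    product = [r * l.conjugate() for r, l in zip(fr, fl)]
    return [v.real / n for v in fft(product, inverse=True)]

def correlate_direct(receptor, ligand):
    """Same score, evaluated shift by shift in real space (O(n^2))."""
    n = len(receptor)
    return [sum(receptor[x] * ligand[(x - s) % n] for x in range(n))
            for s in range(n)]

# Hypothetical periodic grids: positive cells reward surface contact,
# negative cells penalise overlap with the receptor core.
receptor = [0, 1, -5, -5, 1, 1, 0, 0]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]

scores = correlate_fft(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda s: round(scores[s], 6))
```

Both routines evaluate the same score for every translation of the ligand; the FFT route does so in O(n log n) operations instead of O(n²), which is the reason the approach is attractive for exhaustive translational scans on 3D grids.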

**Figure 5**
Crowding effects on the different stages of protein binding. (a) Proteins (orange and red) can be found inside the cell as single monomers whose folding and conformational variation are influenced by surrounding crowders (gray). The monomers have to navigate this crowded environment to find each other and bind. The crowders affect the dynamics leading to this binding, as well as the stability and dynamics of the formed dimer. (b) The evaluation of many features of the structure and dynamics of proteins (orange) surrounded by crowders (gray) can be performed with computational methods. The results depend on the level of approximation by which the system has been described. In the most simplified models, both proteins and crowders are described as rigid spheres. More detailed representations (at a higher computational cost) can be obtained with coarse-grained (CG) models. In the lowest-resolution CG systems, crowders are represented as hard spheres and proteins by subsuming multiple atoms into beads or at an atomistic level. More detailed CG studies employ an atomistic description of proteins with crowders as a collection of beads corresponding to multiple atoms. (c) Computer simulations and theoretical calculations have been extensively implemented to study the dynamics of proteins and interacting partners in a crowded environment. The main factors influencing the structure of single proteins characterized by these studies up to now are hard-core repulsion, dielectric response, and electrostatic and hydrophobic interactions. Diffusion, instead, is mainly affected by collisions with the crowders, hydrodynamic interactions, cluster formation, and crowder depletion. As for single-protein folding, protein binding has been found to depend on hard-core repulsion and nonspecific binding between interacting proteins and crowders. Regarding the stability of the resulting complex, only the effect of hard-core repulsion has been characterized so far.

**Figure 6**
Analysis of techniques used to simulate the crowded environment in the studies reported in Table 2. (a) Each row shows the timeline of one of the most popular software or packages on which crowded systems have been simulated, including AMBER, GROMACS, NAMD, PROFASI, CHARMM, MUPHY, BioSimZ, BD-BOX, GENESIS, OpenMM, and ReaDDy. Each row starts from the year in which the software or package was made available. The years in which a particularly cited or recent paper used that software are marked with some letters indicating the level of representation chosen for the simulated system, as indicated in the legend. (b) Timeline of the force field implemented in the most cited and recent papers to simulate crowded environments. On the left, each force field is reported in a different color at the year in which it was developed. On the right, the same color is used to highlight the papers implementing that force field. Each paper is indicated with an acronym referring to how the crowders and proteins were represented: all-atom (A) or coarse-grained (G). The year of publication of each paper is reported, as well. (c) Each pie chart refers to the main studies discussed in this review. The top left (right) graph shows the software (force fields) that has been implemented, together with the percentage of papers using each of them. The bottom left (right) plot shows which crowders (representation) have been chosen by what percentage of works.

**Figure 7**
Diagram of the most common techniques used to facilitate the simulation of a crowded environment. On the left are some of the most used sampling enhancement techniques: metadynamics (top) and REM simulations (bottom). In the former, a repulsive bias potential function, shown in gray in the plot, is added to the free energy function V(x), where x denotes the collective variables (CVs) describing the system (the red dot). This bias discourages the system from revisiting already sampled configurations, which accelerates the exploration of the full energy landscape. In REM simulations, several independent trajectories (called replicas, Rep in the figure) are simultaneously generated. During the simulation, neighboring replicas are exchanged according to specific acceptance criteria. In this way, the trajectory can explore different equilibrium conditions and overcome slow relaxation. Another way to decrease the computational costs is to reduce the number of system components that have to be simulated by using either an implicit solvent or a CG model, which are represented in the middle of the figure on top and on bottom, respectively. Implicit solvent models represent the water (blue line) surrounding the protein (orange) as a continuous medium instead of individual molecules. In CG models, molecules are represented by pseudoatoms approximating groups of individual atoms. This results in a smoother energy landscape with fewer local minima, which enables an easier exploration. Finally, faster simulations can be obtained by approximating the interactions: in the context of crowding, many studies chose to consider only steric repulsion to investigate its entropic effect.
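
The exchange step of REM described in the Figure 7 caption can be sketched as follows. The caption only mentions "specific acceptance criteria"; the code assumes the Metropolis criterion commonly used for temperature replica exchange, in reduced units with the Boltzmann constant set to 1. The function names `swap_probability` and `attempt_neighbor_swaps` and all numerical values are hypothetical.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of exchanging the configurations of two replicas.

    Assumed criterion: p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]),
    with beta = 1 / T in reduced units (k_B = 1).
    """
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_neighbor_swaps(betas, energies, rng, offset=0):
    """Try to swap the pairs (offset, offset + 1), (offset + 2, offset + 3), ...

    energies[k] is the potential energy of the configuration currently held
    at inverse temperature betas[k]. Returns the updated energies and the
    list of accepted pairs.
    """
    energies = list(energies)
    accepted = []
    for i in range(offset, len(betas) - 1, 2):
        p = swap_probability(betas[i], betas[i + 1], energies[i], energies[i + 1])
        if rng.random() < p:
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
            accepted.append((i, i + 1))
    return energies, accepted
```

Alternating `offset` between 0 and 1 on successive exchange attempts is one common way to let every neighboring pair be tested; a swap that hands the lower-energy configuration to the colder replica is always accepted under this criterion.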

**Figure 8**
Schematic representation of six different computational studies of the protein Trp-cage structure and dynamics. On top, three articles (from left to right: (, , and 393)) show how the computational investigation of the dynamics of a case-study protein, here Trp-cage, in crowded conditions has evolved over time. For each paper, the most relevant bibliographic data are reported together with a representative illustration, and the year of publication is marked on the timeline. The same information is shown for the papers on the bottom. The study on the left refers to the deposition of the Trp-cage structure in the PDB; the other two (from left to right: ( and 404)) show the evolution of computational methods that investigated its dynamics without considering crowding.

**Figure 9**
Levels of representation employed to study diffusion and reaction rates in a crowded environment. (a) The GFRD algorithm defines a sphere around proteins (orange and green), crowders (violet), and solvent (blue) and computes for each molecule the probability distribution of the time and the position of leaving this volume. Each molecule is thus associated with a next event type and a next event time, which are queued and executed in chronological order. Following execution, the molecule is propagated, and a new sphere with new events is determined. (b) In the simplest lattice model, space is discretized into voxels (gray graph), and proteins (orange and green) and crowders (violet) can only move to a neighboring voxel. The solvent is either modeled implicitly or not considered. (c) A more realistic representation can be obtained when all the system components are left free to move, and the solvent is explicitly modeled.
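
The simplest lattice model of panel (b) can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the scheme of any specific study: a two-dimensional square lattice with periodic boundaries replaces the voxel grid, the solvent is not modeled, a single tracer stands in for the proteins, and the names `step` and `simulate` and all parameter values are hypothetical.

```python
import random

def step(occupied, pos, size, rng):
    """Attempt to move one particle to a random neighbouring site (periodic box)."""
    x, y = pos
    dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    new = ((x + dx) % size, (y + dy) % size)
    if new in occupied:        # excluded volume: the target site is already taken
        return pos
    occupied.remove(pos)
    occupied.add(new)
    return new

def simulate(size=20, n_crowders=100, n_steps=2000, seed=0):
    """Return the final tracer position, the final crowder positions, and the
    fraction of tracer moves that were accepted."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    rng.shuffle(cells)
    tracer = cells[0]
    crowders = cells[1:1 + n_crowders]
    occupied = set([tracer] + crowders)
    accepted = 0
    for _ in range(n_steps):
        new = step(occupied, tracer, size, rng)
        accepted += new != tracer
        tracer = new
        # Crowders are mobile too and obey the same exclusion rule.
        crowders = [step(occupied, c, size, rng) for c in crowders]
    return tracer, crowders, accepted / n_steps
```

Comparing the accepted fraction returned by `simulate` for increasing `n_crowders` gives a rough picture of how site exclusion alone slows translational motion; in an empty box every tracer move is accepted.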

**Figure 10**
Summary of the main effects of crowding on protein–protein interactions observed *in silico*. A recap of how crowding influences protein structure, diffusion, complex formation, and stability is reported. In each box, the observations are divided into sub-boxes according to which crowding effects have been considered.

References

1. Minton A. P. Excluded volume as a determinant of macromolecular structure and reactivity. Biopolymers: Original Research on Biomolecules 1981, 20, 2093–2120. doi: 10.1002/bip.1981.360201006.
2. Minton A.; Wilf J. Effect of macromolecular crowding upon the structure and function of an enzyme: glyceraldehyde-3-phosphate dehydrogenase. Biochemistry 1981, 20, 4821–4826. doi: 10.1021/bi00520a003.
3. Ellis R. J. Macromolecular crowding: an important but neglected aspect of the intracellular environment. Curr. Opin. Struct. Biol. 2001, 11, 114–119. doi: 10.1016/S0959-440X(00)00172-X.
4. Ellis R. J.; Minton A. P. Join the crowd. Nature 2003, 425, 27–28. doi: 10.1038/425027a.
5. Zimmerman S. B.; Trach S. O. Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J. Mol. Biol. 1991, 222, 599–620. doi: 10.1016/0022-2836(91)90499-V.
