Flexible protein-protein docking with a multitrack iterative transformer

Protein Sci. 2024 Feb;33(2):e4862.

doi: 10.1002/pro.4862.

Lee-Shin Chu¹, Jeffrey A Ruffolo², Ameya Harmalkar¹, Jeffrey J Gray¹,²

Affiliations

¹ Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA.
² Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA.

PMID: 38148272
PMCID: PMC10804679 (available on 2025-02-01)
DOI: 10.1002/pro.4862


Lee-Shin Chu et al. Protein Sci. 2024 Feb.


Abstract

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, for example, structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem by assuming no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications in which binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits tasks where the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top-1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody-antigen complexes, GeoDock outperforms the deep learning models trained on the same dataset but falls behind most of the conventional methods and AlphaFold-Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large-scale structure screening. Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture lays the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.

Keywords: deep learning; flexible protein docking; protein-protein interaction.

Conflict of interest statement

Jeffrey Gray is an unpaid board member (director) of the Rosetta Commons. Under institutional participation agreements between the University of Washington, acting on behalf of the Rosetta Commons, Johns Hopkins University may be entitled to a portion of revenue received on licensing Rosetta software including some methods described in this study. Jeffrey Gray has a financial interest in Cyrus Biotechnology. Cyrus Biotechnology distributes the Rosetta software, which may include methods described in this study. The results of the study discussed in this article could affect the value of Cyrus Biotechnology. These arrangements have been reviewed and approved by the Johns Hopkins University in accordance with its conflict‐of‐interest policies.

Update of

Flexible Protein-Protein Docking with a Multi-Track Iterative Transformer.
Chu LS, Ruffolo JA, Harmalkar A, Gray JJ. bioRxiv [Preprint]. 2023 Jul 1:2023.06.29.547134. doi: 10.1101/2023.06.29.547134. Update in: Protein Sci. 2024 Feb;33(2):e4862. doi: 10.1002/pro.4862. PMID: 37425754. Free PMC article. Preprint.

Cited by

ABAG-docking benchmark: a non-redundant structure benchmark dataset for antibody-antigen computational docking.
Zhao N, Han B, Zhao C, Xu J, Gong X. Brief Bioinform. 2024 Jan 22;25(2):bbae048. doi: 10.1093/bib/bbae048. PMID: 38385879. Free PMC article.

Protein-protein interfaces in molecular glue-induced ternary complexes: classification, characterization, and prediction.
Rui H, Ashton KS, Min J, Wang C, Potts PR. RSC Chem Biol. 2023 Jan 3;4(3):192-215. doi: 10.1039/d2cb00207h. eCollection 2023 Mar 8. PMID: 36908699. Free PMC article. Review.

Grants and funding

R35 GM141881/GM/NIGMS NIH HHS/United States


