Protein Sci. 2024 Feb;33(2):e4862.
doi: 10.1002/pro.4862.

Flexible protein-protein docking with a multitrack iterative transformer


Lee-Shin Chu et al. Protein Sci. 2024 Feb.

Abstract

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, for example, structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top-1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody-antigen complexes, GeoDock outperforms the deep learning models trained using the same dataset but falls behind most of the conventional methods and AlphaFold-Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large-scale structure screening. 
Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.
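
As an illustration of how a top-1 success rate such as the figures quoted above can be computed, here is a minimal sketch. It assumes that each test target has one top-ranked model scored with DockQ, and that a model counts as a success when its DockQ is at least 0.23 (the commonly used "acceptable" quality cutoff). The abstract does not state the criterion the authors used, so the threshold, the function name `top1_success_rate`, and the scores below are all hypothetical.

```python
from typing import Sequence

# Assumption: 0.23 is the widely used DockQ cutoff for "acceptable" quality.
# The abstract does not specify the success criterion used in the study.
ACCEPTABLE_DOCKQ = 0.23


def top1_success_rate(
    dockq_scores: Sequence[float], threshold: float = ACCEPTABLE_DOCKQ
) -> float:
    """Fraction of targets whose top-ranked model meets the threshold."""
    if not dockq_scores:
        raise ValueError("need at least one target")
    hits = sum(1 for score in dockq_scores if score >= threshold)
    return hits / len(dockq_scores)


# Made-up DockQ values for ten hypothetical targets (not data from the paper).
scores = [0.81, 0.10, 0.45, 0.05, 0.23, 0.02, 0.67, 0.15, 0.30, 0.01]
rate = top1_success_rate(scores)
print(f"top-1 success rate: {rate:.0%}")
```

Because the rate is a simple fraction of targets, it is sensitive to which targets are in the test set, which is why removing close homologs from the training data can change the reported figure.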

Keywords: deep learning; flexible protein docking; protein-protein interaction.


Conflict of interest statement

Jeffrey Gray is an unpaid board member (director) of the Rosetta Commons. Under institutional participation agreements between the University of Washington, acting on behalf of the Rosetta Commons, Johns Hopkins University may be entitled to a portion of revenue received on licensing Rosetta software including some methods described in this study. Jeffrey Gray has a financial interest in Cyrus Biotechnology. Cyrus Biotechnology distributes the Rosetta software, which may include methods described in this study. The results of the study discussed in this article could affect the value of Cyrus Biotechnology. These arrangements have been reviewed and approved by the Johns Hopkins University in accordance with its conflict‐of‐interest policies.
