Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Apr 12;18(4):2543-2555.
doi: 10.1021/acs.jctc.1c00924. Epub 2022 Feb 23.

Combined Free-Energy Calculation and Machine Learning Methods for Understanding Ligand Unbinding Kinetics

Affiliations

Combined Free-Energy Calculation and Machine Learning Methods for Understanding Ligand Unbinding Kinetics

Magd Badaoui et al. J Chem Theory Comput. .

Abstract

The determination of drug residence times, which define the time an inhibitor is in complex with its target, is a fundamental part of the drug discovery process. Synthesis and experimental measurements of kinetic rate constants are, however, expensive and time consuming. In this work, we aimed to obtain drug residence times computationally. Furthermore, we propose a novel algorithm to identify molecular design objectives based on ligand unbinding kinetics. We designed an enhanced sampling technique to accurately predict the free-energy profiles of the ligand unbinding process, focusing on the free-energy barrier for unbinding. Our method first identifies unbinding paths determining a corresponding set of internal coordinates (ICs) that form contacts between the protein and the ligand; it then iteratively updates these interactions during a series of biased molecular dynamics (MD) simulations to reveal the ICs that are important for the whole of the unbinding process. Subsequently, we performed finite-temperature string simulations to obtain the free-energy barrier for unbinding using the set of ICs as a complex reaction coordinate. Importantly, we also aimed to enable the further design of drugs focusing on improved residence times. To this end, we developed a supervised machine learning (ML) approach with inputs from unbiased "downhill" trajectories initiated near the transition state (TS) ensemble of the string unbinding path. We demonstrate that our ML method can identify key ligand-protein interactions driving the system through the TS. Some of the most important drugs for cancer treatment are kinase inhibitors. One of these kinase targets is cyclin-dependent kinase 2 (CDK2), an appealing target for anticancer drug development. Here, we tested our method using two different CDK2 inhibitors for the potential further development of these compounds. We compared the free-energy barriers obtained from our calculations with those observed in available experimental data. We highlighted important interactions at the distal ends of the ligands that can be targeted for improved residence times. Our method provides a new tool to determine unbinding rates and to identify key structural features of the inhibitors that can be used as starting points for novel design strategies in drug discovery.

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Figure 1
Illustration of the simulation system. (a) CDK2 bound to two different ligands: (b) thiazolyl-pyrimidine derivative (18K) and (c) carboxylate oxindole derivative (62K), originated from PDB structures 3sw4 and 4fkw, respectively. Structural details of ATP pockets are shown for the two systems (bottom), with the ligands in the bound (green sticks), unbound (red sticks), and transition states (gray sticks). Dashed lines depict key interactions.
Figure 2
Figure 2
Flowchart illustrating the steps for the unbinding protocol.
Figure 3
Figure 3
Comparison between GBDT (top) and MLTSA with NN (bottom) feature analysis methods for the 1 DW data set. Correlated features are marked from blue (0%) to red (100%) depending on the mixing coefficient, α (× symbols, color scale on the right, five highest mixing coefficients are also displayed for the data points). Uncorrelated features (small black symbols) are at 0 FI for GBDT and show no loss of accuracy for MLTSA with MLPs. Correlated features all show a significant accuracy drop for the MLP, while only the top correlated features have high FI using GBDT.
Figure 4
Figure 4
PMF of the unbinding path for 18K (a) and 62K (b). The free energy profile is obtained from a representative replica; the standard error, shown as a shaded area, is obtained by dividing the full data set into four subgroups.
Figure 5
Figure 5
(a) Unbinding trajectory of ligand 62K represented as selected snapshots along the trajectory at 0, 70, 141, and 219 ns from left to right, respectively. Representative distances used for the bias are shown as colored dashed lines (for the full set of distances, please refer to Figure S2.I–VI). Representative distances included in the CV along the unbinding trajectory are shown in (b) and the corresponding distance values plotted in (c). The lower dashed line at 3.5 Å is the cutoff value below which an interaction is included in the main CV; the upper cutoff at 6 Å is the value above which the distance is excluded from the CV.
Figure 6
Figure 6
CVs obtained from the unbinding of 18K (a) and 62K (b); representative distances shown in dashed lines (yellow: interaction from the initial structure, cyan: interaction found during the unbinding trajectory); red sticks represent the coordinate of the ligand when it is outside the pocket. These distances appear in each of the three replicas for each system.
Figure 7
Figure 7
Representation of the PMF of ligand 62K along the string coordinate and the path of multiple downhill trajectories started at the TS (in green) for further analysis. From the TS coordinate as a starting point, a set of simulations leading to both an IN position (blue) and an OUT position (red) are represented as lines. The green dots illustrate the free energy profile data points obtained from the WHAM calculation using the string window as string coordinate. The green line represents the fitting obtained from the green dots. The yellow shade represents the simulation time portion used for analysis during our machine learning-based approach.
Figure 8
Figure 8
Identification of the essential distances (feature reduction) from the largest accuracy drop using the last 50% (yellow), 25% (red), and 10% (blue) of the frames up to the first 0.3 ns of the simulations for (a) 18K and (c) 62K. The different shades in the background group the different features according to the atom of the ligand involved. Features presenting a significant decrease in accuracy are labeled (see Table S2.II) and portrayed as a three-dimensional (3D) representation on the right side of each plot: (b) 18K and (d) 62K.

Similar articles

Cited by

References

    1. Copeland R. A.Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists, 2nd ed.; John Wiley and Sons, 2013. - PubMed
    1. Copeland R. A. The Drug-Target Residence Time Model: A 10-Year Retrospective. Nat. Rev. Drug Discovery 2016, 15, 87–95. 10.1038/nrd.2015.18. - DOI - PubMed
    1. Bernetti M.; Masetti M.; Rocchia W.; Cavalli A. Kinetics of Drug Binding and Residence Time. Annu. Rev. Phys. Chem. Annu. Rev. Phys. Chem. 2019, 70, 143–171. 10.1146/annurev-physchem-042018-052340. - DOI - PubMed
    1. Copeland R. A.; Pompliano D. L.; Meek T. D. Drug–Target Residence Time and Its Implications for Lead Optimization. Nat. Rev. Drug Discovery 2006, 5, 730–739. 10.1038/nrd2082. - DOI - PubMed
    1. Lu H.; Tonge P. J. Drug-Target Residence Time: Critical Information for Lead Optimization. Curr. Opin. Chem. Biol. 2010, 14, 467–474. 10.1016/j.cbpa.2010.06.176. - DOI - PMC - PubMed

LinkOut - more resources

-