Enhancing generalizability and performance in drug-target interaction identification by integrating pharmacophore and pre-trained models
- PMID: 38940179
- PMCID: PMC11211825
- DOI: 10.1093/bioinformatics/btae240
Abstract
Motivation: In drug discovery, it is crucial to assess the drug-target binding affinity (DTA). Although molecular docking is widely used, its computational cost limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, these methods have three limitations. First, existing methods only consider the atom-bond graph or one-dimensional sequence representations of compounds, ignoring the information about functional groups (pharmacophores) with specific biological activities. Second, models that rely on limited labeled datasets fail to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information.
Results: To address these limitations, we propose a novel DTA prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue contact graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study.
Availability and implementation: The source code and data are available at https://github.com/daydayupzzl/HeteroDTA.
© The Author(s) 2024. Published by Oxford University Press.
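The abstract reports cold-start experiments without describing the protocol. As a rough illustration only, the sketch below shows one common way such a split is built: whole drugs are held out so that no test compound appears in training. The function name `cold_drug_split`, the `(drug_id, target_id, affinity)` tuple layout, and the 20% default are assumptions made for this example and are not taken from the HeteroDTA source code.

```python
import random
from typing import List, Tuple

# Hypothetical record layout: (drug_id, target_id, affinity).
Pair = Tuple[str, str, float]


def cold_drug_split(
    pairs: List[Pair], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs so that every drug in the test set is unseen in training.

    This is a generic cold-drug split, not the authors' exact protocol.
    """
    # Sort first so the shuffle is reproducible for a given seed.
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)

    # Hold out at least one drug, even for very small datasets.
    n_test = max(1, round(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])

    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


if __name__ == "__main__":
    # Toy data: 5 drugs x 3 targets with placeholder affinities.
    toy = [(f"D{i}", f"T{j}", 5.0) for i in range(5) for j in range(3)]
    train, test = cold_drug_split(toy)
    print(len(train), len(test))
```

A cold-target split follows the same pattern with `p[1]` in place of `p[0]`. Holding out whole entities rather than random pairs is what makes the evaluation a test of generalization to unseen compounds or proteins.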
Similar articles
- Prediction of drug-target binding affinity based on deep learning models. Comput Biol Med. 2024 May;174:108435. doi: 10.1016/j.compbiomed.2024.108435. Epub 2024 Apr 8. PMID: 38608327. Review.
- G-K BertDTA: A graph representation learning and semantic embedding-based framework for drug-target affinity prediction. Comput Biol Med. 2024 May;173:108376. doi: 10.1016/j.compbiomed.2024.108376. Epub 2024 Mar 25. PMID: 38552281.
- Drug-target affinity prediction with extended graph learning-convolutional networks. BMC Bioinformatics. 2024 Feb 16;25(1):75. doi: 10.1186/s12859-024-05698-6. PMID: 38365583. Free PMC article.
- Drug-target affinity prediction method based on multi-scale information interaction and graph optimization. Comput Biol Med. 2023 Dec;167:107621. doi: 10.1016/j.compbiomed.2023.107621. Epub 2023 Oct 29. PMID: 37907030.
- GeneralizedDTA: combining pre-training and multi-task learning to predict drug-target binding affinity for unknown drug discovery. BMC Bioinformatics. 2022 Sep 7;23(1):367. doi: 10.1186/s12859-022-04905-6. PMID: 36071406. Free PMC article.