J Cheminform. 2023 Feb 23;15(1):27.
doi: 10.1186/s13321-023-00700-4.

Double-head transformer neural network for molecular property prediction

Yuanbing Song et al. J Cheminform.

Abstract

Existing deep-learning methods for molecular property prediction neglect two things: how well the nonlinear representation of molecular features generalizes, and how weights are assigned to those features. Both limit further gains in prediction accuracy. To address these problems, this paper proposes an end-to-end double-head transformer neural network (DHTNN) for high-precision molecular property prediction. To suit the data distribution of molecular datasets, DHTNN introduces a new activation function, beaf, which greatly improves the generalization ability of the nonlinear representation of molecular features. A residual network is introduced in the molecular encoding part to solve the gradient explosion problem and ensure that the model converges quickly. A transformer based on double-head attention extracts the intrinsic detail features of molecules and assigns weights to them appropriately, so that molecular properties can be predicted with high accuracy. Tested on the MoleculeNet [1] benchmark datasets, the model showed significant performance improvements over other state-of-the-art methods.

Keywords: Deep learning; Molecular property prediction; Residual network; Transformer.
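
The abstract names two building blocks, a residual connection and attention with two heads, without giving their equations. The sketch below illustrates only the generic technique: standard scaled dot-product self-attention run in two heads, concatenated, projected, and added back to the input. It is not the authors' DHTNN implementation. The beaf activation is omitted because its formula is not stated here, and every name, dimension, and random weight in the code is a hypothetical chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x, wq, wk, wv):
    """Scaled dot-product self-attention for one head.

    Returns the head output and the attention weights (one row per atom).
    """
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                       # pairwise atom scores
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def double_head_block(x, heads, wo):
    """Run each head, concatenate outputs, project with wo, add the residual."""
    outputs = [attention_head(x, *h)[0] for h in heads]
    concat = [sum((o[i] for o in outputs), []) for i in range(len(x))]
    projected = matmul(concat, wo)
    return [[a + b for a, b in zip(xr, pr)] for xr, pr in zip(x, projected)]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or atom features."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed toy sizes: 4 atoms, feature width 6, two heads of width 3.
rng = random.Random(0)
x = random_matrix(4, 6, rng)
heads = [tuple(random_matrix(6, 3, rng) for _ in range(3)) for _ in range(2)]
wo = random_matrix(6, 6, rng)
out = double_head_block(x, heads, wo)
print(len(out), len(out[0]))  # 4 6
```

Because the block adds its input back to the projected attention output, it reduces to the identity when the projection is zero, which is the usual argument for why residual connections ease optimization.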

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overall DHTNN architectural diagram. A High-precision nonlinear generalization representation of molecular features. B Molecular residual network encoding. C Molecular feature extraction by the Transformer based on double-head attention
Fig. 2
Plots of Tanh (a), ReLU (b), ELU (c), GeLU (d) and Beaf (e)
Fig. 3
Diagram of the molecular residual network encoding framework. The framework contains a directed MPNN, a batch normalization layer, a molecular feed-forward neural network, and a residual network
Fig. 4
Molecular feature extraction by the Transformer based on double-head attention. a Molecular intrinsic detail feature extraction. b Layer structure for integrating intrinsic detail features. c Adjusting the data distribution before output
Fig. 5
Performance of the model on the Lipophilicity (a), PDBbind (b), PCBA (c), BACE (d), Tox21 (e) and SIDER (f) datasets. RMSE was calculated for Lipophilicity (a) and PDBbind (b); the lower the RMSE, the better the model performance. AUC was calculated for PCBA (c), BACE (d), Tox21 (e) and SIDER (f); the higher the AUC, the better the model performance. Datasets were split randomly
Fig. 6
Performance of the model on the Lipophilicity (a), PDBbind (b), PCBA (c), BACE (d), Tox21 (e) and SIDER (f) datasets. RMSE was calculated for Lipophilicity (a) and PDBbind (b); the lower the RMSE, the better the model performance. AUC was calculated for PCBA (c), BACE (d), Tox21 (e) and SIDER (f); the higher the AUC, the better the model performance. Datasets were split by scaffold
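
The captions for Fig. 5 and Fig. 6 report RMSE for the regression datasets and AUC for the classification datasets. As a reminder of what those two numbers measure, here is a minimal sketch using their standard definitions: root mean squared error, and ROC AUC computed as the probability that a randomly chosen positive is scored above a randomly chosen negative. The function names and the toy inputs are hypothetical; this is not the paper's evaluation code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error; lower is better."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison; higher is better.

    Counts the fraction of (positive, negative) pairs in which the
    positive receives the higher score. Ties count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, invented for illustration only.
print(round(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), 4))  # 1.1547
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))      # 0.75
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; rank-based implementations give the same value more efficiently on large datasets such as PCBA.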

References

    1. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V. Moleculenet: a benchmark for molecular machine learning. Chem Sci. 2018;9(2):513–530. doi: 10.1039/C7SC02664A.
    2. Li J, Jiang X. Mol-bert: an effective molecular representation with bert for molecular property prediction. Wirel Commun Mob Comput. 2021;2021:1–7. doi: 10.1155/2021/7181815.
    3. Toussi CA, Haddadnia J, Matta CF. Drug design by machine-trained elastic networks: predicting ser/thr-protein kinase inhibitors’ activities. Mol Divers. 2021;25(2):899–909. doi: 10.1007/s11030-020-10074-6.
    4. Cheng J, Zhang C, Dong L. A geometric-information-enhanced crystal graph network for predicting properties of materials. Commun Mater. 2021;2(1):1–11. doi: 10.1038/s43246-021-00194-3.
    5. Woo G, Fernandez M, Hsing M, Lack NA, Cavga AD, Cherkasov A. Deepcop: deep learning-based approach to predict gene regulating effects of small molecules. Bioinformatics. 2020;36(3):813–818. doi: 10.1093/bioinformatics/btz645.
