BMC Bioinformatics. 2024 Jan 17;25(1):32.
doi: 10.1186/s12859-024-05649-1.

MSCAN: multi-scale self- and cross-attention network for RNA methylation site prediction

Honglei Wang et al. BMC Bioinformatics. 2024.

Abstract

Background: Epitranscriptome regulation through post-transcriptional RNA modifications is essential for all RNA types. Precise recognition of RNA modifications is critical for understanding their functions and regulatory mechanisms. However, wet-lab experimental methods are often costly and time-consuming, which limits how widely they can be applied. Recent research has therefore focused on computational methods, particularly deep learning (DL). Bidirectional long short-term memory (BiLSTM), convolutional neural networks (CNN), and the Transformer have all been applied successfully to modification site prediction. Each has a limitation, however: BiLSTM cannot be computed in parallel, which leads to long training times; CNN cannot learn long-range dependencies in the sequence; and the Transformer lacks information interaction between sequences at different scales. These limitations motivate an enhanced prediction framework, drawing on natural language processing (NLP) and DL, that addresses them.

Results: This study presents a multi-scale self- and cross-attention network (MSCAN) that identifies RNA methylation sites using NLP and DL techniques. Experimental results on twelve RNA modification sites (m6A, m1A, m5C, m5U, m6Am, m7G, Ψ, I, Am, Cm, Gm, and Um) show that MSCAN achieves areas under the receiver operating characteristic curve (AUROC) of 98.34%, 85.41%, 97.29%, 96.74%, 99.04%, 79.94%, 76.22%, 65.69%, 92.92%, 92.03%, 95.77%, and 89.66%, respectively, outperforming the state-of-the-art prediction model. This indicates that the model has strong generalization capabilities. Furthermore, MSCAN reveals a strong association among different types of RNA modifications from an experimental perspective. A user-friendly web server for predicting these twelve widely occurring human RNA modification sites is available at http://47.242.23.141/MSCAN/index.php.
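To make the distinction between self-attention and cross-attention over sequences at different scales concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation: the window widths (5 and 11), the toy sequence, the one-hot encoding, and the helper names `center_window` and `attention` are all assumptions chosen for illustration, and the learned projection matrices of a real Transformer layer are omitted.

```python
import math

ALPHABET = "ACGU"

def one_hot(seq):
    """Encode each nucleotide as a 4-dimensional one-hot vector (assumed encoding)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]

def center_window(seq, width):
    """Crop `width` nucleotides (odd) centred on the middle position of `seq`."""
    mid = len(seq) // 2
    half = width // 2
    return seq[mid - half: mid + half + 1]

def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, without learned projections."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        output.append([sum(w * v[j] for w, v in zip(weights, values))
                       for j in range(len(values[0]))])
    return output

# Toy 15-nt sequence; the candidate site sits at the centre (index 7).
site_context = "GGACUAGGACUUGAC"
short = one_hot(center_window(site_context, 5))    # hypothetical small scale
long_ = one_hot(center_window(site_context, 11))   # hypothetical large scale

# Self-attention: queries, keys, and values all come from the same scale.
self_out = attention(short, short, short)
# Cross-attention: queries from the short scale attend to the long scale.
cross_out = attention(short, long_, long_)
```

In this sketch the cross-attention output has one row per position of the short window, and each row mixes information drawn from the longer window, which is the kind of cross-scale interaction the abstract says a plain Transformer lacks.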

Conclusions: A predictor framework has been developed through binary classification to predict RNA methylation sites.
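For a binary site/non-site predictor, the area under the ROC curve (AUROC) equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the paper's data or evaluation code.

```python
def auroc(labels, scores):
    """AUROC by pairwise comparison of positives and negatives; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = modified site, 0 = unmodified site.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; rank-based formulations give the same value more efficiently on large test sets.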

Keywords: Cross-attention; Multi-scale; Predictor; RNA methylation; Self-attention; Transformer.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1: Performance of the MSCAN model based on the different feature encoding
Fig. 2: Performance of MSCAN and variant model on the test data
Fig. 3: Performance of the different models on the training data
Fig. 4: The ROC and PRC of MSCAN and other state-of-the-art models on the test data
Fig. 5: Heat map of different AUROC values in cross-methylation validation. The horizontal axis is the model type, and the vertical axis is the test data type
Fig. 6: Webserver interface. a. Input interface. b. Prediction result
Fig. 7: The motif of methylation sites. a m1A in the dataset of Chen et al. b m6A. c Ψ. d m1A. e m6Am. f Am. g Cm. h Gm. i Um. j m5C. k m5U. l m7G. m I in the dataset of Song et al.
Fig. 8: Structure of our computational framework based on multi-scale self- and cross-attention network to predict m1A methylation site
Fig. 9: Schematic diagram of the obtained subsequences
Fig. 10: The internal structure of the multi-scale self- and cross-attention network

References

    1. El Allali A, Elhamraoui Z, Daoud R. Machine learning applications in RNA modification sites prediction. Comput Struct Biotechnol J. 2021;19:5510–5524.
    2. Wang H, Wang SY, Zhang Y, Bi SD, Zhu XL. A brief review of machine learning methods for RNA methylation sites prediction. Methods. 2022;203:399–421.
    3. Liu L, Song B, Ma J, Song Y, Meng J. Bioinformatics approaches for deciphering the epitranscriptome: recent progress and emerging topics. Comput Struct Biotechnol J. 2020;18:1587–1604.
    4. Chen LF, Tan XQ, Wang DY, Zhong FS, Liu XH, Yang TB, Luo XM, Chen KX, Jiang HL, Zheng MY. TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics. 2020;36(16):4406–4414.
    5. Song ZT, Huang DY, Song BW, Chen KQ, Song YY, Liu G, Su JL, de Magalhaes JP, Rigden DJ, Meng J. Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications. Nat Commun. 2021;12(1):1–11.