A review of machine learning methods for cancer characterization from microbiome data
- PMID: 38816569
- PMCID: PMC11139966
- DOI: 10.1038/s41698-024-00617-7
Abstract
Recent studies have shown that the microbiome can impact cancer development, progression, and response to therapies, suggesting microbiome-based approaches for cancer characterization. As cancer-related signatures are complex and implicate many taxa, their discovery often requires Machine Learning (ML) approaches. This review discusses ML methods for cancer characterization from microbiome data. It focuses on the implications of choices made during sample collection, feature selection, and pre-processing. It also discusses ML model selection, offering guidance on how to choose an ML model, and model validation. Finally, it enumerates current limitations and how these may be overcome. Proposed methods, often based on Random Forests, show promising results, but these are insufficient for widespread clinical use. Studies often report conflicting results, mainly due to ML models with poor generalizability. We expect that evaluating models with expanded, hold-out datasets, removing technical artifacts, exploring representations of the microbiome other than taxonomic profiles, leveraging advances in deep learning, and developing ML models better adapted to the characteristics of microbiome data will improve the performance and generalizability of models and enable their use in the clinic.
© 2024. The Author(s).
Conflict of interest statement
R.M.F. and C.F. own patent WO/2018/169423 on microbiome markers for gastric cancer. The remaining authors declare no competing interests.
Similar articles
- A toolbox of machine learning software to support microbiome analysis. Front Microbiol. 2023 Nov 22;14:1250806. doi: 10.3389/fmicb.2023.1250806. eCollection 2023. PMID: 38075858. Review.
- Gene-based microbiome representation enhances host phenotype classification. mSystems. 2023 Aug 31;8(4):e0053123. doi: 10.1128/msystems.00531-23. Epub 2023 Jul 5. PMID: 37404032.
- Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations. Bioengineering (Basel). 2023 Feb 8;10(2):231. doi: 10.3390/bioengineering10020231. PMID: 36829725.
- Benchmark of Data Processing Methods and Machine Learning Models for Gut Microbiome-Based Diagnosis of Inflammatory Bowel Disease. Front Genet. 2022 Feb 14;13:784397. doi: 10.3389/fgene.2022.784397. eCollection 2022. PMID: 35251123.
- Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment. Front Microbiol. 2021 Feb 19;12:634511. doi: 10.3389/fmicb.2021.634511. eCollection 2021. PMID: 33737920. Review.