Construction of environmental risk score beyond standard linear models using machine learning methods: application to metal mixtures, oxidative stress and cardiovascular disease in NHANES
- PMID: 28950902
- PMCID: PMC5615812
- DOI: 10.1186/s12940-017-0310-9
Abstract
Background: There is growing concern about the health effects of exposure to pollutant mixtures. We previously proposed an Environmental Risk Score (ERS) as a summary measure for examining the risk of exposure to multiple pollutants in epidemiologic research, considering only pollutant main effects. Here we expand the ERS to account for pollutant-pollutant interactions using modern machine learning methods. We illustrate the multi-pollutant approaches by predicting gamma-glutamyl transferase (GGT), a marker of oxidative stress; oxidative stress is a common disease pathway linking environmental exposure and numerous health endpoints.
Methods: We examined 20 metal biomarkers measured in urine or whole blood from 6 cycles of the National Health and Nutrition Examination Survey (NHANES 2003-2004 to 2013-2014, n = 9664). We randomly split the data evenly into training and testing sets. In the training set, we constructed ERS's of metal mixtures for GGT using adaptive elastic-net with main effects and pairwise interactions (AENET-I), Bayesian additive regression tree (BART), Bayesian kernel machine regression (BKMR), and Super Learner; we then evaluated their performance in the testing set. We also evaluated the associations between GGT-ERS and cardiovascular endpoints.
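The split-and-score design described in the Methods can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the placeholder metal names, the seed, and the coefficient values are all hypothetical, and the exposures are assumed to be already standardized. It only shows how a random even split is formed and how a score with main effects and pairwise interactions is a weighted sum over 20 main-effect terms plus 190 pairwise products. Fitting the weights (for example by adaptive elastic-net) is not shown, and whether squared terms are also included is not stated in the abstract.

```python
import itertools
import random


def split_evenly(ids, seed=0):
    """Randomly split participant ids into two halves (training, testing)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def expand_terms(exposures):
    """Return main effects plus all distinct pairwise products.

    exposures: dict mapping exposure name -> (assumed standardized) value.
    """
    terms = dict(exposures)
    for a, b in itertools.combinations(sorted(exposures), 2):
        terms[f"{a}:{b}"] = exposures[a] * exposures[b]
    return terms


def risk_score(exposures, coefs):
    """Weighted sum of the terms that have a (nonzero) coefficient."""
    terms = expand_terms(exposures)
    return sum(beta * terms[name] for name, beta in coefs.items())


# Placeholder names standing in for the 20 metal biomarkers.
metals = [f"metal_{i:02d}" for i in range(1, 21)]
n_terms = len(expand_terms({m: 0.0 for m in metals}))  # 20 main + 190 pairs

# Even random split of the n = 9664 participants.
train_ids, test_ids = split_evenly(range(9664))

# Hypothetical coefficients, as if selected in the training half.
coefs = {"metal_01": 0.10, "metal_02": -0.05, "metal_01:metal_02": 0.02}

# One hypothetical participant from the testing half.
person = {m: 0.0 for m in metals}
person["metal_01"] = 1.0
person["metal_02"] = 2.0
ers = risk_score(person, coefs)
```

Keeping the coefficients in a sparse dictionary mirrors the effect of a penalized fit, where most of the 210 candidate terms would receive zero weight and drop out of the score.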
Results: ERS based on AENET-I performed better than other approaches in terms of prediction errors in the testing set. Important metals identified in relation to GGT include cadmium (urine), dimethylarsonic acid, monomethylarsonic acid, cobalt, and barium. All ERS's showed significant associations with systolic and diastolic blood pressure and hypertension. For hypertension, a one-SD increase in the ERS from AENET-I, BART, and Super Learner was associated with odds ratios of 1.26 (95% CI, 1.15, 1.38), 1.17 (1.09, 1.25), and 1.30 (1.20, 1.40), respectively. ERS's showed non-significant positive associations with mortality outcomes.
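To make the per-SD odds ratios concrete, the arithmetic below converts a reported odds ratio and its 95% CI back to the log-odds scale. It is a sketch under assumptions not stated in the abstract: that the estimates come from a logistic model, that the intervals are Wald-type (symmetric on the log scale, with z = 1.96), and that the effect is log-linear in the ERS, so a shift of k SDs multiplies the odds by OR^k. The function names are hypothetical.

```python
import math


def log_odds_and_se(or_point, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from an
    odds ratio and an (assumed) Wald-type confidence interval."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_for_shift(or_per_sd, n_sd):
    """Odds ratio implied by a shift of n_sd SDs, assuming log-linearity."""
    return or_per_sd ** n_sd


# Reported per-SD odds ratios for hypertension (point, lower, upper).
reported = {
    "AENET-I": (1.26, 1.15, 1.38),
    "BART": (1.17, 1.09, 1.25),
    "Super Learner": (1.30, 1.20, 1.40),
}

for method, (or_point, low, high) in reported.items():
    beta, se = log_odds_and_se(or_point, low, high)
    two_sd = odds_ratio_for_shift(or_point, 2)
    print(f"{method}: beta={beta:.3f}, se={se:.3f}, OR for 2 SD={two_sd:.2f}")
```

Because the published values are rounded to two decimals, the recovered coefficients and standard errors are approximate.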
Conclusions: ERS is a useful tool for characterizing cumulative risk from pollutant mixtures while accounting for statistical challenges such as high degrees of correlation and pollutant-pollutant interactions. An ERS constructed for an intermediate marker like GGT is predictive of related disease endpoints.
Keywords: Bayesian additive regression tree (BART); Bayesian kernel machine regression (BKMR); Cardiovascular disease; Elastic-net; Environmental risk score (ERS); Machine learning; Metals; Mixtures; Multipollutants; Super Learner.
Conflict of interest statement
Ethics approval and consent to participate
NHANES is a publicly available data set and all participants in NHANES provide written informed consent, consistent with approval by the National Center for Health Statistics Institutional Review Board.
Consent for publication
Not Applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.