JMIR Form Res. 2021 Dec 2;5(12):e23440.
doi: 10.2196/23440.

Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models

Eman M Alanazi et al. JMIR Form Res.

Abstract

Background: Stroke, a cerebrovascular disease, is one of the major causes of death. It causes significant health and financial burdens for both patients and health care systems. One of the important risk factors for stroke is health-related behavior, which is becoming an increasingly important focus of prevention. Many machine learning models have been built to predict the risk of stroke or to automatically diagnose stroke, using predictors such as lifestyle factors or radiological imaging. However, there have been no models built using data from lab tests.

Objective: The aim of this study was to apply computational methods using machine learning techniques to predict stroke from lab test data.

Methods: We used the National Health and Nutrition Examination Survey data sets with three different data selection methods (ie, without data resampling, with data imputation, and with data resampling) to develop predictive models. We used four machine learning classifiers and six performance measures to evaluate the performance of the models.
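
The abstract does not say which resampling scheme was used. As a minimal sketch of one common option, the snippet below assumes random oversampling: minority-class rows are duplicated at random until the classes are the same size. The function name `random_oversample` and the toy data are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest.

    Assumption: this is one simple resampling scheme, not necessarily the one
    used in the study.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 non-stroke rows (label 0) and 2 stroke rows (label 1).
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is usually applied to the training split only, so that duplicated rows do not also appear in the data used for evaluation.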

Results: We found that accurate and sensitive machine learning models can be created to predict stroke from lab test data. Our results show that the data resampling approach performed the best compared to the other two data selection techniques. Prediction with the random forest algorithm, which was the best algorithm tested, achieved an accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of 0.96, 0.97, 0.96, 0.75, 0.99, and 0.97, respectively, when all of the attributes were used.
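
The six performance measures named above have standard definitions, sketched here in plain Python. The helper names and the confusion-matrix counts are hypothetical and are not the study's data. They are chosen only to show that positive predictive value can sit well below sensitivity and specificity when positive cases are rare.

```python
def classification_metrics(tp, fp, tn, fn):
    """Five threshold-based measures computed from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive case gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: 100 positive cases and 900 negative cases.
metrics = classification_metrics(tp=90, fp=30, tn=870, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

# Hypothetical scores for four cases (two negative, two positive).
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"auc: {example_auc:.2f}")
```

In this made-up example, 30 false positives against 90 true positives pull the positive predictive value down to 0.75, even though fewer than 4% of the negative cases are misclassified.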

Conclusions: The predictive model, built using data from lab tests, was easy to use and had high accuracy. In future studies, we aim to use data that reflect different types of stroke and to explore the data to build a prediction model for each type.

Keywords: lab tests; machine learning technology; predictive analytics; stroke.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Flow diagram of the study methodology. NHANES: National Health and Nutrition Examination Survey.
Figure 2. Participant selection and prevalence of stroke in the National Health and Nutrition Examination Survey (NHANES).
Figure 3. Performance comparison among three data selection techniques for the decision tree model. AUC: area under the curve; NPV: negative predictive value; PPV: positive predictive value.
Figure 4. Performance comparison among three data selection techniques for the random forest model. AUC: area under the curve; NPV: negative predictive value; PPV: positive predictive value.
