Machine learning reduced workload with minimal risk of missing studies: development and evaluation of a randomized controlled trial classifier for Cochrane Reviews
- PMID: 33171275
- PMCID: PMC8168828
- DOI: 10.1016/j.jclinepi.2020.11.003
Abstract
Objectives: This study developed, calibrated, and evaluated a machine learning classifier designed to reduce the study identification workload involved in producing Cochrane systematic reviews.
Methods: A machine learning classifier for retrieving randomized controlled trials (RCTs) was developed (the "Cochrane RCT Classifier"), with the algorithm trained using a data set of title-abstract records from Embase, manually labeled by the Cochrane Crowd. The classifier was then calibrated using a further data set of similar records manually labeled by the Clinical Hedges team, aiming for 99% recall. Finally, the recall of the calibrated classifier was evaluated using records of RCTs included in Cochrane Reviews that had abstracts of sufficient length to allow machine classification.
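The calibration step described above, choosing a score cut-off that meets a recall target on a labeled data set, can be illustrated with a minimal sketch. This is not the Cochrane RCT Classifier's own code: the function names, the toy scores, and the rule "classify as RCT when score >= threshold" are assumptions made for illustration only.

```python
import math


def threshold_for_recall(scores, labels, target_recall=0.99):
    """Return the highest cut-off whose recall on this labeled set meets the target.

    Hypothetical helper: `scores` are classifier outputs (higher = more RCT-like),
    `labels` are 1 for RCT and 0 otherwise.
    """
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("calibration set contains no positive records")
    # Number of positives that must score at or above the cut-off.
    # The small epsilon guards against floating-point overshoot in the product.
    needed = max(1, math.ceil(target_recall * len(positive_scores) - 1e-9))
    return positive_scores[needed - 1]


def recall_and_precision(scores, labels, threshold):
    """Recall and precision when records with score >= threshold are kept."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Toy calibration data (invented for illustration, not from the study).
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

toy_threshold = threshold_for_recall(toy_scores, toy_labels, target_recall=0.8)
toy_recall, toy_precision = recall_and_precision(toy_scores, toy_labels, toy_threshold)
```

Lowering the cut-off to capture nearly all RCTs also admits more non-RCT records, which is why a recall target close to 1 typically comes with low precision.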
Results: The Cochrane RCT Classifier was trained using 280,620 records (20,454 of which reported RCTs). A classification threshold was set using 49,025 calibration records (1,587 of which reported RCTs), and our bootstrap validation found the classifier had recall of 0.99 (95% confidence interval 0.98-0.99) and precision of 0.08 (95% confidence interval 0.06-0.12) in this data set. The final, calibrated RCT classifier correctly retrieved 43,783 (99.5%) of 44,007 RCTs included in Cochrane Reviews but missed 224 (0.5%). Older records were more likely to be missed than those more recently published.
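The evaluation figures above can be checked with simple arithmetic, and the idea of a bootstrap confidence interval for recall can be sketched in a few lines. The abstract does not describe the bootstrap procedure used, so the percentile bootstrap below is a generic stand-in, and the `toy_hits` list is invented for illustration rather than taken from the study.

```python
import random


def recall(hits):
    """Fraction of true RCT records that were retrieved (1 = retrieved, 0 = missed)."""
    return sum(hits) / len(hits)


def bootstrap_recall_ci(hits, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for recall (a generic sketch, not the authors' method)."""
    rng = random.Random(seed)
    n = len(hits)
    # Resample the positive records with replacement and recompute recall each time.
    estimates = sorted(recall(rng.choices(hits, k=n)) for _ in range(n_resamples))
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Figures reported in the abstract for the Cochrane Reviews evaluation set.
retrieved, missed = 43_783, 224
total = retrieved + missed              # 44,007 RCTs
evaluation_recall = retrieved / total   # about 0.995
missed_share = missed / total           # about 0.005

# Hypothetical outcome list: 990 RCTs retrieved and 10 missed.
toy_hits = [1] * 990 + [0] * 10
toy_lower, toy_upper = bootstrap_recall_ci(toy_hits)
```

The interval bounds depend on the random seed and the number of resamples, so they are best read as approximate.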
Conclusions: The Cochrane RCT Classifier can reduce manual study identification workload for Cochrane Reviews, with a very low and acceptable risk of missing eligible RCTs. This classifier now forms part of the Evidence Pipeline, an integrated workflow deployed within Cochrane to help improve the efficiency of the study identification processes that support systematic review production.
Keywords: Automation; Cochrane Library; Crowdsourcing; Information retrieval; Machine learning; Methods/methodology; Randomized controlled trials; Searching; Study classifiers; Systematic reviews.
Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.