Automated Category and Trend Analysis of Scientific Articles on Ophthalmology Using Large Language Models: Development and Usability Study

JMIR Form Res. 2024 Mar 22;8:e52462. doi: 10.2196/52462.

Hina Raja¹, Asim Munawar², Nikolaos Mylonas³, Mohammad Delsoz¹, Yeganeh Madadi¹, Muhammad Elahi⁴, Amr Hassan⁵, Hashem Abu Serhan⁶, Onur Inam⁷,⁸, Luis Hernandez⁹, Hao Chen¹,¹⁰, Sang Tran¹¹, Wuqaas Munir¹¹, Alaa Abd-Alrazaq¹², Siamak Yousefi¹

Affiliations

¹ Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, TN, United States.
² Watson Research Center, IBM Research, New York, NY, United States.
³ School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece.
⁴ Quillen College of Medicine, East Tennessee State University, Johnson, TN, United States.
⁵ Gavin Herbert Eye Institute, School of Medicine, University of California, Irvine, CA, United States.
⁶ Department of Ophthalmology, Hamad Medical Corporation, Doha, Qatar.
⁷ Edward S. Harkness Eye Institute, Vagelos College of Physicians and Surgeons, Columbia University Irving Medical Center, New York, NY, United States.
⁸ Department of Biophysics, Faculty of Medicine, Gazi University, Ankara, Turkey.
⁹ Association to Prevent Blindness in Mexico, Ciudad, Mexico.
¹⁰ Department of Pharmacology, Addiction Science and Toxicology, University of Tennessee Health Science Center, Memphis, TN, United States.
¹¹ Department of Ophthalmology and Visual Sciences, School of Medicine, University of Maryland, Baltimore, MD, United States.
¹² AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar.

PMID: 38517457
PMCID: PMC10998173
DOI: 10.2196/52462

Abstract

Background: In this paper, we present an automated method for article classification, leveraging the power of large language models (LLMs).

Objective: The aim of this study is to evaluate the applicability of various LLMs based on textual content of scientific ophthalmology papers.

Methods: We developed a model based on natural language processing techniques, including advanced LLMs, to process and analyze the textual content of scientific papers. Specifically, we used zero-shot learning LLMs and compared Bidirectional and Auto-Regressive Transformers (BART) and its variants with Bidirectional Encoder Representations from Transformers (BERT) and its variants, such as distilBERT, SciBERT, PubmedBERT, and BioBERT. To evaluate the LLMs, we compiled a data set (retinal diseases [RenD]) of 1000 ocular disease-related articles, which were expertly annotated by a panel of 6 specialists into 19 distinct categories. In addition to the classification of articles, we also performed analysis on different classified groups to find the patterns and trends in the field.
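As an illustration of the zero-shot setup described in the Methods, the following is a minimal sketch rather than the authors' implementation. It assumes the Hugging Face `transformers` zero-shot classification pipeline with the `facebook/bart-large-mnli` checkpoint; the abstract does not state which BART checkpoint was used. The category names and the helper functions `top_label` and `classify_abstract` are hypothetical, since the abstract does not list the 19 categories.

```python
from typing import Dict, List

# Hypothetical category names for illustration only; the abstract does not
# enumerate the 19 categories used for the RenD data set.
CANDIDATE_LABELS: List[str] = [
    "diabetic retinopathy",
    "age-related macular degeneration",
    "diabetic macular edema",
    "central serous retinopathy",
]


def top_label(result: Dict[str, list], min_score: float = 0.0) -> str:
    """Return the highest-scoring label from a zero-shot result.

    `result` is expected to carry parallel "labels" and "scores" lists, which
    is the shape the transformers zero-shot pipeline returns for one input.
    Falls back to "unclassified" when the best score is below `min_score`.
    """
    best_score, best_label = max(zip(result["scores"], result["labels"]))
    return best_label if best_score >= min_score else "unclassified"


def classify_abstract(text: str, labels: List[str] = CANDIDATE_LABELS) -> str:
    """Classify one abstract without any task-specific training.

    Assumption: facebook/bart-large-mnli stands in for the BART model; the
    import is done lazily so that top_label can be used without transformers.
    """
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-classification", model="facebook/bart-large-mnli"
    )
    result = classifier(text, candidate_labels=labels)
    return top_label(result)
```

Calling `classify_abstract(abstract_text)` downloads the model on first use. Swapping the `model` argument is how one would compare other checkpoints under the same labels.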

Results: The classification results demonstrate the effectiveness of LLMs in categorizing a large number of ophthalmology papers without human intervention. The model achieved a mean accuracy of 0.86 and a mean F₁-score of 0.85 based on the RenD data set.
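For readers who want to reproduce this kind of evaluation on their own annotated set, the sketch below computes accuracy and an F₁-score from expert labels and model predictions. The abstract does not say how the means were taken, so macro-averaging F₁ over categories is an assumption here, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of articles whose predicted category matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-category F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels (not data from the study).
expert = ["DR", "DR", "AMD", "AMD"]
model = ["DR", "AMD", "AMD", "AMD"]
print(accuracy(expert, model))   # 0.75
print(macro_f1(expert, model))   # (2/3 + 4/5) / 2 = 11/15
```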

Conclusions: The proposed framework achieves notable improvements in both accuracy and efficiency. Its application in the domain of ophthalmology showcases its potential for knowledge organization and retrieval. We performed a trend analysis that enables researchers and clinicians to easily categorize and retrieve relevant papers, saving time and effort in literature review and information gathering as well as identification of emerging scientific trends within different disciplines. Moreover, the extendibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.

Keywords: BART; BERT; Bidirectional and Auto-Regressive Transformers; LLM; bidirectional encoder representations from transformers; large language model; ophthalmology; text classification; trend analysis.

©Hina Raja, Asim Munawar, Nikolaos Mylonas, Mohammad Delsoz, Yeganeh Madadi, Muhammad Elahi, Amr Hassan, Hashem Abu Serhan, Onur Inam, Luis Hernandez, Hao Chen, Sang Tran, Wuqaas Munir, Alaa Abd-Alrazaq, Siamak Yousefi. Originally published in JMIR Formative Research (https://formative.jmir.org), 22.03.2024.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
Flow diagram of the proposed framework. The model takes input (keyword, inclusion criteria, and categories for classification), and articles are fetched from PubMed based on keyword. The inclusion criteria are fed into the preprocessing module to select the desired articles from the fetched data. A large language model classifies the articles based on the predefined categories. Finally, trend analysis is performed on classified categories.

**Figure 2**
Trend analysis of classified articles: (A) and (B) category-wise analysis for article type and ocular diseases group, respectively, and (C) timewise analysis for automated studies subclass group: image processing techniques, machine learning, and deep learning models. AMD: age-related macular degeneration; CSR: central serous retinopathy; DME: diabetic macular edema; DR: diabetic retinopathy.
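
The category-wise and timewise views in Figure 2 amount to counting classified articles per category and per year. The sketch below shows one plausible way to compute those counts. The record format (publication year, predicted category) and the function names are assumptions for illustration, not the authors' code, and the sample records are made up.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[int, str]  # (publication year, predicted category)


def category_counts(records: Iterable[Record]) -> Counter:
    """Category-wise view: number of articles assigned to each category."""
    return Counter(category for _, category in records)


def yearly_trend(records: Iterable[Record]) -> Dict[str, Dict[int, int]]:
    """Timewise view: per category, article counts keyed by year (ascending)."""
    trend: Dict[str, Counter] = defaultdict(Counter)
    for year, category in records:
        trend[category][year] += 1
    return {cat: dict(sorted(counts.items())) for cat, counts in trend.items()}


# Made-up records for illustration only.
records = [
    (2019, "deep learning"),
    (2020, "deep learning"),
    (2020, "deep learning"),
    (2019, "image processing"),
]
print(category_counts(records)["deep learning"])  # 3
print(yearly_trend(records)["deep learning"])     # {2019: 1, 2020: 2}
```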
