Automated Category and Trend Analysis of Scientific Articles on Ophthalmology Using Large Language Models: Development and Usability Study
- PMID: 38517457
- PMCID: PMC10998173
- DOI: 10.2196/52462
Abstract
Background: In this paper, we present an automated method for classifying scientific articles using large language models (LLMs).
Objective: The aim of this study is to evaluate how well various LLMs classify scientific ophthalmology papers based on their textual content.
Methods: We developed a model based on natural language processing techniques, including advanced LLMs, to process and analyze the textual content of scientific papers. Specifically, we used zero-shot learning LLMs and compared Bidirectional and Auto-Regressive Transformers (BART) and its variants with Bidirectional Encoder Representations from Transformers (BERT) and its variants, such as DistilBERT, SciBERT, PubMedBERT, and BioBERT. To evaluate the LLMs, we compiled a data set (retinal diseases [RenD]) of 1000 ocular disease-related articles, which were annotated by a panel of 6 specialists into 19 distinct categories. In addition to classifying the articles, we analyzed the classified groups to identify patterns and trends in the field.
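As a rough illustration of the zero-shot step described in the Methods, the sketch below assigns one category to an article's text by picking the highest-scoring candidate label. It is not the authors' implementation. The category names, the helper names `build_bart_classifier` and `classify_article`, and the `facebook/bart-large-mnli` checkpoint are assumptions chosen for illustration; the abstract does not list the 19 categories or the exact checkpoints used.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical category names; the abstract does not list the study's 19 categories.
CANDIDATE_LABELS = [
    "glaucoma",
    "diabetic retinopathy",
    "age-related macular degeneration",
]


def build_bart_classifier():
    """Create a Hugging Face zero-shot pipeline.

    Assumption: a BART model fine-tuned on MNLI (facebook/bart-large-mnli).
    The import is lazy because the model weights are downloaded on first use.
    """
    from transformers import pipeline

    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def classify_article(
    text: str,
    labels: Sequence[str],
    classifier: Callable[..., dict],
) -> Tuple[str, float]:
    """Return the best-scoring label and its score for one article.

    `classifier` is any callable that accepts (text, candidate_labels=...)
    and returns a dict with parallel "labels" and "scores" lists, which is
    the shape the Hugging Face zero-shot pipeline produces.
    """
    result = classifier(text, candidate_labels=list(labels))
    # Take the argmax explicitly instead of relying on the output order.
    best = max(range(len(result["scores"])), key=lambda i: result["scores"][i])
    return result["labels"][best], result["scores"][best]
```

With the `transformers` package installed, `classify_article(abstract_text, CANDIDATE_LABELS, build_bart_classifier())` would return a label and score for a single abstract. Passing the classifier in as an argument keeps the function usable with any of the compared model variants.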
Results: The classification results demonstrate the effectiveness of LLMs in categorizing a large number of ophthalmology papers without human intervention. The model achieved a mean accuracy of 0.86 and a mean F1-score of 0.85 based on the RenD data set.
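For readers unfamiliar with the two reported metrics, the following minimal sketch computes accuracy and an F1-score averaged over categories from expert and predicted labels. It assumes that "mean F1-score" refers to a macro average across categories, which the abstract does not state; the function names and the toy labels are illustrative only and are not data from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of articles whose predicted category matches the expert label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-category F1-scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); this is 0 when the category has no true positives.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only (not from the RenD data set).
expert = ["glaucoma", "glaucoma", "cataract", "cataract"]
predicted = ["glaucoma", "cataract", "cataract", "cataract"]
print(accuracy(expert, predicted))
print(macro_f1(expert, predicted))
```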
Conclusions: The proposed framework achieves notable improvements in both accuracy and efficiency. Its application in the domain of ophthalmology showcases its potential for knowledge organization and retrieval. We performed a trend analysis that enables researchers and clinicians to easily categorize and retrieve relevant papers, saving time and effort in literature review and information gathering as well as identification of emerging scientific trends within different disciplines. Moreover, the extendibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.
Keywords: BART; BERT; Bidirectional and Auto-Regressive Transformers; LLM; bidirectional encoder representations from transformers; large language model; ophthalmology; text classification; trend analysis.
©Hina Raja, Asim Munawar, Nikolaos Mylonas, Mohammad Delsoz, Yeganeh Madadi, Muhammad Elahi, Amr Hassan, Hashem Abu Serhan, Onur Inam, Luis Hernandez, Hao Chen, Sang Tran, Wuqaas Munir, Alaa Abd-Alrazaq, Siamak Yousefi. Originally published in JMIR Formative Research (https://formative.jmir.org), 22.03.2024.
Conflict of interest statement
Conflicts of Interest: None declared.
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10998173/bin/formative_v8i1e52462_fig1.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/10998173/bin/formative_v8i1e52462_fig2.gif)