TEES 2.2: Biomedical Event Extraction for Diverse Corpora
- PMID: 26551925
- PMCID: PMC4642046
- DOI: 10.1186/1471-2105-16-S16-S4
Abstract
Background: The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks.
Results: The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event types, allowing immediate adaptation to the various subtasks and leading to first place in four out of eight tasks. In this paper, the system for the automated learning of annotation rules is further enhanced to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, making a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets.
Conclusions: The TEES system was introduced for the BioNLP'09 Shared Task and has since demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented.
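The feature-importance analysis mentioned in the Results can be illustrated with a minimal sketch. The abstract does not say which scikit-learn ensemble method is used, so the choice of `ExtraTreesClassifier` here is an assumption, as are the function name `rank_feature_importances`, the toy data, and the feature names, which are hypothetical and not taken from the TEES feature sets.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier


def rank_feature_importances(X, y, feature_names, n_estimators=100, seed=0):
    """Fit a tree ensemble and return (name, importance) pairs, highest first.

    Assumption: ExtraTreesClassifier stands in for the unnamed
    scikit-learn ensemble method referred to in the abstract.
    """
    model = ExtraTreesClassifier(n_estimators=n_estimators, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]  # indices sorted by descending importance
    return [(feature_names[i], float(importances[i])) for i in order]


# Toy data (hypothetical): only the first column determines the label.
rng = np.random.RandomState(0)
X = rng.rand(200, 4)
y = (X[:, 0] > 0.5).astype(int)
names = ["informative", "noise_a", "noise_b", "noise_c"]  # hypothetical names

ranking = rank_feature_importances(X, y, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy data, the column that determines the label should rank first, and the importances sum to one. Real event-extraction feature sets are sparse and far larger, so a practical analysis would likely work on sparse matrices and aggregate importances by feature group.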
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-1.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-2.gif)
![Figure 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-3.gif)
![Figure 4](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-4.gif)
![Figure 5](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-5.gif)
![Figure 6](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4642046/bin/1471-2105-16-S16-S4-6.gif)