2019 Jan 1;2019:baz095.
doi: 10.1093/database/baz095.

Extraction of chemical-protein interactions from the literature using neural networks and narrow instance representation

Rui Antunes et al. Database (Oxford).

Abstract

The scientific literature contains large amounts of information on genes, proteins, chemicals and their interactions. Extraction and integration of this information in curated knowledge bases help researchers support their experimental results, leading to new hypotheses and discoveries. This is especially relevant for precision medicine, which aims to understand the individual variability across patient groups in order to select the most appropriate treatments. Methods for improved retrieval and automatic relation extraction from biomedical literature are therefore required for collecting structured information from the growing number of published works. In this paper, we follow a deep learning approach for extracting mentions of chemical-protein interactions from biomedical articles, based on various enhancements over our participation in the BioCreative VI CHEMPROT task. A significant aspect of our best method is the use of a simple deep learning model together with a very narrow representation of the relation instances, using only up to 10 words from the shortest dependency path and the respective dependency edges. Bidirectional long short-term memory recurrent networks or convolutional neural networks are used to build the deep learning models. We report the results of several experiments and show that our best model is competitive with more complex sentence representations or network structures, achieving an F1-score of 0.6306 on the test set. The source code of our work, along with detailed statistics, is publicly available.
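The narrow instance representation described in the abstract can be illustrated with a minimal sketch. It assumes the dependency parse is already available as a list of `(head, dependent, label)` edges over token indices; the function names, the toy sentence, its edges, and the rule of simply keeping the first 10 words of the path are illustrative assumptions, not the authors' published implementation.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over the dependency tree, treated as an
    undirected graph. Returns (token indices, edge labels) on the path."""
    neighbours = {}
    for head, dep, label in edges:
        neighbours.setdefault(head, []).append((dep, label))
        neighbours.setdefault(dep, []).append((head, label))

    previous = {source: None}  # node -> (predecessor, label of edge used)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt, label in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = (node, label)
                queue.append(nxt)

    if target not in previous:
        return [], []  # the two mentions are not connected

    # Walk back from the target to the source, then reverse.
    path, labels = [target], []
    node = target
    while previous[node] is not None:
        node, label = previous[node]
        path.append(node)
        labels.append(label)
    path.reverse()
    labels.reverse()
    return path, labels


def narrow_instance(tokens, edges, chemical, protein, max_words=10):
    """Keep at most `max_words` words from the path plus the edges that
    link them. The truncation policy here is an assumption."""
    path, labels = shortest_dependency_path(edges, chemical, protein)
    words = [tokens[i] for i in path][:max_words]
    return words, labels[:max(len(words) - 1, 0)]


# Hypothetical toy sentence and parse, not taken from the CHEMPROT corpus.
tokens = ["Meloxicam", "selectively", "inhibits", "COX-2"]
edges = [(2, 0, "nsubj"), (2, 1, "advmod"), (2, 3, "obj")]
words, labels = narrow_instance(tokens, edges, chemical=0, protein=3)
print(words, labels)
```

In this toy parse the adverb "selectively" lies off the path and is dropped, which is the sense in which the representation is narrow: only the words and dependency edges connecting the two mentions would be passed on to the BiLSTM or CNN model.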

Figures

Figure 1. Example sentence illustrating biochemical entities and their relations.
Figure 2. Example illustrating the dependency structure of a sentence from the CHEMPROT training dataset (PMID 10340919). In this example, we considered the relation between the ‘meloxicam’ chemical mention and the ‘COX’ protein mention. The shortest dependency path (SDP) is highlighted in bold and blue.
Figure 3. Neural network structure.
