PageRank without hyperlinks: reranking with PubMed related article networks for biomedical text retrieval
- PMID: 18538027
- PMCID: PMC2442104
- DOI: 10.1186/1471-2105-9-270
Abstract
Background: Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed® search interface, a MEDLINE® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web.
Results: We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics.
Conclusion: The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain.
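The reranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how link-analysis scores were combined with retrieval scores, so the min-max normalization and linear interpolation below are assumptions, as are the damping factor of 0.85 (the conventional default), the interpolation weight, the function names `pagerank`, `min_max`, and `rerank`, and the toy graph with placeholder document labels.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a directed graph given as adjacency lists."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two score sources are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def rerank(retrieval_scores: Dict[str, float],
           link_scores: Dict[str, float],
           weight: float = 0.8) -> List[str]:
    """Order retrieved documents by an interpolated score.

    `weight` is the share given to the retrieval engine (an assumed
    combination scheme, not taken from the abstract).
    """
    r = min_max(retrieval_scores)
    g = min_max(link_scores)
    combined = {doc: weight * r[doc] + (1.0 - weight) * g.get(doc, 0.0)
                for doc in r}
    return sorted(combined, key=combined.get, reverse=True)


# Toy related-article network: each citation points to its related citations.
related = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
# Hypothetical scores from an off-the-shelf retrieval engine.
engine_scores = {"A": 2.0, "B": 1.9, "C": 1.0, "D": 0.5}

print(rerank(engine_scores, pagerank(related), weight=0.8))
```

With `weight=1.0` the ranking reduces to the retrieval engine's own order, so the interpolation weight controls how much the network structure is allowed to reshuffle the initial results.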
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-1.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-2.gif)
![Figure 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-3.gif)
![Figure 4](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-4.gif)
![Figure 5](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-5.gif)
![Figure 6](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-6.gif)
![Figure 7](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-7.gif)
![Figure 8](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-8.gif)
![Figure 9](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-9.gif)
![Figure 10](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/2442104/bin/1471-2105-9-270-10.gif)
References
- Page L, Brin S, Motwani R, Winograd T. The PageRank Citation Ranking: Bringing Order to the Web. Stanford Digital Library Working Paper SIDL-WP-1999-0120. Stanford University; 1999.
- Kleinberg JM. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM. 1999;46:604–632.
- Hersh WR, Cohen A, Yang J, Bhupatiraju R, Roberts P, Hearst M. TREC 2005 Genomics Track Overview. Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), Gaithersburg, Maryland. 2005.
- Amati G, van Rijsbergen CJ. Probabilistic Models of Information Retrieval Based on Measuring the Divergence from Randomness. ACM Transactions on Information Systems. 2002;20:357–389.