Estimating overannotation across prokaryotic genomes using BLAST+, UBLAST, LAST and BLAT
- PMID: 25228073
- PMCID: PMC4180129
- DOI: 10.1186/1756-0500-7-651
Abstract
Background: As the number of genomes in public databases increases, it becomes more important to be able to quickly choose the best annotated genomes for further analyses in comparative genomics and evolution. A proxy for annotation quality is the estimation of overannotation by comparing annotated coding genes against the SwissProt database. NCBI's BLAST (BLAST+) is the usual software of choice for comparing these sequences. Newer programs that run in a fraction of the time required by BLAST+ might miss matches that BLAST+ would find. However, their results might still be useful for calculating overannotation. We thus decided to compare the overannotation estimates obtained with three such programs, UBLAST, LAST and the BLAST-Like Alignment Tool (BLAT), and to test non-redundant versions of the SwissProt database to reduce the number of comparisons necessary.
Findings: We found that all three programs, UBLAST, LAST and BLAT, tend to produce overannotation estimates similar to those obtained with BLAST+. As would be expected, results varied the most from those obtained with BLAST+ in genomes with fewer proteins matching sequences in the SwissProt database. UBLAST was the fastest program and showed the smallest variation from the results obtained using BLAST+. Reduced SwissProt databases did not seem to affect the results much, but the reduction in running time was modest compared to that obtained by switching to UBLAST, LAST, or BLAT.
Conclusions: Although the faster programs miss sequence matches otherwise found by NCBI's BLAST, the overannotation estimates are very similar, and thus these programs can be used with confidence for this task.
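The abstract does not give the exact overannotation formula, so the following is only a minimal sketch of the general idea: count the annotated proteins that have no acceptable SwissProt match, then compare that fraction between two search programs. It assumes hits are available in BLAST-style 12-column tabular output (query ID in column 1, E-value in column 11). The function name `unmatched_fraction`, the `max_evalue` threshold of 1e-6, and the toy data are illustrative assumptions, not values or methods taken from the paper.

```python
import csv
import io


def unmatched_fraction(protein_ids, hits_tsv, max_evalue=1e-6):
    """Fraction of annotated proteins with no hit at or below max_evalue.

    Hypothetical helper: a crude proxy for overannotation, not the
    paper's actual estimator. hits_tsv is any iterable of lines in
    BLAST-style 12-column tabular format.
    """
    proteins = set(protein_ids)
    if not proteins:
        raise ValueError("no annotated proteins supplied")
    matched = set()
    for row in csv.reader(hits_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        query, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            matched.add(query)
    return len(proteins - matched) / len(proteins)


# Toy data (assumed): four annotated proteins and two hit tables.
proteins = ["p1", "p2", "p3", "p4"]
reference_hits = (
    "p1\tsp|A\t90.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "p2\tsp|B\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t90\n"
    "p3\tsp|C\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
)
faster_hits = "p1\tsp|A\t90.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200\n"

reference_estimate = unmatched_fraction(proteins, io.StringIO(reference_hits))
faster_estimate = unmatched_fraction(proteins, io.StringIO(faster_hits))
print(reference_estimate, faster_estimate, faster_estimate - reference_estimate)
```

In this sketch a faster program that misses matches inflates the unmatched fraction, which is why the difference between the two estimates is the quantity worth inspecting per genome.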
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4180129/bin/13104_2013_3190_Fig1_HTML.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4180129/bin/13104_2013_3190_Fig2_HTML.gif)
![Figure 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/4180129/bin/13104_2013_3190_Fig3_HTML.gif)