SEK: sparsity exploiting k-mer-based estimation of bacterial community composition
- PMID: 24812337
- DOI: 10.1093/bioinformatics/btu320
Abstract
Motivation: Estimation of bacterial community composition from a high-throughput sequenced sample is an important task in metagenomics applications. As the sample sequence data typically harbors reads of variable lengths and different levels of biological and technical noise, accurate statistical analysis of such data is challenging. Currently popular estimation methods are typically time-consuming in a desktop computing environment.
Results: Using sparsity-enforcing methods from the general sparse signal processing field (such as compressed sensing), we derive a solution to the community composition estimation problem by simultaneously assigning all sample reads to a pre-processed reference database. A general statistical model based on kernel density estimation techniques is introduced for the assignment task, and the model solution is obtained using convex optimization tools. We also design a greedy algorithm that yields a faster solution. Our approach offers a reasonably fast community composition estimation method, which is shown to be more robust to input data variation than a recently introduced related method.
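To illustrate the general idea of k-mer-based composition estimation described in the Results, the following is a minimal sketch and not the SEK algorithm itself. It assumes that each reference sequence and the pooled sample reads are summarized as normalized k-mer frequency vectors, and that the sample vector is approximately a non-negative mixture of the reference vectors. The mixture weights are fitted with a simple multiplicative-update rule for non-negative least squares, swapped in here for the paper's convex optimization and greedy solvers. The function names `kmer_profile` and `estimate_composition`, the choice `k=3`, and the iteration count are all hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_profile(seqs, k):
    """Return the normalized k-mer frequency vector of a list of sequences.

    Only k-mers over the alphabet ACGT are counted; k-mers containing
    other symbols (for example N) are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def estimate_composition(references, reads, k=3, iters=500):
    """Estimate mixture weights of references from pooled sample reads.

    Assumption (not from the paper): the sample k-mer profile y is modeled
    as y ~ A x with x >= 0, where the columns of A are reference profiles.
    x is fitted by multiplicative updates, which keep every entry
    non-negative, and is then rescaled to sum to one.
    """
    names = list(references)
    cols = [kmer_profile([references[n]], k) for n in names]
    y = kmer_profile(reads, k)
    x = [1.0 / len(names)] * len(names)  # strictly positive start
    for _ in range(iters):
        # Current model fit A x, computed once per iteration.
        fit = [sum(c[j] * xi for c, xi in zip(cols, x)) for j in range(len(y))]
        for i, c in enumerate(cols):
            num = sum(cj * yj for cj, yj in zip(c, y))    # (A^T y)_i
            den = sum(cj * fj for cj, fj in zip(c, fit))  # (A^T A x)_i
            x[i] = x[i] * num / den if den > 0 else 0.0
    total = sum(x)
    return {n: (xi / total if total else 0.0) for n, xi in zip(names, x)}
```

Calling `estimate_composition` with a dictionary of named reference sequences and a list of read strings returns a dictionary of estimated proportions per reference. A real pipeline would use a much larger reference database and a longer k, and would need a solver that scales to that size.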
Availability and implementation: A platform-independent Matlab implementation of the method is freely available at http://www.ee.kth.se/ctsoftware; source code that does not require access to Matlab is currently being tested and will be made available later through the above Web site.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Similar articles
- ARK: Aggregation of Reads by K-Means for Estimation of Bacterial Community Composition. PLoS One. 2015 Oct 23;10(10):e0140644. doi: 10.1371/journal.pone.0140644. PMID: 26496191. Free PMC article.
- Quikr: a method for rapid reconstruction of bacterial communities via compressive sensing. Bioinformatics. 2013 Sep 1;29(17):2096-102. doi: 10.1093/bioinformatics/btt336. Epub 2013 Jun 20. PMID: 23786768.
- Phylogeny-based classification of microbial communities. Bioinformatics. 2014 Feb 15;30(4):449-56. doi: 10.1093/bioinformatics/btt700. Epub 2013 Dec 24. PMID: 24369151.
- Reference databases for taxonomic assignment in metagenomics. Brief Bioinform. 2012 Nov;13(6):682-95. doi: 10.1093/bib/bbs036. Epub 2012 Jul 10. PMID: 22786784. Review.
- High throughput sequencing methods for microbiome profiling: application to food animal systems. Anim Health Res Rev. 2012 Jun;13(1):40-53. doi: 10.1017/S1466252312000126. PMID: 22853944. Review.
Cited by
- Assessing taxonomic metagenome profilers with OPAL. Genome Biol. 2019 Mar 4;20(1):51. doi: 10.1186/s13059-019-1646-y. PMID: 30832730. Free PMC article.
- Combining 16S rRNA gene variable regions enables high-resolution microbial community profiling. Microbiome. 2018 Jan 26;6(1):17. doi: 10.1186/s40168-017-0396-x. PMID: 29373999. Free PMC article.
- Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software. Nat Methods. 2017 Nov;14(11):1063-1071. doi: 10.1038/nmeth.4458. Epub 2017 Oct 2. PMID: 28967888. Free PMC article.
- ARK: Aggregation of Reads by K-Means for Estimation of Bacterial Community Composition. PLoS One. 2015 Oct 23;10(10):e0140644. doi: 10.1371/journal.pone.0140644. PMID: 26496191. Free PMC article.