CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes
- PMID: 25977477
- PMCID: PMC4484387
- DOI: 10.1101/gr.186072.114
Abstract
Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of "marker" genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.
© 2015 Parks et al.; Published by Cold Spring Harbor Laboratory Press.
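The abstract describes estimating completeness and contamination from marker genes grouped by collocation, but it gives no formulas. The following is a minimal sketch of one way such estimates could be computed, not CheckM's actual implementation. It assumes each collocated marker set contributes equally: completeness is the average fraction of markers found per set, and contamination is the average number of surplus marker copies per set. The function name `estimate_quality` and the marker identifiers are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def estimate_quality(
    marker_sets: List[Set[str]],
    observed_markers: Iterable[str],
) -> Tuple[float, float]:
    """Return (completeness %, contamination %) for a single genome.

    marker_sets: groups of marker genes assumed to be collocated
        (hypothetical identifiers, for illustration only).
    observed_markers: one entry per marker gene hit found in the genome;
        a marker found twice appears twice.
    """
    if not marker_sets or any(len(s) == 0 for s in marker_sets):
        raise ValueError("marker_sets must be non-empty and contain no empty sets")

    counts = Counter(observed_markers)
    completeness = 0.0
    contamination = 0.0

    for marker_set in marker_sets:
        # Fraction of this set's markers seen at least once.
        present = sum(1 for gene in marker_set if counts[gene] >= 1)
        # Surplus copies: a marker seen N times contributes N - 1.
        surplus = sum(counts[gene] - 1 for gene in marker_set if counts[gene] > 1)
        completeness += present / len(marker_set)
        contamination += surplus / len(marker_set)

    # Average over sets so that each collocated set carries equal weight.
    n_sets = len(marker_sets)
    return 100.0 * completeness / n_sets, 100.0 * contamination / n_sets


# Two hypothetical collocated sets; marker "C" is found twice, "D" is missing.
example_sets = [{"A", "B"}, {"C", "D"}]
example_hits = ["A", "B", "C", "C"]
completeness_pct, contamination_pct = estimate_quality(example_sets, example_hits)
print(completeness_pct, contamination_pct)  # prints 75.0 25.0
```

Averaging per set rather than per gene is the design choice illustrated here: losing or duplicating one contig that carries several neighboring markers then counts once instead of several times.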