Benchmarking Long-Read Assemblers for Genomic Analyses of Bacterial Pathogens Using Oxford Nanopore Sequencing

doi:10.3390/ijms21239161

Int J Mol Sci. 2020 Dec 1;21(23):9161.

Zhao Chen¹, David L Erickson¹, Jianghong Meng¹

Affiliation

¹ Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA.

PMID: 33271875
PMCID: PMC7730629

Zhao Chen et al. Int J Mol Sci. 2020.

Abstract

Oxford Nanopore sequencing can be used to obtain complete bacterial genomes. However, Oxford Nanopore long reads have higher error rates than Illumina short reads. Long-read assemblers based on a variety of assembly algorithms have been developed to overcome this deficiency, but they have not been benchmarked for genomic analyses of bacterial pathogens using Oxford Nanopore long reads. In this study, six long-read assemblers (Canu, Flye, Miniasm/Racon, Raven, Redbean, and Shasta) were benchmarked using Oxford Nanopore long reads of bacterial pathogens. Ten species were tested with mediocre- and low-quality simulated reads, and 10 species were tested with real reads. Raven was the most robust assembler, obtaining complete and accurate genomes. All Miniasm/Racon and Raven assemblies of mediocre-quality reads provided accurate antimicrobial resistance (AMR) profiles, while the Raven assembly of *Klebsiella variicola* with low-quality reads was the only assembly of low-quality reads with an accurate AMR profile among all assemblers and species. All assemblers performed well in predicting virulence genes from mediocre-quality and real reads, whereas only the Raven assemblies of low-quality reads had accurate numbers of virulence genes. For multilocus sequence typing (MLST), Miniasm/Racon was the most effective assembler for mediocre-quality reads, while only the Raven assemblies of *Escherichia coli* O157:H7 and *K. variicola* with low-quality reads showed positive MLST results. Miniasm/Racon and Raven were the best performers for MLST using real reads. The Miniasm/Racon and Raven assemblies showed accurate phylogenetic inference. For the pan-genome analyses, Raven was the strongest assembler for simulated reads, whereas Miniasm/Racon and Raven performed best for real reads. Overall, the most robust and accurate assembler was Raven, closely followed by Miniasm/Racon.

Keywords: Oxford Nanopore sequencing; bacterial pathogen; benchmarking; genome assembly; genomic analysis; long-read assembler; long-read sequencing; whole-genome sequencing.

Conflict of interest statement

The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

**Figure 1**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Listeria monocytogenes* EGD-e with mediocre- and low-quality reads using different long-read assemblers, as aligned to 30 distantly related *L. monocytogenes* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy (number of SNPs > 500) and compared with the reference genome.

**Figure 2**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Listeria monocytogenes* EGD-e with mediocre- and low-quality reads using different long-read assemblers, as aligned to 30 distantly related *L. monocytogenes* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy (number of SNPs > 500) and 30 closely related *L. monocytogenes* strains selected based on the SNP (number of SNPs < 500) and core-genome multilocus sequence typing (cgMLST) (different alleles < 500) strategies and compared with the reference genome.

**Figure 3**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Salmonella* Bareilly CFSAN000189 with real reads using different long-read assemblers, as aligned to 30 closely related *S.* Bareilly strains selected based on the single-nucleotide polymorphisms (SNPs) (number of SNPs < 500) and whole-genome multilocus sequence typing (wgMLST) (different alleles < 500) strategies and compared with the reference genome.

**Figure 4**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Campylobacter jejuni* NCTC 11168 with real reads using different long-read assemblers, as aligned to 11 closely related *C. jejuni* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy (number of SNPs < 500) and compared with the reference genome.

**Figure 5**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Campylobacter jejuni* NCTC 11168 with real reads using different long-read assemblers, as aligned to 11 closely related (number of SNPs < 500) and 20 distantly related (number of SNPs > 500) *C. jejuni* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy and compared with the reference genome.

**Figure 6**
Whole-genome phylogenetic tree of Oxford Nanopore long-read assemblies of *Campylobacter jejuni* NCTC 11168 with real reads using different long-read assemblers, as aligned to 11 closely related (number of SNPs < 500) and 20 distantly related (number of SNPs > 500) *C. jejuni* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy, together with 20 *Campylobacter* strains of other species, and compared with the reference genome.

**Figure 7**
Pan genomes of Oxford Nanopore long-read assemblies of *Pseudomonas aeruginosa* PAO1 with mediocre- (a) and low-quality (b) reads using different long-read assemblers and 20 distantly related *P. aeruginosa* strains selected based on the single-nucleotide polymorphisms (SNPs) strategy (number of SNPs > 500) and compared with the reference genome.

**Figure 8**
Pan genomes of Oxford Nanopore long-read assemblies of *Escherichia coli* O157:H7 CFSAN076619 with real reads using different long-read assemblers and six closely related *E. coli* O157:H7 strains selected based on the whole-genome multilocus sequence typing (wgMLST) strategy (different alleles < 500) and compared with the reference genome.

Grants and funding

U01FD001418/U.S. Food and Drug Administration
