BMC Genomics. 2018 Aug 2;19(1):577.
doi: 10.1186/s12864-018-4949-6.

Nanopore sequencing and full genome de novo assembly of human cytomegalovirus TB40/E reveals clonal diversity and structural variations

Timokratis Karamitros et al. BMC Genomics.

Abstract

Background: Human cytomegalovirus (HCMV) has a double-stranded DNA genome of approximately 235 kbp that is structurally complex, including extended GC-rich repeated regions. Genomic recombination events are frequent in HCMV cultures but have also been observed in vivo. Thus, assembling whole HCMV genomes from technologies that produce sequences shorter than 500 bp is technically challenging. Here we improved the reconstruction of full HCMV genomes by means of a hybrid, de novo genome-assembly bioinformatics pipeline applied to data generated by the recently released MinION MkI B sequencer from Oxford Nanopore Technologies.

Results: The MinION run of the HCMV (strain TB40/E) library resulted in ~47,000 reads from a single R9 flowcell and in ~100× average read depth across the virus genome. We developed a novel, self-correcting bioinformatics algorithm to assemble the pooled HCMV genomes in three stages. In the first stage, long contigs (N50 = 21,892) of lower accuracy were reconstructed. In the second stage, short contigs (N50 = 5686) of higher accuracy were assembled. In the final stage, the high-quality contigs served as templates for the correction of the longer contigs, resulting in a high-accuracy, full genome assembly (N50 = 41,056). We were able to reconstruct a single representative haplotype without employing any scaffolding steps. The majority (98.8%) of the genomic features from the reference strain were accurately annotated on this full genome construct. Our method also allowed the detection of multiple alternative sub-genomic fragments and non-canonical structures, suggesting rearrangement events between the unique (UL/US) and the repeated (T/IRL/S) genomic regions.
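
The N50 values quoted for each assembly stage follow the standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the summed
    length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("need at least one contig length")


# Hypothetical toy lengths, not data from the study.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # total 300, half 150: 100 + 80 reaches it, so N50 is 80
```

Because N50 depends only on the length distribution, a higher value indicates a more contiguous assembly but says nothing about base-level accuracy, which is why the stages above are also described by their relative accuracy.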

Conclusions: Third generation high-throughput sequencing technologies can accurately reconstruct full-length HCMV genomes including their low-complexity and highly repetitive regions. Full-length HCMV genomes could prove crucial in understanding the genetic determinants and viral evolution underpinning drug resistance, virulence and pathogenesis.

Keywords: Human cytomegalovirus; MinION; Mutation; Nanopore; Quasi-species; Recombination; Variable number tandem repeats; de novo assembly.

Conflict of interest statement

The authors declare that they have no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
De novo assembled genome characteristics. a: De novo assembled TB40/E clone Nano aligned to TB40/E clone Lisa reference (top). Forward alignment blocks are in red and reverse in purple. The deleted region of UL144 and UL145 genes is in yellow. b: GC % heat-map; white regions correspond to lower (min 34.6%) and dark-red regions to higher (max 78.4%) GC content. c: read depth across the de novo assembled construct (confirmative re-mapping of raw reads); the red line represents the average depth (100.35). d: annotated map showing the basic genome segments
Fig. 2
Genome-wide similarity comparison of the de novo assembled HCMV TB40/E genome (vertical) with nine representative HCMV strains (horizontal). Forward alignments are in blue, reverse alignments are in red
Fig. 3
Full genome phylogenetic analysis of the de novo assembled clone Nano (in red) and 29 representative strains. Two Chimpanzee cytomegalovirus (Panine Herpesvirus) strains were used as tree roots (orange). HCMV TB40/E strains are shown in cyan
Fig. 4
De novo assembled clone Nano annotation. Red arrows represent terminal and internal repeated sequences, green arrows represent UL and US regions, purple arrows indicate the 3 repeats of the “a” sequence, blue arrows represent the annotated genes and light blue arrows correspond to miscellaneous features. UL144 and UL145 are missing due to a 1348 bp deletion at position 177,379 within the UL region, which corresponds to coordinates 181,942–183,290 of clone Lisa
Fig. 5
Structural variations of alternative contigs compared to the full genome construct. Green boxes represent sequence-divergent contigs sharing the same synteny with the full genome construct. Red (relocations), purple (local misassemblies) and orange (inversion) boxes represent misaligned contigs, which indicate the presence of viral quasi-species in the culture. Reads supporting some major rearrangements are colored in the IGV screenshots at the bottom
Fig. 6
Genome-wide mutation analysis. a: Gene-length-normalized SNP rate. Non-synonymous mutations (missense, stop-gains, stop-losses) are shown in pink, while synonymous mutations are shown in cyan. b: Total synonymous / non-synonymous SNP distribution. c: Cumulative amino-acid changes (heat-map). Reference amino acids are shown on the vertical axis. The synonymous mutations are distributed along the grey diagonal
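
The GC % heat-map in Fig. 1b summarizes GC content along the genome. A minimal sketch of how such a per-window profile could be computed is given below. It is an illustration under stated assumptions, not the method used to draw the figure: the helper names `gc_percent` and `windowed_gc`, the window and step sizes, and the toy sequence are all hypothetical.

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, GC %) pairs for full-length windows along the sequence.

    A trailing stretch shorter than `window` is ignored in this sketch.
    """
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence, not data from the study.
for start, gc in windowed_gc("AAGGCCTT", window=4, step=2):
    print(start, gc)
```

Smaller windows resolve short GC-rich repeats more sharply at the cost of a noisier profile, so the window size is a presentation choice rather than a property of the genome.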
