BMC Genomics. 2020 Jun 16;21(1):407.
doi: 10.1186/s12864-020-06818-1.

Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions

Lizhi Zhou et al. BMC Genomics. 2020.

Abstract

Background: The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21(DE3) derivative strain and has been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain was incomplete, with missing gene names and miscellaneous RNAs, as well as uncorrected annotations of some pseudogenes. Here, we performed a systematic reannotation of the ER2566 genome by combining multiple annotation tools with manual revision to provide a comprehensive understanding of the E. coli ER2566 strain, and used high-throughput sequencing to explore how the strain adapts under external pressure.

Results: The reannotation included noteworthy corrections to all protein-coding genes, led to the exclusion of 190 hypothetical genes or pseudogenes, and resulted in the addition of 237 coding sequences, 230 miscellaneous noncoding RNAs, and 2 tRNAs. In addition, we manually examined all 194 pseudogenes in the RefSeq annotation and directly identified 123 (63%) as coding genes. We then used whole-genome sequencing and high-throughput RNA sequencing to assess mutational adaptations under consecutive subculture or overexpression burden. Whereas no mutations were detected in response to consecutive subculture, overexpression of the human papillomavirus type 16 capsid led to the identification of a mutation (position 1,094,824, within the 3' non-coding region) located 19 bp from the lacI gene in the transcribed RNA; this mutation was not detected at the genomic level by Sanger sequencing.

Conclusion: The ER2566 strain is used by both the general scientific community and the biotechnology industry. Reannotation of the E. coli ER2566 strain not only improved the RefSeq data but also uncovered a key site that might be involved in the transcription and translation of genes encoding the lactose operon repressor. We propose that our pipeline might offer a universal method for the reannotation of other bacterial genomes with high speed and accuracy. This study might facilitate a better understanding of gene function for the ER2566 strain under external burden and provides more clues for engineering bacteria for biotechnological applications.

Keywords: Engineer bacteria; Escherichia coli ER2566; Genome reannotation; Transcriptome sequencing.
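
The Results locate a variant by its distance from an annotated gene (19 bp from lacI). As an illustration only, the minimal Python sketch below shows one way such a distance could be computed from a variant position and a list of gene intervals. The names `Gene`, `distance_to_gene`, and `nearest_gene`, and all coordinates, are hypothetical. They are not taken from the ER2566 annotation or from the authors' pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Return 0 if pos lies inside the gene, else the gap in bp to its nearer end."""
    if gene.start <= pos <= gene.end:
        return 0
    return gene.start - pos if pos < gene.start else pos - gene.end


def nearest_gene(pos: int, genes: List[Gene]) -> Optional[Tuple[Gene, int]]:
    """Return the closest gene and its distance in bp, or None for an empty list."""
    if not genes:
        return None
    best = min(genes, key=lambda g: distance_to_gene(pos, g))
    return best, distance_to_gene(pos, best)


# Toy annotation with made-up coordinates (NOT from the ER2566 genome).
TOY_GENES = [
    Gene("geneA", 100, 400),
    Gene("geneB", 450, 900),
    Gene("geneC", 1200, 1500),
]

gene, gap = nearest_gene(919, TOY_GENES)
print(f"{gene.name}: {gap} bp")  # prints "geneB: 19 bp"
```

A linear scan is enough for a handful of genes. For a whole-genome annotation, a sorted list with binary search or an interval tree would be a more reasonable choice.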

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Flowchart depicting the pipeline and methods used for bacterial genome reannotation of the E. coli strain ER2566
Fig. 2
Examples of the differences between the original RefSeq annotation and our reannotation. a In the reannotation, one pseudogene (RS16270) was identified as two genes, insA and insB, which show strong homology to the insertion element protein, IS1. b In the reannotation, two pseudogenes were re-identified as two genes (lacZ1 and lacZ2), whereas the hypothetical protein was reannotated and shown to be highly homologous with the DNA-directed RNA polymerase gene ECBD_2906 from E. coli strain BL21-DE3
Fig. 3
Comparison between the BL21(DE3) genome and the ER2566 genome. From the outside inward, the outermost two rings, representing the plus and minus strands respectively, show features extracted from the BL21(DE3) genome GenBank file (GenBank: CP001509.3); the next ring shows the positions of BLAST hits between the BL21(DE3) genome and the ER2566 genome detected by BLASTN. The height of each line in the third ring is proportional to the percent identity of the hit, and overlapping hits render as darker lines. The next two rings show GC content and GC skew
Fig. 4
Flowchart of variant calling, combining read mapping and de novo assembly
Fig. 5
RNA-seq for variant calling under pressure from overexpression. a) The experimental design. Each group (B37, without plasmid; Y37, with pTO-T7 plasmid overexpressed) had three biological replicates. b) Visualization of BAM files of the B37 (left panel) and Y37 (right panel) samples in the Integrative Genomics Viewer. Based on the reannotation, one mutation (C > T) was identified at position 1,094,824, located within the 3′ non-coding region of the transcription factor gene lacI. c) Mutation detected by Sanger sequencing of the B37 and Y37 genomic samples
