Microb Genom. 2023 Jun;9(6):mgen001030.
doi: 10.1099/mgen.0.001030.

Noise reduction strategies in metagenomic chromosome confirmation capture to link antibiotic resistance genes to microbial hosts

Gregory E McCallum et al. Microb Genom. 2023 Jun.

Abstract

The gut microbiota is a reservoir for antimicrobial resistance genes (ARGs). With current sequencing methods, it is difficult to assign ARGs to their microbial hosts, particularly if these ARGs are located on plasmids. Metagenomic chromosome conformation capture approaches (meta3C and Hi-C) have recently been developed to link bacterial genes to phylogenetic markers, thus potentially allowing the assignment of ARGs to their hosts on a microbiome-wide scale. Here, we generated a meta3C dataset of a human stool sample and used previously published meta3C and Hi-C datasets to investigate bacterial hosts of ARGs in the human gut microbiome. Sequence reads mapping to repetitive elements were found to cause problematic noise in, and may importantly skew interpretation of, meta3C and Hi-C data. We provide a strategy to improve the signal-to-noise ratio by discarding reads that map to insertion sequence elements and to the end of contigs. We also show the importance of using spike-in controls to quantify whether the cross-linking step in meta3C and Hi-C protocols has been successful. After filtering to remove artefactual links, 87 ARGs were assigned to their bacterial hosts across all datasets, including 27 ARGs in the meta3C dataset we generated. We show that commensal gut bacteria are an important reservoir for ARGs, with genes coding for aminoglycoside and tetracycline resistance being widespread in anaerobic commensals of the human gut.

Keywords: Hi-C; antibiotic resistance genes; antimicrobial resistance; gut microbiome; gut microbiota; meta3C.
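
The contig-end filter mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Alignment` type, the function names, and the rule that a pair is dropped when either read lies near a contig end are assumptions made for the example. The 500 nt margin is taken from the figure captions. Filtering of reads that map to insertion sequence elements is not shown.

```python
from typing import Dict, List, NamedTuple, Tuple


class Alignment(NamedTuple):
    """Hypothetical minimal record of where one read of a pair aligned."""
    contig: str
    start: int  # 0-based position of the first aligned base
    end: int    # position one past the last aligned base


def is_intercontig(a: Alignment, b: Alignment) -> bool:
    # A pair is "intercontig" when its two reads map to different contigs.
    return a.contig != b.contig


def near_contig_end(aln: Alignment, contig_length: int, margin: int = 500) -> bool:
    # True if any part of the alignment falls in the first or last `margin` nt.
    return aln.start < margin or aln.end > contig_length - margin


def filter_pairs(
    pairs: List[Tuple[Alignment, Alignment]],
    contig_lengths: Dict[str, int],
    margin: int = 500,
) -> List[Tuple[Alignment, Alignment]]:
    """Keep intercontig pairs in which neither read maps near a contig end.

    Assumption: a pair is discarded if EITHER read is within `margin` nt
    of the end of its contig.
    """
    kept = []
    for a, b in pairs:
        if not is_intercontig(a, b):
            continue
        if near_contig_end(a, contig_lengths[a.contig], margin):
            continue
        if near_contig_end(b, contig_lengths[b.contig], margin):
            continue
        kept.append((a, b))
    return kept
```

Comparing `len(filter_pairs(pairs, contig_lengths))` with the number of intercontig pairs before filtering gives the kind of before/after proportion plotted in Fig. 8.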

Conflict of interest statement

The authors have no conflicts of interest to report.

Figures

Fig. 1.
Metagenomic chromosome conformation capture approaches. Formaldehyde is used to cross-link DNA-bound proteins before cell lysis and enzymatic digestion of the DNA. In meta3C, the cross-linked digested fragments are then ligated. In Hi-C, the digested fragments are tagged with biotin prior to ligation, enabling enrichment of ligated biotin-labelled fragments following ligation and DNA shearing. The cross-links are then removed during treatment with a protease, and the fragments undergo high-throughput sequencing.
Fig. 2.
Class-level compositions of all datasets. The reads from all datasets were taxonomically profiled using MetaPhlAn3. The stacked bars show the relative abundance (%) of each class for the classified reads. Reads that could not be classified by MetaPhlAn3 (~60 % of reads for each dataset) are excluded. For the K_HiC dataset, individuals are either neutropenic (N1-7) or healthy (H1-2) with multiple samples collected longitudinally for each individual.
Fig. 3.
Relative abundance of antimicrobial resistance genes (ARGs) in 3C/Hi-C datasets. The ARG sequences from the assemblies of each dataset were isolated, and the reads from that dataset were mapped to the ARGs (columns). The relative abundance was calculated as reads per kilobase per million mapped reads (RPKM). White cells mean the ARG was not present, and coloured cells show that the ARG was present, with the colour relating to the relative abundance of the ARG within that set of reads (log10 transformed RPKM values). Different datasets are separated by gaps in the heatmap. 3C datasets (*_3C) have rows showing the RPKM of the 3C reads mapping to the ARGs identified in the 3C metagenomic assembly. Hi-C datasets show RPKM of the shotgun reads (*_SG) or Hi-C reads (*_HiC) mapping to ARGs identified in the shotgun metagenomic assembly. The ARGs highlighted with a coloured dot are ARGs from the spike-ins in the G_3C dataset (purple, E. coli E3090; yellow, E. faecium E745).
Fig. 4.
Proportion of intercontig reads in 3C/Hi-C and shotgun reads of the same sample. The first 50 bp of each read was mapped against the corresponding assembly, and pairs where each read of the pair mapped to different contigs were labelled as intercontig reads. The y-axis shows the percentage of reads that were intercontig. K_HiC average (cyan) is the average for all 43 K_HiC samples (black). G_3C (orange) and M_3C (green) did not have accompanying shotgun reads, so only the intercontig proportion for the 3C reads is shown.
Fig. 5.
G_3C reads and WGS reads mapping to genome sequences of spike-in controls. Both the intercontig and non-intercontig reads for G_3C spike-in reads and WGS reads of the spike-ins (E. coli E3090 and E. faecium E745) were mapped to their respective genomes. The genomes were annotated using Prokka and the regions to which the reads mapped were grouped into four categories (see key). Percentages at the end of the stacked charts show the proportion of mapped reads that were assigned to intercontig/non-intercontig.
Fig. 6.
Proportions of reads mapping within the first or last 500 nt of a contig in the G_3C assembly for spike-in G_3C and WGS reads. The position of the alignment to contigs in the G_3C assembly was checked for both intercontig and non-intercontig read pairs from WGS reads and reads from G_3C that mapped to each spike-in genome (E. coli E3090 and E. faecium E745). Orange shows the proportion of reads mapping within 500 nt of the ends of a contig. Blue shows the proportion of reads mapping more than 500 nt away from the ends of a contig.
Fig. 7.
Proportions of intercontig reads mapping within the first or last 500 nt of a contig in their respective assemblies for all datasets. The position of the alignment to contigs was checked for the intercontig reads in all datasets. Orange shows the proportion of reads mapping within 500 nt of the ends of a contig. Blue shows the proportion of reads mapping greater than 500 nt away from the ends of a contig.
Fig. 8.
Proportion of intercontig reads in 3C/Hi-C and shotgun reads before and after filtering. The first 50 bp of each read was mapped against the corresponding assembly, and pairs where each read of the pair mapped to different contigs were labelled as intercontig reads (‘Before’ on the x-axis). These were then filtered to remove intercontig reads that mapped within the first or last 500 nt of a contig (‘After’ on the x-axis). The y-axis shows the percentage of reads that were intercontig. K_HiC average (cyan) is the average for all 43 K_HiC samples (black). G_3C (orange) and M_3C (green) did not have accompanying shotgun reads, so only the intercontig proportion for the 3C reads before and after filtering is shown.
Fig. 9.
Heatmap showing ARGs linked to their microbial hosts for G_3C. Contigs linked to ARG-containing contigs were taxonomically classified using Kraken2. The heatmap shows the proportion of contigs linked to each ARG that was classified as the taxon on the right. E. coli E3090 and E. faecium E745 were spiked into the stool sample, and the ARGs that these strains carried are highlighted in yellow and purple, respectively.
Fig. 10.
Heatmap showing ARGs linked to their microbial hosts for downloaded 3C/Hi-C datasets. Contigs linked to ARG-containing contigs were taxonomically classified using Kraken2. The heatmaps show the proportion of contigs linked to each ARG that was classified as the taxon on the right. Where there were multiple taxa that made up a proportion of no more than 0.02 for any ARG in that dataset, they have been grouped into ‘Other’.
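
The RPKM normalisation named in the Fig. 3 caption (reads per kilobase per million mapped reads, shown log10 transformed in the heatmap) can be sketched as follows. This is a minimal illustration under the standard definition of RPKM, not the authors' code; the function name and the example counts are hypothetical.

```python
import math


def rpkm(reads_mapped_to_gene: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    gene_length_kb = gene_length_bp / 1_000
    mapped_millions = total_mapped_reads / 1_000_000
    return reads_mapped_to_gene / (gene_length_kb * mapped_millions)


# Hypothetical example: 500 reads on a 1,000 bp ARG, 10 million mapped reads.
value = rpkm(500, 1_000, 10_000_000)   # 500 / (1 kb * 10 M) = 50.0
log_value = math.log10(value)          # value on the heatmap's log10 scale
```

An ARG with zero mapped reads has no defined log10 value, which fits the caption's use of white cells for absent ARGs rather than a colour on the log scale.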
