Nucleic Acids Res. 2024 Jul 8;52(12):e54.
doi: 10.1093/nar/gkae454.

Biochemical properties of chromatin domains define genome compartmentalization

Federica Lucini et al. Nucleic Acids Res.

Abstract

Chromatin three-dimensional (3D) organization inside the cell nucleus determines the separation of euchromatin and heterochromatin domains. Their segregation results in the definition of active and inactive chromatin compartments, whereby the local concentration of associated proteins, RNA and DNA results in the formation of distinct subnuclear structures. Thus, chromatin domains spatially confined in a specific 3D nuclear compartment are expected to share similar epigenetic features and biochemical properties, in terms of accessibility and solubility. Based on this rationale, we developed 4f-SAMMY-seq to map euchromatin and heterochromatin based on their accessibility and solubility, starting from as few as 10 000 cells. Adopting a tailored bioinformatic data analysis approach, we also reconstruct their 3D segregation into active and inactive chromatin compartments and sub-compartments, thus recapitulating the characteristic properties of distinct chromatin states. A key novelty of the method is its capability to map both the linear segmentation of open and closed chromatin domains and their compartmentalization in a single experiment.

Figures

Graphical Abstract
Figure 1.
4f-SAMMY-seq maps both euchromatin and heterochromatin with high resolution. (A) Schematic illustration of the 3f-SAMMY-seq vs 4f-SAMMY-seq protocols and output analysis results. From left to right, high (euchromatin) and low solubility (heterochromatin) domains correspond to genomic portions with preferential segregation in different subnuclear regions. In both ‘3f’ and ‘4f’ SAMMY-seq protocols, a sequential extraction of chromatin fractions (numbered from S1 to S4) results in a separation of euchromatic and heterochromatic regions that are then mapped to their genomic coordinates by high-throughput sequencing (applied to fractions from S2 through S4). The novel 4f-SAMMY-seq has a lighter digestion obtained with DNase I replacing TURBO DNase. Moreover, the S2 fraction is size separated into short (S2S) and long (S2L) fragments before sequencing. Specific data analysis procedures allow reconstructing chromatin domains compartmentalization in the 3D nuclear space. (B) Representative genomic region (chr13:30000000–100000000) showing genomic tracks for SAMMY-seq and chromatin marks in human foreskin fibroblasts. From top to bottom: open chromatin marks ChIP-seq enrichment profiles for H3K36me3, H3K4me1, H3K4me3, H3K27ac; reads distribution profiles for individual fractions of a representative replicate of 4f-SAMMY-seq (C004_r2) and 3f-SAMMY-seq (C004_r1); closed chromatin marks ChIP-seq enrichment profiles for H3K27me3, H3K9me3, Lamin A/C, Lamin B1. The shaded areas mark two examples of regions showing enrichment for closed (blue) or open (red) chromatin marks. (C) Reads distribution meta-profiles for individual ‘4f’ or ‘3f’ SAMMY-seq fractions in the first and second row of plots, respectively. The bottom row of plots reports the ChIP over input enrichment profile of the corresponding histone mark from ChIP-seq experiments. 
The average reads distribution profiles are computed over chromatin domains marked by enrichment peaks of specific histone marks, indicated on the top of each column. The domain start (DS) and domain end (DE) are indicated on the x-axis along with flanking regions coordinates. For H3K36me3 mark, we also reported the meta-profile obtained by orienting the domains according to the corresponding gene's transcribed strand. Meta-profiles for additional replicates are reported in Supplementary Figure S3a. The cartoon in Figure 1A was created with BioRender.com.
Figure 2.
4f-SAMMY-seq detects chromatin domains segregation in compartments. (A) Schematic illustration of the data analysis workflow to reconstruct chromatin compartments from SAMMY-seq data. For the ‘A’ and ‘B’ compartments identification, starting from read distribution profiles for individual chromatin fractions, the Pearson correlation is computed chromosome-wise between the vectors of reads coverage across fractions for each pair of genomic bins. After performing a principal component analysis (PCA), the first component (corresponding to the first eigenvector of the matrix) is used to discriminate active ‘A’ compartment (positive values) and inactive ‘B’ compartment (negative values) (see also Materials and methods). (B) Pairwise correlation matrices of read distribution profiles for 4f-SAMMY-seq fractions (C002_r1, left) or distance normalized (observed over expected) Hi-C contact profiles (right) on a representative genomic region (chr18:21250000–80250000) computed at 250 kb genomic bins resolution. On the side of each matrix, the respective first eigenvector is reported and coloured to mark the position of active (‘A’ compartment) and inactive (‘B’ compartment) regions. Concordant domain classification in Hi-C and 4f-SAMMY-seq are marked in green (‘A-A’) for active regions and orange (‘B-B’) for inactive regions. In the 4f-SAMMY-seq eigenvector only, we marked differently the regions with a discordant compartment classification: a lighter green is used for regions classified as B in Hi-C and A in 4f-SAMMY-seq (‘B→A’ label), a lighter orange is used for the opposite case (‘A→B’ label). (C) Genome-wide pairwise Pearson correlation of chromatin compartments eigenvectors defined by Hi-C and SAMMY-seq protocol variants ‘3f’, ‘10kh’ and ‘4f’ starting from 3 million (3M), 50 000 (50K) or 10 000 (10K) cells, with individual replicates reported. 
(D) For each sample of panel C, the stacked barplot shows the relative percentage distribution of genomic bins associated with concordant (‘A-A’ or ‘B-B’) and discordant (‘A→B’ or ‘B→A’) compartment classification with respect to Hi-C. The same colouring and naming convention of panel B was adopted. The samples order is the same as in panel C. (E) Chromatin compartments eigenvectors for a representative genomic region (chr2:130000000–240000000). The samples order is the same as in panel C. The eigenvectors are coloured according to the same convention adopted in panel B.
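The compartment workflow described in the Figure 2 caption (panels A and D) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes a per-chromosome coverage matrix with one row per genomic bin and one column per sequenced fraction, and it reads "PCA" as taking the leading eigenvector of the bin-by-bin Pearson correlation matrix. The function names, the `orient_by` track used to fix the arbitrary eigenvector sign, and the toy numbers are all hypothetical. The labels use an ASCII `->` in place of the arrow in ‘A→B’ and ‘B→A’.

```python
import numpy as np

def first_eigenvector(coverage, orient_by=None):
    """Leading eigenvector of the bin-by-bin Pearson correlation matrix.

    coverage  : (n_bins, n_fractions) read coverage for one chromosome.
    orient_by : optional (n_bins,) track used only to fix the arbitrary
                sign of the eigenvector (hypothetical input, e.g. an
                activity score that is higher in open chromatin).
    """
    coverage = np.asarray(coverage, dtype=float)
    # Rows are bins, so np.corrcoef correlates the per-fraction coverage
    # vectors of every pair of bins.
    corr = np.corrcoef(coverage)
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the last column is the leading eigenvector.
    _, eigvecs = np.linalg.eigh(corr)
    ev1 = eigvecs[:, -1]
    if orient_by is not None:
        ref = np.asarray(orient_by, dtype=float)
        if np.dot(ev1, ref - ref.mean()) < 0:
            ev1 = -ev1
    return ev1

def concordance_labels(ev_reference, ev_test):
    """Label each bin as 'A-A', 'B-B', 'A->B' or 'B->A'.

    The first letter is the reference (e.g. Hi-C) compartment and the
    second is the test compartment; positive values are called 'A'.
    """
    ref = np.where(np.asarray(ev_reference) > 0, "A", "B")
    test = np.where(np.asarray(ev_test) > 0, "A", "B")
    return [f"{r}-{t}" if r == t else f"{r}->{t}" for r, t in zip(ref, test)]

# Toy data (made up): three bins enriched in the early fractions and
# three bins enriched in the late fractions.
toy_coverage = np.array([
    [10, 8, 2, 1],
    [9, 7, 3, 1],
    [11, 9, 2, 2],
    [1, 2, 8, 10],
    [2, 3, 7, 9],
    [1, 1, 9, 11],
])
toy_activity = np.array([1, 1, 1, 0, 0, 0])
toy_ev = first_eigenvector(toy_coverage, orient_by=toy_activity)

# Hypothetical reference eigenvector standing in for Hi-C.
toy_reference = np.array([0.4, 0.3, -0.2, -0.5, 0.1, -0.3])
print(concordance_labels(toy_reference, toy_ev))
```

Bins with constant coverage across fractions would give undefined correlations, so a real analysis would need to filter or mask them before the eigendecomposition.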
Figure 3.
4f-SAMMY-seq based compartments provide a detailed characterization of chromatin epigenetic status. (A) Classification of ‘A’ and ‘B’ compartments from Hi-C and 4f-SAMMY-seq and chromHMM chromatin states in human fibroblasts (see Methods) for a representative region (chr2:1–90000000). The eigenvector tracks (green and orange tracks) show compartments computed from Hi-C data (top row) and individual 4f-SAMMY-seq replicates (bottom three rows) coloured according to the same convention adopted in Figure 2B for concordant (‘A-A’ or ‘B-B’) and discordant (‘A→B’ or ‘B→A’) compartments classification. The stacked barplot in the middle row summarizes the chromatin states associated with each genomic bin: see colour legends for states labels and Supplementary Figure S6b for associated chromatin marks signatures. (B) Relative occupancy of 4f-SAMMY-seq vs Hi-C based compartments for each chromHMM chromatin state, computed as the difference in ‘A’ compartment percentage. For each chromatin state, positive values (green gradient) indicate a higher percentage of ‘A’ compartment in 4f-SAMMY-seq-based classification, whereas negative values (red gradient) indicate a higher percentage of ‘A’ compartment based on Hi-C classification (i.e. higher ‘B’ percentage based on 4f-SAMMY-seq). The size of each dot is proportional to the absolute value in the percentage difference. Chromatin states are ordered from left to right based on the percentage difference between ‘A’ compartment in 4f-SAMMY-seq replicates (average between the three replicates) and Hi-C, in a decreasing order.
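The per-state comparison in panel B can be illustrated with a small calculation. The sketch below is one possible reading of "difference in ‘A’ compartment percentage" and is not taken from the paper's code: for each chromatin state, it computes the percentage of that state's total coverage falling in bins called ‘A’ by each method, then subtracts the Hi-C value from the 4f-SAMMY-seq value. The function names, input layout and toy numbers are assumptions.

```python
import numpy as np

def a_percentage(state_coverage, is_a):
    """Percentage of each state's coverage that lies in 'A' bins.

    state_coverage : (n_bins, n_states) amount of each bin covered by
                     each chromatin state (bases or fractions).
    is_a           : (n_bins,) boolean mask of bins called 'A'.
    """
    state_coverage = np.asarray(state_coverage, dtype=float)
    is_a = np.asarray(is_a, dtype=bool)
    total = state_coverage.sum(axis=0)
    in_a = state_coverage[is_a].sum(axis=0)
    return 100.0 * in_a / total

def a_percentage_difference(state_coverage, is_a_sammy, is_a_hic):
    """Positive values: more of the state is 'A' in the SAMMY-seq calls."""
    return (a_percentage(state_coverage, is_a_sammy)
            - a_percentage(state_coverage, is_a_hic))

# Toy example (made up): four bins, two chromatin states.
toy_states = np.array([
    [10, 0],
    [10, 0],
    [0, 10],
    [0, 10],
])
toy_is_a_hic = np.array([True, True, False, False])
toy_is_a_sammy = np.array([True, False, True, False])

toy_diff = a_percentage_difference(toy_states, toy_is_a_sammy, toy_is_a_hic)
# Order states from the largest to the smallest difference, as in the
# left-to-right ordering described in the caption.
toy_order = np.argsort(-toy_diff)
print(toy_diff, toy_order)
```

A state with zero total coverage would cause a division by zero here, so such states would need to be dropped beforehand.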
Figure 4.
4f-SAMMY-seq based compartments consistently classify Polycomb regulated domains. (A) Violin and box plots showing for the Hi-C dataset and individual 4f-SAMMY-seq replicates of human fibroblasts (labels on the upper margin) the distribution of Polycomb-regulated chromHMM chromatin states. The upper violin and box plots report bivalent Polycomb chromatin states (EnhBiv or BivFlnk), the bottom ones report monovalent repressive Polycomb chromatin states (ReprPC or ReprPCWk). Each set of violin plots shows the relative occupancy (percentage, y-axis values in log2 scale) of the chromatin states of interest over the 250kb genomic bins covering the entire genome. Bins are grouped by compartments classification (‘A-A’, ‘B-B’, ‘A→B’ or ‘B→A’ defined as in Figure 2B) based on individual Hi-C and 4f-SAMMY-seq samples. The number of genomic bins in each group is indicated at the bottom in parentheses (x-axis). In the overlaying boxplots, the horizontal line marks the median, the box margins mark the interquartile range (IQR), and whiskers extend up to 1.5 times the IQR. Specific data points associated with HOX gene clusters (black and red dots) are marked to show their positioning across groups and their associated high relative occupancy in Polycomb repressive and bivalent states. (B) Scatter plots showing the relationship between the rank of the first eigenvector values used to define compartments (ranks on the x-axis) and the relative occupancy by chromatin states plotted as percentage of each bin in square root (sqrt) scale (y-axis). Each data point shows a 250kb genomic bin from the representative chromosome 2. Black dashed vertical lines mark the transition point between negative and positive values of the first eigenvector, from ‘B’ to ‘A’ compartment. Coloured solid lines highlight the overall trend (lowess regression), for the reference Hi-C dataset (violet line) and for each 4f-SAMMY-seq replicate (blue, light blue and green; labels on upper margins). 
Dashed salmon vertical lines and circles highlight the HOXD gene cluster bin. Upper plots show the chromatin states occupancy for bivalent Polycomb states and bottom plots for monovalent repressive Polycomb states, defined as in panel A. (C) Barplots reporting the compartments classification, for individual 4f-SAMMY-seq replicates, of genomic bins overlapping ChIP-seq enrichment peaks for H3K27me3 alone (left barplot), H3K27me3 and H3K9me3 (center barplot), H3K27me3 and H3K4me3 (right barplot). (D) Pie chart showing the chromatin state composition for discordant (‘B→A’) bins overlapping Het chromatin state (left pie chart). The composition of concordant (‘B-B’) bins is reported for comparison (right pie chart). (E) Violin and box plots showing for the Hi-C dataset and for individual 4f-SAMMY-seq replicates (labels on the upper margin) the distribution of the Quiescent (Quies) chromHMM chromatin state. Each set of violin plots shows the relative occupancy (percentage, y-axis values in log2 scale) over 250kb genomic bins covering the entire genome and grouped by compartments classification (‘A-A’, ‘B-B’, ‘A→B’ or ‘B→A’ defined as in Figure 2B) based on individual Hi-C and 4f-SAMMY-seq samples. The number of genomic bins in each group is indicated at the bottom in parentheses (x-axis). In the overlaying boxplots, the horizontal lines mark the median, the boxes mark the interquartile range (IQR), and whiskers extend up to 1.5 times the IQR.
Figure 5.
4f-SAMMY-seq allows detailed and reliable reconstruction of sub-compartments. (A) Schematic illustration of the data analysis workflow to reconstruct chromatin sub-compartments from 4f-SAMMY-seq data. An adaptation of the CALDER procedure is applied on the pairwise Euclidean distance between genomic bins (see Materials and methods). For each pair of genomic bins, the Euclidean distance is computed between the vectors of reads coverage across fractions. The method then uses the ranking and clustering of domains (i.e. group of bins) based on the lowess interpolation of the first and second eigenvector of the matrix. From the projected PCA values, the ranking of domains allows discriminating eight sub-compartments from the most compacted one (B.2.2) to the most accessible one (A.1.1). (B) Genome-wide mean reads enrichment in human fibroblasts (centred and scaled by chromosome, see Methods) in the eight sub-compartments classification defined by CALDER. Reads distribution profile for DNase-seq or ChIP-seq IP over INPUT enrichments are shown for sub-compartments obtained using Hi-C (purple points and lines) and 4f-SAMMY-seq data (blue, light blue and green points and lines for the three replicates). Sub-compartments are sorted from the most compacted (left, B.2.2) to the most accessible one (right, A.1.1).
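The input to the sub-compartment workflow in panel A, the pairwise Euclidean distance between the per-fraction coverage vectors of genomic bins, can be sketched as follows. This covers only the distance matrix. It does not reproduce the adapted CALDER procedure (domain clustering, lowess interpolation, ranking into eight sub-compartments). The function name and toy values are hypothetical.

```python
import numpy as np

def pairwise_euclidean(coverage):
    """Euclidean distance between every pair of bins.

    coverage : (n_bins, n_fractions) read coverage for one chromosome.
    Returns an (n_bins, n_bins) symmetric matrix with a zero diagonal.
    """
    coverage = np.asarray(coverage, dtype=float)
    # Broadcasting builds an (n_bins, n_bins, n_fractions) array of
    # differences between the coverage vectors of each bin pair.
    diff = coverage[:, None, :] - coverage[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy example (made up): three bins, four fractions.
toy_bins = np.array([
    [0, 0, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 0, 0],
])
toy_dist = pairwise_euclidean(toy_bins)
print(toy_dist)
```

The broadcasted intermediate grows with the square of the number of bins, which is manageable at coarse resolution on a single chromosome but may call for a chunked computation at finer bin sizes.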
