Proc Natl Acad Sci U S A. 2006 Apr 25; 103(17): 6428–6435.
Published online 2006 Mar 29. doi: 10.1073/pnas.0600803103
PMCID: PMC1564199
PMID: 16571659
Inaugural Article

Histone H3 variants and their potential role in indexing mammalian genomes: The “H3 barcode hypothesis”

Abstract

In the history of science, provocative but, at times, controversial ideas have been put forward to explain basic problems that confront and intrigue the scientific community. These hypotheses, although often not correct in every detail, lead to increased discussion that ultimately guides experimental tests of the principal concepts and produce valuable insights into long-standing questions. Here, we present a hypothesis, the “H3 barcode hypothesis.” Hopefully, our ideas will evoke critical discussion and new experimental approaches that bear on general topics, such as nuclear architecture, epigenetic memory, and cell-fate choice. Our hypothesis rests on the central concept that mammalian histone H3 variants (H3.1, H3.2, and H3.3), although remarkably similar in amino acid sequence, exhibit distinct posttranslational “signatures” that create different chromosomal domains or territories, which, in turn, influence epigenetic states during cellular differentiation and development. Although we restrict our comments to H3 variants in mammals, we expect that the more general concepts presented here will apply to other histone variant families in organisms that employ them.

Keywords: histone H3 variants H3.1, H3.2, H3.3; “barcode hypothesis”; epigenetic memory; cell differentiation

Chromatin and Its Role in Cellular Processes

Every eukaryotic cell contains genetic information in the form of DNA that is compacted to varying degrees in a confined nuclear space. However, DNA is packaged in such a way that enables its readout, replication, and repair in response to cellular needs and external stimuli. This condensation is achieved by an intimate interaction between DNA and histone proteins to form chromatin. The fundamental unit of chromatin is the nucleosome particle, consisting of core histone proteins (H2A, H2B, H3, and H4) around which the DNA is wrapped. Chromatin is often broadly divided into two cytologically distinct fractions: euchromatin, which is generally permissive for transcription, and heterochromatin, which is largely repressive. Two basic varieties of heterochromatin exist, constitutive and facultative; DNA within constitutive heterochromatin is obligately silenced; facultative heterochromatin is silenced only in certain contexts.

Relevant to our proposed “H3 barcode hypothesis” is the extent to which the chromatin fiber is constant or variable. Constancy is provided by the nearly universal nucleosomal packaging theme of histones and DNA in all eukaryotes. Variation is provided by subtle changes in this packaging theme that provide “instructions” as to how the DNA template is to be “read” when needed. Histone proteins are, for example, well known to be extensively modified by a vast array of covalent modifications on “external” (N- and C-terminal tails) as well as “internal” (histone-fold) domains, often leading to complex modification patterns that correlate closely with various states of gene expression or other DNA-templated processes. This staggering number of posttranslational modifications (PTMs) has prompted theories as to how these chemical marks might be translated into meaningful biological responses (1, 2). The “histone code” hypothesis states that a specific histone modification, or combinations thereof, can affect distinct downstream cellular events by altering the structure of chromatin (cis mechanisms) or by generating a binding platform for effector proteins (trans mechanisms). Such effectors specifically recognize particular PTM(s) and initiate events that ultimately lead to downstream events, such as gene activation or silencing. Tests of this hypothesis, as well as extensions of it (3), are gaining experimental support (e.g., refs. 4 and 5), although alternative views have been expressed (6, 7). Despite these uncertainties, emerging evidence underscores elaborate mechanisms for introducing variation, covalent and noncovalent, into the chromatin polymer (reviewed in ref. 8). The challenge remains as to how this variation is converted into meaningful biological readout.

Histone H3 Variants and Their Evolution

With the exception of H4, all core histone proteins have variant counterparts, which often differ in surprisingly few amino acids (reviewed in ref. 9). Histone genes encoding these variants can be classified into three main subtypes on the basis of their expression pattern and genomic organization (10, 11): replication-dependent (RD), replication- and cell cycle phase-independent (RI), and tissue-specific (TS) histones. RI expression of histone genes reinforces the general view that histone proteins evolved to participate actively in DNA-templated processes rather than to serve simply a passive DNA-packaging role (see below). Nowhere is the concept of histone variants better illustrated than with the family of H3 histones.

Most eukaryotes express a centromere-specific H3 variant (Saccharomyces cerevisiae, Cse4; Drosophila, CID; and Homo sapiens, CENP-A) that is evolutionarily well conserved in its globular core region but not in its N-terminal tail (reviewed in ref. 12) and is essential for cell survival because of its fundamental role in centromeric function during mitosis (13). Interestingly, during evolution, additional genes encoding H3 variants have emerged (Fig. 1A). For example, outside of the centromeric H3 variant, the unicellular yeast S. cerevisiae possesses only H3.3, a H3 variant that is expressed and incorporated into chromatin in a RI fashion and associated in higher eukaryotes with transcriptional activation (see below). Although budding yeast contains well defined “silent” chromatin, several hallmark features of constitutive heterochromatin in higher eukaryotes (e.g., H3 K9 and K27 methylation) have yet to be observed in S. cerevisiae (14). This observation correlates well with the presence of only a H3.3 variant and the streamlined gene-rich composition of the yeast genome. Interestingly, the fission yeast Schizosaccharomyces pombe contains one H3, with characteristics from both H3.3 and H3.2, a finding consistent with this yeast having constitutive heterochromatin more typical of higher organisms (15). Organisms such as plants, flies, frogs, and birds contain, in addition to H3.3, another H3 variant H3.2 that differs in only four amino acids from H3.3 (Fig. 1B); H3.2 is expressed and incorporated into chromatin in a RD fashion. Only in mammals have two additional H3 variants evolved: H3.1 and a testis-specific H3.1 variant (H3.1t) (Fig. 1A). H3.1 differs from H3.2 in only one amino acid (amino acid 96: cysteine/serine, respectively) and is also expressed in RD fashion, whereas H3.1t is expressed only in testis and has four additional amino acid substitutions when compared with H3.1 (see Fig. 1B).
Are these modest changes in primary sequence among H3 variants unimportant, a likely consequence of evolutionary “drift”? Alternatively, the small number of amino acid changes in these H3 variants might lead to unique protein structures and, in turn, to unique nucleosomal architecture and chromosomal domains that might govern H3 variant-specific biological functions (as is the case for centromere-associated H3s) (16). Future studies aimed at determining the x-ray structures of nucleosomes containing different histone variants may provide structural insights into their effects on nucleosome stability and organization.

Fig. 1.

H3 variants in different organisms. (A) Schematic of evolutionary appearance of histone H3 variants. All organisms express a centromere-specific H3 variant (CENP-A, filled blue circle). In addition to the centromeric H3 variant, the following H3 variants are expressed in these organisms: S. cerevisiae contains only H3.3 (blue gradient circle); S. pombe expresses a hybrid H3 protein that contains amino acids characteristic for H3.3 and H3.2; Arabidopsis thaliana, Xenopus laevis, and Drosophila melanogaster (for example) express H3.3 and H3.2 (blue circle with white dots); mammals such as Mus musculus and H. sapiens express H3.3, H3.2, H3.1 (white circle with blue dots), and a testis-specific H3.1t (white circle with blue stripes) variant of unknown function. H3.3 has been associated with euchromatin and transcriptional activation. H3.2 and H3.1 might localize to heterochromatin and are involved in transcriptional silencing. (B) Alignment of human noncentromeric histone H3 variants. Differences in amino acid sequence among human H3.3, H3.2, H3.1, and H3.1t are shown in white boxes. Cysteine residues are highlighted in red (Cys 96 in dark red and Cys 110 in pink). Identical amino acids are shown in gray. TS, tissue-specific. The region where most amino acid differences between the variants are found is underlined as a potential chaperone recognition domain (see text for details), and the chaperones binding to H3 variants are depicted below.

The literature on H3 variants does not contain a universal nomenclature for these variants, and, therefore, we propose to adopt the following convention: histone H3 protein containing S31, A87, I89, and G90 will be called H3.3; H3 with A31, S87, V89, M90, and S96 will be called H3.2; and H3.1 has the sequence of H3.2, with the exception of position 96, where it contains a cysteine. Amino acids 87–90 in H3.3 have been shown to be important for RI incorporation into chromatin (17), and these data suggest that this region might act as a “chaperone recognition domain” where HIRA binds to H3.3 and CAF-1 to H3.1 (see below and ref. 18). It is as yet unknown whether H3.2 binds to a different chaperone and whether amino acid position 96 plays any role in this potential chaperone recognition domain (Fig. 1B).
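The naming convention above can be written out as a small lookup, which makes explicit that only five positions separate the three variants. The following is a minimal sketch in Python, not part of the original article: the names `H3_SIGNATURES` and `classify_h3` are illustrative, and only the residues stated in the convention are used. Because H3.1t shares cysteine 96 with H3.1 and its four additional substitutions are not listed in the text, this sketch cannot distinguish H3.1t from H3.1.

```python
# Signature residues taken from the naming convention in the text
# (1-based positions in the H3 protein). All names here are illustrative.
H3_SIGNATURES = {
    "H3.3": {31: "S", 87: "A", 89: "I", 90: "G"},
    "H3.2": {31: "A", 87: "S", 89: "V", 90: "M", 96: "S"},
    "H3.1": {31: "A", 87: "S", 89: "V", 90: "M", 96: "C"},
}


def classify_h3(residues):
    """Return the variant whose signature matches a {position: residue} map.

    Returns None when no signature matches. H3.1t is not modeled, because
    the text does not list the positions of its four extra substitutions.
    """
    for name, signature in H3_SIGNATURES.items():
        if all(residues.get(pos) == aa for pos, aa in signature.items()):
            return name
    return None


if __name__ == "__main__":
    # A protein with A31, S87, V89, M90 and a cysteine at 96 is called H3.1.
    print(classify_h3({31: "A", 87: "S", 89: "V", 90: "M", 96: "C"}))
```

Changing only position 96 from cysteine to serine in the example switches the call from H3.1 to H3.2, mirroring the single-residue difference described in the text.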

Elegant experiments have shown that H3.3 is associated with transcriptionally active gene loci and is enriched in covalent modifications associated with gene activation in flies, plants, and humans (17, 19–21). In contrast, in Drosophila and Arabidopsis, H3.2 has been shown to be enriched in marks that are associated with gene silencing (19, 20). These observations suggest that, during evolution, organisms draw not only on different profiles of physiologically relevant PTMs but also on selective employment (recruitment and replacement) of different histone H3 variants, a concept well articulated by Henikoff and colleagues (22). Because H3.1 and H3.2 differ by only a single amino acid, most studies tend to group these variants as one. However, recent results provide evidence that human H3.1, H3.2, and H3.3 differ in both their expression and PTM patterns as follows: H3.3 is enriched in PTMs associated with gene activation (hyperacetylation and dimethylation of K36 and K79), H3.2 is enriched in PTMs associated with gene silencing (K27 di- and trimethylation), and H3.1 is enriched in PTMs associated with gene activation (K14 acetylation) and gene silencing (K9 dimethylation), suggesting that these mammalian H3 variants may, indeed, have separate biological functions (23). These studies underscore a general conclusion: Remarkably similar histone proteins may vary considerably in their expression and PTM profiles. Determining how these differences translate into different biological functions and, notably, whether different functions, indeed, exist for the closely related H3.1 and H3.2 remains a challenge for future research.

The mechanism(s) by which histone variants and their PTMs are transmitted through the cell cycle also remains unsolved. Depending on the precise mechanism of nucleosome assembly at the time of DNA replication, histone variants may provide a bridge for the transmission of epigenetic information from one cell or one sexual generation to the next (18). If, for example, the incorporation of histone variants into replicating chromatin is nonrandom, we envision that the variants may provide potential “backup” for the more labile histone PTMs by playing a role in the establishment of “epigenetic memory.” Central to this concept is the general view that H3 variants can impart structural differences to individual nucleosomes, nucleosomal arrays, or higher-order chromatin domains that contain them before PTMs are added (or removed) (24). Below, we present several ideas for how such differences might occur, even though only a small number of amino acid differences exist between H3 variants.

Cysteines of H3 Variants and Their Potential Role in Nuclear Architecture

Well established in the literature, but relatively underappreciated, is the fact that most members of the histone H3 family contain one or more cysteine(s) in their protein core and that this feature is a hallmark property of histone H3; all other histone proteins lack cysteine (Fig. 1B). Cysteine is one of the most rarely used amino acids in nature (1.9% occurrence in proteins) (25), suggesting that it plays a specialized role in the function of proteins that contain it. Equally well established is the fact that cysteines can form disulfide bonds under oxidative conditions and are involved in the homotypic or heterotypic dimerization and oligomerization of proteins. As shown in Fig. 1B, the histone H3 variants (except H3 in S. cerevisiae) contain one cysteine at position 110 that is located in their α2 helix, the region where both H3 proteins are closely apposed in the nucleosome core particle (26). The region immediately surrounding amino acid 110 is important to hold together two histone H2A–H2B–H3–H4 tetramers, because C110E mutations, for example, destabilized the H3–H3 hydrophobic four-helix bundle tetramer interface in vitro (27). It is not yet clear what role, if any, cysteine 110 plays in this process in vivo. We propose that Cys-110, common in essentially all H3 variants, forms an “intra”-disulfide bond with H3 in the same nucleosome under oxidative conditions, adding stability to the H3–H4 tetramer (Fig. 2A). In support, a crosslinked H3–H3 octamer can still form a nucleosome in vitro (28). No cysteine exists in S. cerevisiae H3, and giving yeast an artificial cysteine in place of its serine at 110 does not appear to have clear phenotypic consequences (29). We note that budding yeast lacks many of the better known heterochromatin marks and related machinery (e.g., K9 methylation in H3 and HP1-like homologues).
Thus, the tentative conclusion that cysteine utilization is unimportant based solely on experiments conducted in budding yeast may not be warranted. We look forward to the generation of H3 cysteine mutants in organisms that use more classical heterochromatin.

Fig. 2.

Potential usage of H3 variant-specific cysteines 110 and H3.1-specific cysteine 96. (A) H3 cysteine 110 forms a potential intramolecular disulfide bond (light red box) with H3′ cysteine 110 in the same nucleosome (for details, see text). For simplicity, only the H3–H4 tetramer is shown as top view (Left). All mammalian H3 variants contain cysteine 110 and can potentially participate in disulfide bonding. (Right) H3–H4 dimers. (B) H3.1 cysteine 96 potentially forms intermolecular disulfide bonds (dark red box) with H3.1′ cysteines 96 in different nucleosomes, leading to chromatin condensation and heterochromatin generation (for details, see text). (C) H3.1 cysteine 96 is envisioned to potentially form disulfide bonds (dark red box) with cysteine in LBR on the nuclear envelope or with a cysteine in an as yet unknown protein (X?) in the nucleus (for details, see text). We speculate that chromatin containing H3.1 nucleosomes is preferentially located near the nuclear membrane and irreversibly rendered silent for transcription regardless of PTMs.

Although the extent to which the nucleus contains an oxidizing or reducing environment is not well established, redox-sensing mechanisms appear to play important roles in the nucleus. Certain transcription factors, for example, NF-κB, contain a cysteine that has been shown to participate in intermolecular disulfide formation (30) and must be in a reduced state in order for NF-κB to bind to DNA. Reduction is achieved by the action of molecules that are unique to the nucleus (31). In contrast, other transcription factors have an increased DNA-binding affinity under oxidative conditions (32), lending support to the general notion that physiologically relevant, redox-sensitive mechanisms may occur inside the nucleus.

It is intriguing to revisit earlier literature (33, 34) aimed at determining whether the cysteines in histone H3 variants “sense” changes in the redox state of the nucleus. If so, does the proximity of the two cysteines at the interface between homotypic H3 dimers within each nucleosome play a stabilizing role in the architecture of the chromatin polymer that, in turn, impacts on the regulation of gene expression? Roughly 20 years ago, Allfrey and colleagues (35) hypothesized a meaningful difference between euchromatin and heterochromatin, as assayed by accessibility to sulfhydryl reagents, which can form disulfide bonds with exposed cysteines under oxidative conditions. Transcriptionally active regions were labeled preferentially with sulfhydryl-specific reagents, whereas nucleosomes in heterochromatin and nontranscribed regions were not. Moreover, these reagents preferentially bound to the cysteines in chromatin fractions enriched for hyperacetylated H3, suggesting that transcriptional activity “opens” the otherwise more tightly compacted chromatin, exposing the H3 cysteine so that it can be bound by sulfhydryl-reactive molecules (36). These observations correlate well with results showing that exposure of fibroblasts to mercury leads to the accumulation of this metal into euchromatin but not into heterochromatin (37, 38). Enrichment of “active,” hyperacetylated chromatin, obtained by virtue of its ability to bind to mercury-containing columns, formed the basis of several intriguing experiments, including fractionation of yeast chromatin with an artificial cysteine at position 110 in place of its natural serine (29).

These data suggest that cysteine 110 in H3 is more accessible to sulfhydryl-reactive reagents in euchromatin and may be more buried in heterochromatin, providing a potential molecular marker, underscoring a physical change in the nature of higher-order chromatin structure that may reflect different physiological states. It remains unclear whether the inaccessibility of cysteine 110 in transcriptionally silent regions is an indirect consequence of chromatin compaction. Alternatively, a more direct effect is possibly due to a disulfide bonding between both cysteines 110 in the two H3s in the same nucleosome that, in turn, compacts nucleosomal and higher-order structures (Fig. 2A). Determining the extent to which H3 “oxidation/reduction” occurs in vivo, if at all, remains a worthwhile challenge for future studies.

Interestingly, two mammalian histone H3 variants, H3.1 and H3.1t, contain an additional cysteine 96 in their protein-core region besides the more highly conserved cysteine 110 discussed above (Figs. 1B and 2A). Because cysteine 96 is likely located on the protein’s surface (26), we speculate that it may play an unappreciated role in chromatin compaction and gene silencing by its ability to form disulfide bonds with other H3s in different nucleosomes or with other cysteine-containing proteins under oxidative conditions or in the presence of an as yet undetermined oxidizing molecule(s). Several scenarios for cysteine 96 can be envisioned:

(i) The H3.1-specific histone chaperone CAF-1 (18) may specifically recognize the region containing cysteine 96 in H3.1 as part of a chaperone-specific replacement mechanism that serves to direct H3.1 to target genomic loci (see below and Fig. 1B). In Drosophila, the region of H3.3 that differs most from that of H3.2 (amino acids 87, 89, and 90) is important for RI incorporation (17). These findings suggest that this region of H3 is important for the binding of specialized histone chaperones, and we speculate that cysteine 96 plays a role in this process, thereby distinguishing H3.1 from H3.2 and H3.3. To our knowledge, little is known as to how histone-specific chaperones (e.g., CAF-1 versus HIRA; see ref. 18) recognize the appropriate target histones, nor is it known whether H3.2 is escorted to chromatin by its own unique chaperone (see Fig. 1B).

(ii) Nucleosomes that contain H3.1 might bind to other H3.1-containing nucleosomes through internucleosomal disulfide bonds between cysteines 96. We envision that this event would serve to provide additional stability to higher-order nucleosomal contacts and may provide an explanation for H3.1-mediated condensation of heterochromatin (Fig. 2B). The concept of cysteine 96-mediated disulfide “bridging” suggests that H3.1 might play a unique role in the formation of constitutive heterochromatin that stably represses transcription through the generation of H3.1-containing oligonucleosomes. Here, we note that the formation of H3 dimers (H3.1, H3.2, and H3.3) and H3 oligomers (H3.1 only) occurs, at least in vitro, under oxidative conditions and is inhibited in a reducing environment (S.B.H. and C.D.A., unpublished observations). The in vivo significance of these findings, if any, remains unclear.

(iii) H3.1 might form disulfide bonds with other nuclear cysteine-containing nonhistone proteins (Fig. 2C). One attractive candidate is the nuclear membrane-associated protein lamin B receptor (LBR); other disulfide “partners” (“X”) are also possible (Fig. 2C). LBR has been shown to bind distinct heterochromatin-enriched fractions (39). Moreover, Makatsori and coworkers (23) found that LBR-associated purified fractions contain histone H3 enriched in PTMs associated with transcriptional silencing similar to those that we have observed on H3.1. Additionally, LBR binds heterochromatin as a higher oligomer. Interestingly, another study reports the formation of a higher-order complex including H3, H4, LBR, and heterochromatin protein 1 (HP1) (40) that has been found to interact with histone H3 methylated at lysine 9 (41). We have shown that H3.1 is enriched in K9 dimethylation, suggesting that H3.1, but not H3.2 or H3.3, might be the H3 variant that selectively binds HP1 and interacts with LBR at the nuclear envelope. It remains to be determined whether LBR-bound heterochromatin contains only H3.1 and, if so, whether cysteine 96 is important for the establishment of nuclear membrane-associated heterochromatin. In conclusion, we speculate that the unique cysteines in H3 variants might be important for nucleosomal and chromatin higher-order structures in ways that remain to be determined. In turn, we speculate that these structures determine, directly or indirectly, transcriptional regulatory states and distinct nuclear domains or compartments (see below).

Histone H3 Variants and Epigenetic Memory

During the development of multicellular organisms, cells differentiate by changing their gene expression profiles in response to stimuli or environmental cues. Long after these external stimuli are gone, “cellular memory” mechanisms enable cells to remember their chosen fate over many cell divisions (reviewed in ref. 42). Chromatin has long been suspected to play a major role in these mechanisms, but how an epigenetic memory, defined by networks of inherited sets of expressed and silenced genes, is faithfully transmitted to daughter cells during each S-phase remains unresolved. We favor the general view that histone variants, especially H3.1, H3.2, and H3.3, contribute not only to gene expression and silencing events but also to the maintenance of epigenetic inheritance. In this view, histone PTMs alone cannot explain the establishment of epigenetic memory during several cell divisions. We propose that histone H3 variants contribute to “indexing” the genome into functionally separate domains (e.g., euchromatin, facultative heterochromatin, and constitutive heterochromatin) that, in turn, establish and maintain epigenetic memory for each individual cell type. If correct, one requirement for H3 variants to play a major role in epigenetic inheritance is that nucleosomes contain “homo”-dimers of the same H3 variant, which are deposited by different chaperones (see Fig. 3). In support, Nakatani and colleagues (18) provided evidence that mammalian histone H3 variants H3.1 and H3.3 are incorporated into chromatin by separate chaperones (CAF-1 and HIRA, respectively). Once properly deposited into chromatin, H3 variants must be read by mechanisms that remain unclear but are likely to involve PTMs (see below).

Fig. 3.

Epigenetic memory and H3 variants: graphic of different models of epigenetic inheritance (for details, see text). Nucleosomes contain two of H3.3 (blue gradient circle), H3.2 (blue circle with white dots) or H3.1 (white circle with blue dots), and H4 (yellow circle). N-terminal tails of H3 variants are posttranslationally modified: H3.3, active PTMs (green flag); H3.2, silencing PTMs (red flag); H3.1, silencing PTMs that differ from those observed on H3.2 (orange flag). Outside of S-phase, H3.3 can be deposited into chromatin in a RI manner [as either H3.3–H4 tetramers (Left) or H3.3-H4 dimers (Right)] to activate gene transcription immediately, as proposed by Henikoff and colleagues (44). The conservative inheritance model proposes that, during replication, H3–H4 tetramers are distributed on daughter strands in a random fashion. (Left) H3 variant-specific chaperones deposit H3–H4 tetramers onto daughter strands to fill in the gaps, distributing H3 variants by potentially sensing adjacent H3 variants on the same daughter strand. (Right) The semiconservative model of replication, as proposed by Tagami et al. (18), is shown. During replication, nucleosomes are separated into two H3–H4 dimers that are distributed equally onto daughter strands. H3 variant-specific chaperones deposit H3.3–H4 dimers (HIRA), H3.1–H4 dimers (CAF-1), and H3.2–H4 dimers (unknown, ?) to histone dimers on the daughter strands forming homogenic nucleosomes.

Different models have been proposed to explain how epigenetic memory can be achieved (reviewed in ref. 43). Henikoff and coworkers (44) recently proposed that histone states are not actively duplicated but are reestablished each cell cycle by active transcription and new deposition of histone variants, in particular H3.3 (Fig. 3, see nonreplicating DNA). Although transcription-coupled histone-variant deposition may function to establish and reestablish active euchromatin, it is unlikely to be the sole means of epigenetic inheritance, because it does not account for the inheritance of silenced chromatin. The timing of the replication of different chromatin states during S-phase might also play an important role in establishing epigenetic memory. It is well known that transcriptionally active chromatin is replicated in early S-phase, whereas heterochromatin is replicated in late S-phase (reviewed in ref. 45). It will also be interesting to determine whether facultative and constitutive heterochromatin replicate at different times during late S-phase, which might coincide with the expression of H3.2 and H3.1 and/or their specific chaperones, therefore providing one regulatory step in achieving epigenetic memory.

Much experimental evidence points toward another model of inheritance, the conservative model. This model suggests that intact parental nucleosomal cores are most likely dispersively segregated to daughter strands (46, 47) (Fig. 3 Left). In this model, the maintenance of epigenetic inheritance is difficult to envision but might be achieved by the replication timing differences between chromatin states or by H3 variant-specific chaperones that sense adjacent H3–H4 tetramers on the same daughter strand. Another possibility is that the topology of DNA differs among different H3 variant–H4 tetramers, and, therefore, only specific H3 variant-containing nucleosomal cores are deposited onto the DNA during replication. Although these scenarios are not impossible, little, if any, evidence exists to support them.

In contrast, the semiconservative inheritance model proposes that nucleosomes are “split” into H3–H4 dimers that are distributed to each daughter strand (Fig. 3 Right). Although controversial (see ref. 43), these findings suggest that H3 and H4 may be deposited into chromatin as a dimeric unit rather than as a tetrameric unit, as has been proposed (48). The semiconservative model suggests that parental (H3–H4)2 tetramers are dissociated into two H3–H4 dimers during DNA replication and segregated evenly onto daughter DNA strands (18). One consequence of this model is that each daughter strand obtains one posttranslationally modified parental H4–H3 variant dimer, thereby acquiring epigenetic memory. New H3.1–H4 dimers are deposited in chromatin through CAF-1 chaperone (RD-expressed), resulting in (H3.1–H4)2 tetramers. In contrast, HIRA is believed to deposit newly synthesized H3.3–H4 dimers into chromatin, forming (H3.3–H4)2 tetramers on the daughter strands.
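The bookkeeping implied by the semiconservative model can be made concrete with a toy simulation. The sketch below is an illustration only, written in Python under simplifying assumptions that are not claims of the article: each parental nucleosome is treated as a homotypic (H3–H4)2 tetramer identified by its variant name, the newly deposited dimer always matches the parental dimer, and the function and dictionary names are hypothetical. The chaperone assignments follow the text and the Fig. 3 legend, with the H3.2 chaperone left as unknown.

```python
# Chaperone assignments as described in the text; the H3.2 chaperone is
# not known, so it is recorded as "unknown". Names are illustrative.
CHAPERONE = {"H3.1": "CAF-1", "H3.3": "HIRA", "H3.2": "unknown"}


def replicate_semiconservative(parental_strand):
    """Split each parental (H3-H4)2 tetramer into two dimers, one per daughter.

    parental_strand is a list of variant names, one per nucleosome.
    Each daughter nucleosome is completed with a new dimer of the same
    variant, deposited by the chaperone assigned to that variant.
    Returns a tuple of two daughter strands (lists of dicts).
    """
    daughters = ([], [])
    for variant in parental_strand:
        for daughter in daughters:
            daughter.append({
                "parental_dimer": variant,   # inherited, carries parental PTMs
                "new_dimer": variant,        # assumption: matches the template
                "chaperone": CHAPERONE[variant],
            })
    return daughters


if __name__ == "__main__":
    first, second = replicate_semiconservative(["H3.3", "H3.1", "H3.2"])
    for nucleosome in first:
        print(nucleosome)
```

Under these assumptions both daughter strands carry the same order of variants as the parental strand, which is the property the model relies on for epigenetic memory.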

We speculate that there may be an additional, as yet unidentified, H3.2-specific histone chaperone that deposits only H3.2–H4 dimers (or tetramers). As discussed above, whether serine 96 in H3.2 (as compared with cysteine 96 in H3.1; see Fig. 1B) is an important recognition site for this hypothetical H3.2 chaperone is not known. Regardless of which model is more correct, CAF-1 may recognize both H3.1 and H3.2. Here, the site-specific incorporation of each variant into chromatin would depend on the template variant in the parental H3–H4 dimer, the time when H3.1 and H3.2 are expressed during late S-phase, etc. “Daughter” mononucleosomes would then be completed with the addition of H2A–H2B dimers donated by other chaperones or exchange machinery (reviewed in refs. 49 and 50). Several chromatin-remodeling complexes, including the SWI/SNF, RSC, and ISWIb complexes, can catalyze the exchange of H2A–H2B dimers between chromatin fragments in an ATP-dependent reaction (51). On the other hand, the H2A.Z variant has recently been shown to be incorporated into chromatin by a specialized ATP-dependent nucleosome remodeling complex, the SWR1 complex, which consists of 13 subunits, including the Swi2/Snf2-related ATPase Swr1 (52). It remains to be seen whether other H2A variants, such as H2A.X, macroH2A, and H2A.Bbd, are also incorporated into chromatin by other specialized chaperones and whether all these H2A variants might then be pairing specifically with different H3 variants in one nucleosome. After completion of the newly assembled chromatin, we envision that appropriate histone-modifying enzymes will add or remove PTMs, maintaining a specific histone code for each particular chromosomal region.

We suggest, then, that H3.1, H3.2, and H3.3 have different biological functions, based on differences in cell and tissue-specific expression patterns and PTMs (23). We favor the general view that histone variants index select chromosomal regions by using selective chromatin-assembly mechanisms of the type described above, regardless of which model of inheritance is actually happening in the cell. Once in place, we envision that variant nucleosomes, marked by different PTMs, influence gene expression and nuclear architecture and, therefore, achieve persistent epigenetic memory over multiple cell generations.

Histone H3 Variants and Cell Lineage Restriction: The H3 Barcode Hypothesis

Adult mammals contain hundreds of cell types distributed among specialized tissues and organs, each with an identical DNA content. Yet, each of these cell types has a unique pattern of gene expression. In simple terms, genes behave in three ways during development: Some genes are subject to lineage-dependent activation events, such as PAX-5, PU-1, E2A, and EBF, leading to the generation of cell-type-specific precursors, in this case, B cell precursors, in the hematopoietic system (53), whereas others undergo lineage-dependent silencing events, such as X-chromosome inactivation and the silencing of embryonic genes such as Oct 4 (54). Lastly, the expression of housekeeping genes is maintained constitutively.

Stem cell and animal cloning (nuclear transfer) experiments hint that much of the molecular basis of tissue-specific gene expression and developmental potential is deeply rooted in the details of chromatin structure and epigenetic mechanisms (55). In addition, the intranuclear “architecture” of chromatin likely has a bearing on its regulation. Transcriptionally inactive genes, for example, reside in a position near the nuclear periphery (56), or interphase centromeres (57), whereas active genes are maintained near the center of nuclei. The nuclear location of genes may therefore affect their transcriptional status, and some evidence suggests that this is a dynamic process involved in cell differentiation (58).

The extent to which H3 variants factor into these events, if at all, is largely unexplored. We propose that histone H3 variants play a major role in cell differentiation and cell lineage restriction, and we put forward a speculative hypothesis, the H3 barcode hypothesis, to explain how this may occur. Our model suggests that mammals have evolved an additional way of regulating their genetic information over many cell generations. We propose that the mammalian genome is indexed by histone H3 variants (Fig. 4A) in a nonrandom fashion that reflects the assembly mechanisms and “personalized” chaperones and exchange factors described above (Fig. 3). We envision that H3.3 is incorporated into transcriptionally active regions, whereas, in contrast, H3.2 is deposited in transcriptionally silent areas that can be reversibly activated, depending on cellular needs (facultative heterochromatin). In our model, H3.1 might then be localized to genes that are constitutively silent or to genomic loci containing little or no apparent protein-coding information, whereas CENP-A is localized to highly specialized centromeres (Fig. 4A). If correct, we envision that this barcoding of genomic DNA with histone H3 variants would be subjected to change during stem cell differentiation, when chromatin-remodeling events take place. We further speculate that these remodeling pathways impart a memory to cell lineage-dependent gene expression in light of the epigenetic inheritance models presented above.

Fig. 4. The H3 barcode model to index genomic information and ensure epigenetic memory. (A) Theoretical visualization of H3 variants in two chromosomes (1, 2) of A and B cell types shows different banding patterns (white with blue dots, H3.1; blue with white dots, H3.2; blue, H3.3). This H3 variant barcode differs from chromosome to chromosome and cell type to cell type. In this model, H3.1 localizes to constitutive heterochromatin, H3.2 to facultative heterochromatin, and H3.3 to euchromatin. (B) Graphical combination of the three biological codes: the genetic code, the H3 barcode, and the histone code. DNA contains genetic information in the form of genes (white boxes) that have to be activated or silenced at appropriate times and noncoding regions, such as centromeres, telomeres, and satellites (dotted line). Actively transcribed genes contain H3.3 (blue gradient circle) in their chromatin, whereas silenced genes have H3.2 (blue circle with white dots) incorporated. A majority of the DNA either contains no meaningful genetic information or comprises genes that are constitutively silent. These genomic regions are indexed by the presence of H3.1 (white circles with blue dots) in the chromatin. The next regulatory step to ensure proper gene expression is the regulation of genes with posttranslational histone modifications (green flag, activation PTMs; red and orange flags, different silencing PTMs). We propose that short-term alterations in gene expression are achieved by the employment of specialized PTMs (e.g., acetylation), but long-term establishment (epigenetic memory) of gene expression involves more stable histone modifications as well as the incorporation of the appropriate histone H3 variants.

In considering the H3 barcode hypothesis, we propose that patterning of histone PTMs would serve to regulate the immediate responses of genes to external stimuli and maintain networks of gene expression or silencing over short developmental time periods (Fig. 4B). Our hypothesis suggests that genes are switched “on” or “off” according to their PTM pattern during one cell cycle. For long-term memory (many cell generations) of a cell’s particular “epigenetic state,” however, we propose that the selective incorporation of histone H3 variants into various chromosomal domains plays a role in establishing gene-expression profiles exhibited by a particular cell type at a particular point in time. In support of this hypothesis, Loppin and coworkers (59) have recently suggested that H3.3 is incorporated by the HIRA chaperone into the chromatin of the male pronucleus in Drosophila, thereby replacing protamines and leading to sperm nucleus decondensation. In the mouse, H3.1 is absent in the male pronucleus, which is largely decondensed (60), a finding also consistent with our hypothesis. In addition, a recent study from Felsenfeld and colleagues (61) showed that, in chicken erythroid cells, exogenous H3.3 expression resulted in increased expression of folate receptor and VEGF-D genes, whereas H3 (H3.2) caused decreased expression of these genes, therefore implying a difference in function for H3 (H3.2) and H3.3. All of the above studies support the general notion that H3.3 is associated with decondensed open chromatin, whereas mammalian H3.1 and chicken H3.2 mark heterochromatin that is in a “closed” state. An important feature of our hypothesis is that chromatin structures change during cell differentiation through the selective incorporation of different histone H3 variants.

By combining all of the above ideas and models, we propose that it should be possible to distinguish cell types by the genomic localization of H3.1, H3.2, and H3.3, producing a pattern or barcode of staining along chromosomal regions much like characteristic band/interband regions of Drosophila polytene chromosomes (Fig. 4A). In this speculative model, chromosomes from cell type A contain H3 variants in different genomic loci than chromosomes from cell type B, because different sets of genes are activated and/or silenced by selective deposition or exchange of appropriate H3 variants. Additionally, we propose that each chromosome in any given cell type should have a different distribution of the H3 variants along its chromosome arms, outside of more constant chromosomal landmarks such as centromeres and telomeres that are also likely marked with their own H3 variant signatures (e.g., CENP-A at centromeric regions). One test of the H3 barcode hypothesis would be to display the different H3 variants with differentially marked or colored tags, asking whether a barcode pattern is revealed that differs from chromosome to chromosome and cell type to cell type. Ultimately, chromatin immunoprecipitation (ChIP) assays, combined with whole-genome microarray and tiling analyses (ChIP by chip; for one example, see ref. 62), will provide a powerful test of these ideas, when the appropriate immunological reagents become available for these highly conserved proteins. As mentioned above, histone genetics in mammalian models presents a challenge for those histone genes that are present in high copy numbers, such as H3.1 and H3.2. However, because H3.3 is encoded by only two genes in mouse and human (H3.3A and H3.3B, each encoding identical H3.3 proteins), histone genetics with this H3 variant may be possible.
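The comparison proposed above, asking whether a barcode pattern differs from chromosome to chromosome and cell type to cell type, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Barcode` type, the `differing_bins` function, the division of each chromosome into five bins, and the variant assignments are invented toy values, not measurements or a method described in this article.

```python
from typing import Dict, List

# Hypothetical representation: chromosome name -> H3 variant label per genomic bin.
Barcode = Dict[str, List[str]]


def differing_bins(a: Barcode, b: Barcode) -> Dict[str, List[int]]:
    """Return, per chromosome, the bin indices where two barcodes disagree."""
    diffs: Dict[str, List[int]] = {}
    for chrom in a:
        pairs = zip(a[chrom], b[chrom])
        diffs[chrom] = [i for i, (x, y) in enumerate(pairs) if x != y]
    return diffs


# Invented toy data (not measurements). "CENP-A" stands for the constant
# centromeric landmark that is expected to be shared between cell types.
cell_type_a: Barcode = {
    "chr1": ["H3.1", "H3.3", "CENP-A", "H3.2", "H3.3"],
    "chr2": ["H3.3", "H3.2", "CENP-A", "H3.1", "H3.1"],
}
cell_type_b: Barcode = {
    "chr1": ["H3.1", "H3.2", "CENP-A", "H3.3", "H3.3"],
    "chr2": ["H3.3", "H3.3", "CENP-A", "H3.1", "H3.1"],
}

print(differing_bins(cell_type_a, cell_type_b))
```

In this toy case the two cell types differ at bins 1 and 3 of chr1 and at bin 1 of chr2, while the centromeric bin is identical in both, mirroring the idea that landmarks stay constant and gene-containing regions are re-indexed.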

In conclusion, we speculate that at least three different biological codes, the genetic code, a PTM histone code, and an H3 barcode (and potentially other histone variant barcodes), may act together to ensure proper gene activation and silencing (Fig. 4B). We favor the view that, at least in mammalian cells, histone H3 variants index the genome as follows: H3.3 is largely, if not exclusively, associated with euchromatin, H3.2 with facultative heterochromatin, and H3.1 with constitutive heterochromatin (although we note that there might be exceptions to this rule). We propose that this barcoding of genomic regions with appropriate H3 variants ensures long-term cellular memory of the transcriptional status of genes that, in addition, can be inherited over many cell generations. On the other hand, we propose that PTMs play an important role in the maintenance of these transcriptional states and are also involved in the regulation of short-term gene expression. In this view, certain PTMs enable a cell to immediately turn specific genes on or off after the cell receives appropriate stimuli, and this switch in gene expression is formally accomplished without the exchange of one H3 variant with another. Outside of gene regulation, PTMs are likely to contribute to at least two other biological processes, deposition-related PTMs (e.g., acetylation of K5/K8/K12 on H4) (63) and, possibly, the active exchange of histone variants, a mechanism about which not much is known. Taken together, we envision that the selective employment of histone H3 variants, together with their PTM signatures, regulates gene expression by barcoding the genome according to specific functions: H3.3, euchromatin; H3.2, facultative heterochromatin; H3.1, constitutive heterochromatin. However, many questions remain to be answered.

One specific question is how the H3 barcode and the histone code are connected or how different H3 variants become associated with distinct PTMs in the first place. One possibility is that the distinct H3 variants, through their ability to differentially regulate nucleosome stability, control the precise higher-order folding of chromatin that then makes these fibers suitable substrates for the appropriate modifying enzymes. For example, H3.3-containing nucleosomes may be less stable, thereby keeping chromatin fibers in a somewhat unfolded, but precise, state. These more open fibers may be the preferred substrates for activating enzymes (such as MLL/Set1, the H3 K4 HMTase). In contrast, H3.1 and H3.2 may generate more stable nucleosomes (in particular H3.1 through disulfide bonds with its cysteine 96) that lead to more compacted or folded chromatin fibers that are the preferred substrates for repressing enzymes (such as Ezh2, the H3 K27 HMTase). Thus, the precise chromatin structure (or fibers) these variants create and also their localization in the nuclear architecture may be, in part, the reason why they are modified in different ways with PTMs.

Consistent with the H3 barcode hypothesis, the first layer of chromatin organization (and epigenetic memory) would be dictated by the particular histone variant, whereas the potential actions of a specific modifying enzyme(s) depend, in part, on the unique structure of the chromatin fiber that the variant generates. In addition, DNA-binding transcriptional activators or repressors that recognize unique chromatin structures might recruit the appropriate enzyme(s) and, thereby, prevent inappropriate marks and create the final biological effect. In support, a subpopulation of H3.3 is phosphorylated during mitosis at its unique S31 (64). Also, nucleosomes containing H2A.Z are poor substrates for certain histone-modifying enzymes (65).

Finally, for the H3 barcode to be functional, it must have a cellular reader that interprets or scans the proposed patterns of H3 variant stripes in their entirety (66). Although such a reader(s) has yet to be identified, we suspect that PTMs, carried by the H3 variants, will hold some clues, if indeed such readers exist. We look forward to experimental tests of this hypothesis and extensions of it in the years to come.

Acknowledgments

We thank all members of the Allis laboratory for insightful discussions. We especially thank E. Bernstein, A. Goldberg, C. Janzen, T. Milne, and J. Wysocka for critical review of the manuscript. Valuable input was also provided by A. Annunziato, B. Strahl, and M. Smith before the submission of this article. This work was supported by National Institutes of Health MERIT Award GM 53512 (to C.D.A.) and The Rockefeller University’s Women and Science Fellowship Program (S.B.H.).

Abbreviations

LBR, lamin B receptor
PTM, posttranslational modification
RD, replication-dependent
RI, replication-independent

Conflict of interest statement: No conflicts declared.

See accompanying Profile on page 6425.

References

1. Strahl B. D., Allis C. D. Nature. 2000;403:41–45.
2. Fischle W., Wang Y., Allis C. D. Curr. Opin. Cell Biol. 2003;15:172–183.
3. Fischle W., Wang Y., Allis C. D. Nature. 2003;425:475–479.
4. Fischle W., Tseng B. S., Dormann H. L., Ueberheide B. M., Garcia B. A., Shabanowitz J., Hunt D. F., Funabiki H., Allis C. D. Nature. 2005;438:1116–1122.
5. Hirota T., Lipp J. J., Toh B. H., Peters J. M. Nature. 2005;438:1176–1180.
6. Schreiber S. L., Bernstein B. E. Cell. 2002;111:771–778.
7. Kurdistani S. K., Grunstein M. Nat. Rev. Mol. Cell Biol. 2003;4:276–284.
8. Iizuka M., Smith M. M. Curr. Opin. Genet. Dev. 2003;13:154–160.
9. Pusarla R. H., Bhargava P. FEBS J. 2005;272:5149–5168.
10. Isenberg I. Annu. Rev. Biochem. 1979;48:159–191.
11. Doenecke D., Albig W., Bode C., Drabent B., Franke K., Gavenis K., Witt O. Histochem. Cell Biol. 1997;107:1–10.
12. Smith M. M. Curr. Opin. Cell Biol. 2002;14:279–285.
13. Sullivan K. F., Hechenberger M., Masri K. J. Cell Biol. 1994;127:581–592.
14. Avramova Z. V. Plant Physiol. 2002;129:40–49.
15. Pidoux A., Mellone B., Allshire R. Methods. 2004;33:252–259.
16. Black B. E., Foltz D. R., Chakravarthy S., Luger K., Woods V. L., Jr., Cleveland D. W. Nature. 2004;430:578–582.
17. Ahmad K., Henikoff S. Mol. Cell. 2002;9:1191–1200.
18. Tagami H., Ray-Gallet D., Almouzni G., Nakatani Y. Cell. 2004;116:51–61.
19. McKittrick E., Gafken P. R., Ahmad K., Henikoff S. Proc. Natl. Acad. Sci. USA. 2004;101:1525–1530.
20. Johnson L., Mollah S., Garcia B. A., Muratore T. L., Shabanowitz J., Hunt D. F., Jacobsen S. E. Nucleic Acids Res. 2004;32:6511–6518.
21. Chow C. M., Georgiou A., Szutorisz H., Maia E. S. A., Pombo A., Barahona I., Dargelos E., Canzonetta C., Dillon N. EMBO Rep. 2005;6:354–360.
22. Ahmad K., Henikoff S. Proc. Natl. Acad. Sci. USA. 2002;99:16477–16484.
23. Hake S. B., Garcia B. A., Duncan E. M., Kauer M., Dellaire G., Shabanowitz J., Bazett-Jones D. P., Allis C. D., Hunt D. F. J. Biol. Chem. 2006;281:559–568.
24. Ramaswamy A., Bahar I., Ioshikhes I. Proteins. 2005;58:683–696.
25. Doolittle R. F. Trends Biochem. Sci. 1989;14:244–245.
26. Luger K., Mader A. W., Richmond R. K., Sargent D. F., Richmond T. J. Nature. 1997;389:251–260.
27. Banks D. D., Gloss L. M. Protein Sci. 2004;13:1304–1316.
28. Camerini-Otero R. D., Felsenfeld G. Proc. Natl. Acad. Sci. USA. 1977;74:5519–5523.
29. Chen T. A., Smith M. M., Le S. Y., Sternglanz R., Allfrey V. G. J. Biol. Chem. 1991;266:6489–6498.
30. Matthews J. R., Wakasugi N., Virelizier J. L., Yodoi J., Hay R. T. Nucleic Acids Res. 1992;20:3821–3830.
31. Mitomo K., Nakayama K., Fujimoto K., Sun X., Seki S., Yamamoto K. Gene. 1994;145:197–203.
32. Galang C. K., Hauser C. A. Mol. Cell. Biol. 1993;13:4609–4617.
33. Marsh W. H., Ord M. G., Stocken L. A. Biochem. J. 1964;93:539–544.
34. Ord M. G., Stocken L. A. Biochem. J. 1966;98:888–897.
35. Prior C. P., Cantor C. R., Johnson E. M., Littau V. C., Allfrey V. G. Cell. 1983;34:1033–1042.
36. Johnson E. M., Sterner R., Allfrey V. G. J. Biol. Chem. 1987;262:6943–6946.
37. Bryan S. E., Lambert C., Hardy K. J., Simons S. Science. 1974;186:832–833.
38. Rozalski M., Wierzbicki R. Biochem. Pharmacol. 1983;32:2124–2126.
39. Makatsori D., Kourmouli N., Polioudaki H., Shultz L. D., McLean K., Theodoropoulos P. A., Singh P. B., Georgatos S. D. J. Biol. Chem. 2004;279:25567–25573.
40. Polioudaki H., Kourmouli N., Drosou V., Bakou A., Theodoropoulos P. A., Singh P. B., Giannakouros T., Georgatos S. D. EMBO Rep. 2001;2:920–925.
41. Lachner M., O’Carroll D., Rea S., Mechtler K., Jenuwein T. Nature. 2001;410:116–120.
42. Ringrose L., Paro R. Annu. Rev. Genet. 2004;38:413–443.
43. Annunziato A. T. J. Biol. Chem. 2005;280:12065–12068.
44. Henikoff S., Furuyama T., Ahmad K. Trends Genet. 2004;20:320–326.
45. McNairn A. J., Gilbert D. M. BioEssays. 2003;25:647–656.
46. Leffak M. Biochemistry. 1988;27:686–691.
47. Annunziato A. T., Seale R. L. Nucleic Acids Res. 1984;12:6179–6196.
48. Baxevanis A. D., Godfrey J. E., Moudrianakis E. N. Biochemistry. 1991;30:8817–8823.
49. Loyola A., Almouzni G. Biochim. Biophys. Acta. 2004;1677:3–11.
50. Adams C. R., Kamakaka R. T. Curr. Opin. Genet. Dev. 1999;9:185–190.
51. Bruno M., Flaus A., Stockdale C., Rencurel C., Ferreira H., Owen-Hughes T. Mol. Cell. 2003;12:1599–1606.
52. Mizuguchi G., Shen X., Landry J., Wu W. H., Sen S., Wu C. Science. 2004;303:343–348.
53. Singh H., Medina K. L., Pongubala J. M. Proc. Natl. Acad. Sci. USA. 2005;102:4949–4953.
54. Workman J. L., Kingston R. E. Annu. Rev. Biochem. 1998;67:545–579.
55. Ng R. K., Gurdon J. B. Cell Cycle. 2005;4:760–763.
56. Andrulis E. D., Neiman A. M., Zappulla D. C., Sternglanz R. Nature. 1998;394:592–595.
57. Maison C., Bailly D., Peters A. H., Quivy J. P., Roche D., Taddei A., Lachner M., Jenuwein T., Almouzni G. Nat. Genet. 2002;30:329–334.
58. Rasmussen T. P. Reprod. Biol. Endocrinol. 2003;1:100.
59. Loppin B., Bonnefoy E., Anselme C., Laurencon A., Karr T. L., Couble P. Nature. 2005;437:1386–1390.
60. van der Heijden G. W., Dieker J. W., Derijck A. A., Muller S., Berden J. H., Braat D. D., van der Vlag J., de Boer P. Mech. Dev. 2005;122:1008–1022.
61. Jin C., Felsenfeld G. Proc. Natl. Acad. Sci. USA. 2006;103:574–579.
62. Mito Y., Henikoff J. G., Henikoff S. Nat. Genet. 2005;37:1090–1097.
63. Ma X. J., Wu J., Altheim B. A., Schultz M. C., Grunstein M. Proc. Natl. Acad. Sci. USA. 1998;95:6693–6698.
64. Hake S. B., Garcia B. A., Kauer M., Baker S. P., Shabanowitz J., Hunt D. F., Allis C. D. Proc. Natl. Acad. Sci. USA. 2005;102:6344–6349.
65. Li B., Pattenden S. G., Lee D., Gutierrez J., Chen J., Seidel C., Gerton J., Workman J. L. Proc. Natl. Acad. Sci. USA. 2005;102:18385–18390.
66. Henikoff S. Proc. Natl. Acad. Sci. USA. 2005;102:5308–5309.
