An expansive human regulatory lexicon encoded in transcription factor footprints
- PMID: 22955618
- PMCID: PMC3736582
- DOI: 10.1038/nature11212
Abstract
Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f1.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f2.gif)
![Figure 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f3.gif)
![Figure 4](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f4.gif)
![Figure 5](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f5.gif)
![Figure 6](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f6.gif)
![Figure 7](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/3736582/bin/nihms376811f7.gif)
Comment in
- Genomics: users' guide to the human genome. Nat Rev Genet. 2012 Oct;13(10):678. doi: 10.1038/nrg3329. Epub 2012 Sep 7. PMID: 22955793.