Nucleic Acids Res. 2018 Jan 4;46(D1):D794-D801.
doi: 10.1093/nar/gkx1081.

The Encyclopedia of DNA elements (ENCODE): data portal update

Carrie A Davis et al. Nucleic Acids Res. 2018.

Abstract

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center has developed the ENCODE Portal database and website as the source for the data and metadata generated by the ENCODE Consortium. Two principles have motivated the design. First, experimental protocols, analytical procedures and the data themselves should be made publicly accessible through a coherent, web-based search and download interface. Second, the same interface should serve carefully curated metadata that record the provenance of the data and justify their interpretation in biological terms. Since its initial release in 2013, the Portal has been regularly updated to better reflect these design principles, in response to recommendations from consortium members and the wider community of scientists who use it to access ENCODE data. Here we report on these updates, including results from new experiments, uniformly processed data from other projects, new visualization tools and more comprehensive metadata to describe experiments and analyses. Additionally, the Portal is now home to (meta)data from related projects including Genomics of Gene Regulation, Roadmap Epigenome Project, Model organism ENCODE (modENCODE) and modERN. The Portal now makes available over 13000 datasets and their accompanying metadata and can be accessed at: https://www.encodeproject.org/.

Figures

Figure 1.
The Portal Landing Page. The Portal landing page is arranged to make it easy to navigate and dig into particular data types. In addition to the drop-down menu bar on top, there is a keyword search function through which genes, cell lines and other ontological terms can be queried and the datasets scored with those properties retrieved. Clickable bars allow the user to navigate to data derived from a particular organism, sample type, project and/or assay. A quick link to the Data Matrix is also available.
Figure 2.
The Data Matrix. Screenshot of the Data Matrix. To the left of the matrix are faceted search bars that can be used to positively select (blue) and negatively select (red) certain data types on several metadata properties. Each cell of the matrix holds a count (with links) of the datasets generated on a given sample type by a given assay. Clicking on the count takes the user to a list of the datasets, as does clicking on the ‘list’ icon. Metadata for the data displayed in the Data Matrix can be obtained by clicking on the ‘report’ icon and downloaded in .tsv format.
Figure 3.
Metadata Download. Illustration of how the ‘reports’ tab can be used to select various metadata properties for data subsets and download them as a table.
Figure 4.
Audits and Badges. How the automated audits and the resulting badges are applied to the datasets. This example shows an experiment as you would see it in the Portal, with its list of color-coded badges. Clicking on the (+) sign displays a drop-down menu with additional details about each badge and, where available, links to the standards and pipelines from which they were determined. Red = a critical issue was identified in the data, orange = a moderate issue was identified in the data, yellow = a mild issue was identified in the data. Additionally, the badge counts are shown under each experiment when displayed in the list-view. The badges have also been incorporated into the faceted search feature on the left side, so that users can filter the data by quality assessment.
Figure 5.
Pipelines, Provenance and Quality Checks. (A) The File Association Graph for experiment ENCSR503VTG is shown. Files and processes specific to each replicate are shown in the green shaded area. Files and processes that are either independent of the replicates (genome assembly) or derived from a union of the replicates are shown on the white background. Yellow ovals correspond to accessioned files at the DCC, and their accession numbers are all shown. When quality control is run on a given file, a green circle (B) appears and takes users to the quality metrics (C and D). Blue ovals indicate a process done to transform (map, sort, score, etc.) the individual files.
Figure 6.
Genome Browsing. The ‘Visualize’ link on the experiment pages opens a pop-up menu from which the user can select their preferred browser (UCSC, ENSEMBL or BioDalliance) and the assembly in which to visualize the data.
Figure 7.
Antibody characterizations. Example of an Antibody Characterization page for antibody lot ENCAB615WUN. Antibody lot pages can be found via the drop-down menu. For each lot, various metadata are displayed, including the lot number, product, ID, host, targets, etc. At the bottom of the page, the user will find links to experiments done using that lot. Different antibody lots can be found by navigating the faceted search bar on the left. The classification of each lot and its eligibility status for data collection are displayed using a colored dot, the legend of which is shown. Molecular and biochemical evidence gathered by the labs to test individual lots in various sample types is displayed as well.
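
The positive and negative facet selections described for the Data Matrix (Figure 2) and the .tsv report download (Figure 3) can also be written as a URL query. The Python sketch below is illustrative only and makes no network request. The helper name build_query_url, the /report.tsv path, the property names (type, assay_title, status) and the property!=value form for a negative selection are assumptions that do not come from the text above; only the Portal address does.

```python
from urllib.parse import urlencode

PORTAL = "https://www.encodeproject.org"  # Portal address given in the abstract


def build_query_url(path, include, exclude=None):
    """Build a Portal URL from positive and negative facet selections.

    include and exclude map a metadata property to a list of values.
    Assumption: a negative selection is written as property!=value.
    """
    pairs = []
    for prop, values in include.items():
        pairs.extend((prop, value) for value in values)
    for prop, values in (exclude or {}).items():
        # urlencode joins key and value with "=", so the "!" rides on the key
        pairs.extend((prop + "!", value) for value in values)
    # safe="!" keeps the "!" readable instead of percent-encoding it
    return f"{PORTAL}{path}?{urlencode(pairs, safe='!')}"


# Hypothetical selection: ChIP-seq experiments, excluding revoked ones
url = build_query_url(
    "/report.tsv",
    include={"type": ["Experiment"], "assay_title": ["ChIP-seq"]},
    exclude={"status": ["revoked"]},
)
print(url)
```

Keeping the selections in two dictionaries mirrors the blue (include) and red (exclude) facet states in the interface, and the same helper can target a different path if the assumed one does not apply.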
