Database Commons

a catalog of biological databases

e.g., animal; RNA; Methylation; China

Database Profile

General information

Full name: Encyclopedia of DNA Elements
Description: The ENCODE Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome,including elements that act at the protein and RNA levels,and regulatory elements that control cells and circumstances in which a gene is active.
Year founded: 2004
Last update: 2017-02-03
Real time : Checking...
Country/Region: United States
Data type:
Data object:
Database category:
Major organism:

Contact information

University/Institution: Stanford University
Address: Department of Genetics
City: Stanford
Province/State: CA
Country/Region: United States
Contact name (PI/Team): ENCODE help
Contact email (PI/Helpdesk):


Expanded encyclopaedias of DNA elements in the human and mouse genomes. [PMID: 32728249]
null null, Jill E Moore, Michael J Purcaro, Henry E Pratt, Charles B Epstein, Noam Shoresh, Jessika Adrian, Trupti Kawli, Carrie A Davis, Alexander Dobin, Rajinder Kaul, Jessica Halow, Eric L Van Nostrand, Peter Freese, David U Gorkin, Yin Shen, Yupeng He, Mark Mackiewicz, Florencia Pauli-Behn, Brian A Williams, Ali Mortazavi, Cheryl A Keller, Xiao-Ou Zhang, Shaimae I Elhajjajy, Jack Huey, Diane E Dickel, Valentina Snetkova, Xintao Wei, Xiaofeng Wang, Juan Carlos Rivera-Mulia, Joel Rozowsky, Jing Zhang, Surya B Chhetri, Jialing Zhang, Alec Victorsen, Kevin P White, Axel Visel, Gene W Yeo, Christopher B Burge, Eric Lécuyer, David M Gilbert, Job Dekker, John Rinn, Eric M Mendenhall, Joseph R Ecker, Manolis Kellis, Robert J Klein, William S Noble, Anshul Kundaje, Roderic Guigó, Peggy J Farnham, J Michael Cherry, Richard M Myers, Bing Ren, Brenton R Graveley, Mark B Gerstein, Len A Pennacchio, Michael P Snyder, Bradley E Bernstein, Barbara Wold, Ross C Hardison, Thomas R Gingeras, John A Stamatoyannopoulos, Zhiping Weng

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (, including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.

Nature. 2020:583(7818) | 4 Citations (from Europe PMC, 2020-12-05)
The Encyclopedia of DNA elements (ENCODE): data portal update. [PMID: 29126249]
Davis CA, Hitz BC, Sloan CA, Chan ET, Davidson JM, Gabdank I, Hilton JA, Jain K, Baymuradov UK, Narayanan AK, Onate KC, Graham K, Miyasato SR, Dreszer TR, Strattan JS, Jolanki O, Tanaka FY, Cherry JM.

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center has developed the ENCODE Portal database and website as the source for the data and metadata generated by the ENCODE Consortium. Two principles have motivated the design. First, experimental protocols, analytical procedures and the data themselves should be made publicly accessible through a coherent, web-based search and download interface. Second, the same interface should serve carefully curated metadata that record the provenance of the data and justify its interpretation in biological terms. Since its initial release in 2013 and in response to recommendations from consortium members and the wider community of scientists who use the Portal to access ENCODE data, the Portal has been regularly updated to better reflect these design principles. Here we report on these updates, including results from new experiments, uniformly-processed data from other projects, new visualization tools and more comprehensive metadata to describe experiments and analyses. Additionally, the Portal is now home to meta(data) from related projects including Genomics of Gene Regulation, Roadmap Epigenome Project, Model organism ENCODE (modENCODE) and modERN. The Portal now makes available over 13000 datasets and their accompanying metadata and can be accessed at:

Nucleic Acids Res. 2018:46(D1) | 142 Citations (from Europe PMC, 2020-12-05)
ENCODE data at the ENCODE portal. [PMID: 26527727]
Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, Gabdank I, Narayanan AK, Ho M, Lee BT, Rowe LD, Dreszer TR, Roe G, Podduturi NR, Tanaka F, Hong EL, Cherry JM.

The Encyclopedia of DNA Elements (ENCODE) Project is in its third phase of creating a comprehensive catalog of functional elements in the human genome. This phase of the project includes an expansion of assays that measure diverse RNA populations, identify proteins that interact with RNA and DNA, probe regions of DNA hypersensitivity, and measure levels of DNA methylation in a wide range of cell and tissue types to identify putative regulatory elements. To date, results for almost 5000 experiments have been released for use by the scientific community. These data are available for searching, visualization and download at the new ENCODE Portal ( The revamped ENCODE Portal provides new ways to browse and search the ENCODE data based on the metadata that describe the assays as well as summaries of the assays that focus on data provenance. In addition, it is a flexible platform that allows integration of genomic data from multiple projects. The portal experience was designed to improve access to ENCODE data by relying on metadata that allow reusability and reproducibility of the experiments. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2016:44(D1) | 144 Citations (from Europe PMC, 2020-12-05)
Principles of metadata organization at the ENCODE data coordination center. [PMID: 26980513]
Hong EL, Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, Gabdank I, Narayanan AK, Ho M, Lee BT, Rowe LD, Dreszer TR, Roe GR, Podduturi NR, Tanaka F, Hilton JA, Cherry JM.

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal ( Database URL: © The Author(s) 2016. Published by Oxford University Press.

Database (Oxford). 2016:2016() | 9 Citations (from Europe PMC, 2020-12-05)
Resources for the Comprehensive Discovery of Functional RNA Elements. [PMID: 26990993]
Sundararaman B, Zhan L, Blue SM, Stanton R, Elkins K, Olson S, Wei X, Van Nostrand EL, Pratt GA, Huelga SC, Smalec BM, Wang X, Hong EL, Davidson JM, Lécuyer E, Graveley BR, Yeo GW.

Transcriptome-wide maps of RNA binding protein (RBP)-RNA interactions by immunoprecipitation (IP)-based methods such as RNA IP (RIP) and crosslinking and IP (CLIP) are key starting points for evaluating the molecular roles of the thousands of human RBPs. A significant bottleneck to the application of these methods in diverse cell lines, tissues, and developmental stages is the availability of validated IP-quality antibodies. Using IP followed by immunoblot assays, we have developed a validated repository of 438 commercially available antibodies that interrogate 365 unique RBPs. In parallel, 362 short-hairpin RNA (shRNA) constructs against 276 unique RBPs were also used to confirm specificity of these antibodies. These antibodies can characterize subcellular RBP localization. With the burgeoning interest in the roles of RBPs in cancer, neurobiology, and development, these resources are invaluable to the broad scientific community. Detailed information about these resources is publicly available at the ENCODE portal (

Mol Cell. 2016:61(6) | 45 Citations (from Europe PMC, 2020-12-05)
Using the ENCODE Resource for Functional Annotation of Genetic Variants. [PMID: 25762420]
Pazin MJ.

This article illustrates the use of the Encyclopedia of DNA Elements (ENCODE) resource to generate or refine hypotheses from genomic data on disease and other phenotypic traits. First, the goals and history of ENCODE and related epigenomics projects are reviewed. Second, the rationale for ENCODE and the major data types used by ENCODE are briefly described, as are some standard heuristics for their interpretation. Third, the use of the ENCODE resource is examined. Standard use cases for ENCODE, accessing the ENCODE resource, and accessing data from related projects are discussed. Although the focus of this article is the use of ENCODE data, some of the same approaches can be used with data from other projects.

Cold Spring Harb Protoc. 2015:2015(6) | 15 Citations (from Europe PMC, 2020-12-05)
Ontology application and use at the ENCODE DCC. [PMID: 25776021]
Malladi VS, Erickson DT, Podduturi NR, Rowe LD, Chan ET, Davidson JM, Hitz BC, Ho M, Lee BT, Miyasato S, Roe GR, Simison M, Sloan CA, Strattan JS, Tanaka F, Kent WJ, Cherry JM, Hong EL.

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory network and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All ENCODE experimental data, metadata and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage and distribution to community resources and the scientific community. As the volume of data increases, the organization of experimental details becomes increasingly complicated and demands careful curation to identify related experiments. Here, we describe the ENCODE DCC's use of ontologies to standardize experimental metadata. We discuss how ontologies, when used to annotate metadata, provide improved searching capabilities and facilitate the ability to find connections within a set of experiments. Additionally, we provide examples of how ontologies are used to annotate ENCODE metadata and how the annotations can be identified via ontology-driven searches at the ENCODE portal. As genomic datasets grow larger and more interconnected, standardization of metadata becomes increasingly vital to allow for exploration and comparison of data between different scientific projects.

Database (Oxford). 2015:2015() | 10 Citations (from Europe PMC, 2020-12-05)
ENCODE data in the UCSC Genome Browser: year 5 update. [PMID: 23193274]
Rosenbloom KR, Sloan CA, Malladi VS, Dreszer TR, Learned K, Kirkup VM, Wong MC, Maddren M, Fang R, Heitner SG, Lee BT, Barber GP, Harte RA, Diekhans M, Long JC, Wilder SP, Zweig AS, Karolchik D, Kuhn RM, Haussler D, Kent WJ.

The Encyclopedia of DNA Elements (ENCODE),, has completed its fifth year of scientific collaboration to create a comprehensive catalog of functional elements in the human genome, and its third year of investigations in the mouse genome. Since the last report in this journal, the ENCODE human data repertoire has grown by 898 new experiments (totaling 2886), accompanied by a major integrative analysis. In the mouse genome, results from 404 new experiments became available this year, increasing the total to 583, collected during the course of the project. The University of California, Santa Cruz, makes this data available on the public Genome Browser for visual browsing and data mining. Download of raw and processed data files are all supported. The ENCODE portal provides specialized tools and information about the ENCODE data sets.

Nucleic Acids Res. 2013:41(Database issue) | 450 Citations (from Europe PMC, 2020-12-05)
ENCODE whole-genome data in the UCSC Genome Browser: update 2012. [PMID: 22075998]
Rosenbloom KR, Dreszer TR, Long JC, Malladi VS, Sloan CA, Raney BJ, Cline MS, Karolchik D, Barber GP, Clawson H, Diekhans M, Fujita PA, Goldman M, Gravell RC, Harte RA, Hinrichs AS, Kirkup VM, Kuhn RM, Learned K, Maddren M, Meyer LR, Pohl A, Rhead B, Wong MC, Zweig AS, Haussler D, Kent WJ.

The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser ( and on an early-access Preview Browser ( Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal ( provides specialized tools and information about the ENCODE data sets.

Nucleic Acids Res. 2012:40(Database issue) | 150 Citations (from Europe PMC, 2020-12-05)
An encyclopedia of mouse DNA elements (Mouse ENCODE). [PMID: 22889292]
Mouse ENCODE Consortium, Stamatoyannopoulos JA, Snyder M, Hardison R, Ren B, Gingeras T, Gilbert DM, Groudine M, Bender M, Kaul R, Canfield T, Giste E, Johnson A, Zhang M, Balasundaram G, Byron R, Roach V, Sabo PJ, Sandstrom R, Stehling AS, Thurman RE, Weissman SM, Cayting P, Hariharan M, Lian J, Cheng Y, Landt SG, Ma Z, Wold BJ, Dekker J, Crawford GE, Keller CA, Wu W, Morrissey C, Kumar SA, Mishra T, Jain D, Byrska-Bishop M, Blankenberg D, Lajoie BR, Jain G, Sanyal A, Chen KB, Denas O, Taylor J, Blobel GA, Weiss MJ, Pimkin M, Deng W, Marinov GK, Williams BA, Fisher-Aylor KI, Desalvo G, Kiralusha A, Trout D, Amrhein H, Mortazavi A, Edsall L, McCleary D, Kuan S, Shen Y, Yue F, Ye Z, Davis CA, Zaleski C, Jha S, Xue C, Dobin A, Lin W, Fastuca M, Wang H, Guigo R, Djebali S, Lagarde J, Ryba T, Sasaki T, Malladi VS, Cline MS, Kirkup VM, Learned K, Rosenbloom KR, Kent WJ, Feingold EA, Good PJ, Pazin M, Lowdon RF, Adams LB.

To complement the human Encyclopedia of DNA Elements (ENCODE) project and to enable a broad range of mouse genomics efforts, the Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome.

Genome Biol. 2012:13(8) | 258 Citations (from Europe PMC, 2020-12-05)
ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. [PMID: 22955991]
Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, Chen Y, DeSalvo G, Epstein C, Fisher-Aylor KI, Euskirchen G, Gerstein M, Gertz J, Hartemink AJ, Hoffman MM, Iyer VR, Jung YL, Karmakar S, Kellis M, Kharchenko PV, Li Q, Liu T, Liu XS, Ma L, Milosavljevic A, Myers RM, Park PJ, Pazin MJ, Perry MD, Raha D, Reddy TE, Rozowsky J, Shoresh N, Sidow A, Slattery M, Stamatoyannopoulos JA, Tolstorukov MY, White KP, Xi S, Farnham PJ, Lieb JD, Wold BJ, Snyder M.

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE ( and modENCODE ( portals.

Genome Res. 2012:22(9) | 718 Citations (from Europe PMC, 2020-12-05)
An integrated encyclopedia of DNA elements in the human genome. [PMID: 22955616]
ENCODE Project Consortium.

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Nature. 2012:489(7414) | 6917 Citations (from Europe PMC, 2020-12-05)
ENCODE whole-genome data in the UCSC genome browser (2011 update). [PMID: 21037257]
Raney BJ, Cline MS, Rosenbloom KR, Dreszer TR, Learned K, Barber GP, Meyer LR, Sloan CA, Malladi VS, Roskin KM, Suh BB, Hinrichs AS, Clawson H, Zweig AS, Kirkup V, Fujita PA, Rhead B, Smith KE, Pohl A, Kuhn RM, Karolchik D, Haussler D, Kent WJ.

The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser ( ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC ( provides information about the ENCODE data and links for access.

Nucleic Acids Res. 2011:39(Database issue) | 125 Citations (from Europe PMC, 2020-12-05)
A user's guide to the encyclopedia of DNA elements (ENCODE). [PMID: 21526222]
ENCODE Project Consortium.

The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.

PLoS Biol. 2011:9(4) | 829 Citations (from Europe PMC, 2020-12-05)
ENCODE whole-genome data in the UCSC Genome Browser. [PMID: 19920125]
Rosenbloom KR, Dreszer TR, Pheasant M, Barber GP, Meyer LR, Pohl A, Raney BJ, Wang T, Hinrichs AS, Zweig AS, Fujita PA, Learned K, Rhead B, Smith KE, Kuhn RM, Karolchik D, Haussler D, Kent WJ.

The Encyclopedia of DNA Elements (ENCODE) project is an international consortium of investigators funded to analyze the human genome with the goal of producing a comprehensive catalog of functional elements. The ENCODE Data Coordination Center at The University of California, Santa Cruz (UCSC) is the primary repository for experimental results generated by ENCODE investigators. These results are captured in the UCSC Genome Bioinformatics database and download server for visualization and data mining via the UCSC Genome Browser and companion tools (Rhead et al. The UCSC Genome Browser Database: update 2010, in this issue). The ENCODE web portal at UCSC ( or provides information about the ENCODE data and convenient links for access.

Nucleic Acids Res. 2010:38(Database issue) | 168 Citations (from Europe PMC, 2020-12-05)
The ENCODE Project at UC Santa Cruz. [PMID: 17166863]
Thomas DJ, Rosenbloom KR, Clawson H, Hinrichs AS, Trumbower H, Raney BJ, Karolchik D, Barber GP, Harte RA, Hillman-Jackson J, Kuhn RM, Rhead BL, Smith KE, Thakkapallayil A, Zweig AS, ENCODE Project Consortium, Haussler D, Kent WJ.

The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website ( is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.

Nucleic Acids Res. 2007:35(Database issue) | 56 Citations (from Europe PMC, 2020-12-05)
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. [PMID: 17571346]
ENCODE Project Consortium, Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SC, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermüller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaöz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, NISC Comparative Sequencing Program, Baylor College of Medicine Human Genome Sequencing Center, Washington University Genome Sequencing Center, Broad Institute, Children's Hospital Oakland Research Institute, Batzoglou S, Goldman N, Hardison RC, Haussler D, Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CW, Ng P, Shahab A, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Zhang X, Xu M, Haidar JN, Yu Y, Ruan Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PI, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrímsdóttir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VV, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ.

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

Nature. 2007:447(7146) | 2946 Citations (from Europe PMC, 2020-12-05)
The ENCODEdb portal: simplified access to ENCODE Consortium data. [PMID: 17568011]
Elnitski LL, Shah P, Moreland RT, Umayam L, Wolfsberg TG, Baxevanis AD.

The Encyclopedia of DNA Elements (ENCODE) project aims to identify and characterize all functional elements in a representative chromosomal sample comprising 1% of the human genome. Data generated by members of The ENCODE Project Consortium are housed in a number of public databases, such as the UCSC Genome Browser, NCBI's Gene Expression Omnibus (GEO), and EBI's ArrayExpress. As such, it is often difficult for biologists to gather all of the ENCODE data from a particular genomic region of interest and integrate them with relevant information found in other public databases. The ENCODEdb portal was developed to address this problem. ENCODEdb provides a unified, single point-of-access to data generated by the ENCODE Consortium, as well as to data from other source databases that lie within ENCODE regions; this provides the user a complete view of all known data in a particular region of interest. ENCODEdb Genomic Context searches allow for the retrieval of information on functional elements annotated within ENCODE regions, including mRNA, EST, and STS sequences; single nucleotide polymorphisms, and UniGene clusters. Information is also retrieved from GEO, OMIM, and major genome sequence browsers. ENCODEdb Consortium Data searches allow users to perform compound queries on array-based ENCODE data available both from GEO and from the UCSC Genome Browser. Results are retrieved from a specific genomic area of interest and can be further manipulated in a variety of contexts, including the UCSC Genome Browser and the Galaxy large-scale genome analysis platform. The ENCODEdb portal is freely accessible at

Genome Res. 2007:17(6) | 10 Citations (from Europe PMC, 2020-12-05)
The ENCODE (ENCyclopedia Of DNA Elements) Project. [PMID: 15499007]
ENCODE Project Consortium.

The ENCyclopedia Of DNA Elements (ENCODE) Project aims to identify all functional elements in the human genome sequence. The pilot phase of the Project is focused on a specified 30 megabases (approximately 1%) of the human genome sequence and is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function. The results of this pilot phase will guide future efforts to analyze the entire human genome.

Science. 2004:306(5696) | 1073 Citations (from Europe PMC, 2020-12-05)


All databases:
7/4726 (99.873%)
Gene genome and annotation:
5/1297 (99.692%)
Total Rank

Community reviews

5 Stars (1)
Data quality & quantity:
Content organization & presentation
System accessibility & reliability:

Word cloud

Record metadata

Created on: 2015-06-20
Curated by:
Ming Chen [2020-11-10]
Lina Ma [2019-07-31]
Dong Zou [2019-07-31]
Nashaiman Pervaiz [2018-12-28]
Lina Ma [2018-06-07]
zhang yang [2018-01-28]
Shixiang Sun [2017-03-07]
Lina Ma [2016-04-08]
Jian SA [2016-04-04]
Lin Liu [2016-01-29]
Lin Liu [2016-01-15]
Jian SA [2015-12-08]
Jian SA [2015-06-27]