Database Commons

a catalog of biological databases

e.g., animal; RNA; Methylation; China

Database information

modENCODE (The model organism Encyclopedia of DNA Elements)

General information

Description: The modENCODE project aims to identify all sequence-based functional elements in the C. elegans and D. melanogaster genomes. modENCODE labs submit data to the Data Coordination Center (DCC) where we organize and present the results.
Year founded: 2011
Last update: 2011-08-29
Real time : Checking...
Country/Region: United States
Data type:
Data object:
Database category:
Major organism:

Contact information

University/Institution: Lawrence Berkeley National Laboratory
Address: 1 Cyclotron Road MS64-121,Berkeley,CA 94720,USA
City: Berkeley
Province/State: CA
Country/Region: United States
Contact name (PI/Team): modENCODE help
Contact email (PI/Helpdesk):

Record metadata

Created on: 2015-06-20
Curated by:
Lina Ma [2019-07-31]
Lin Liu [2016-03-27]
Guangyu Wang [2015-06-26]


All databases:
62/4549 (98.659%)
Gene genome and annotation:
29/1211 (97.688%)
Total Rank

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation
System accessibility & reliability:

Word cloud


Lessons from modENCODE. [PMID: 26133010]
Brown JB, Celniker SE.

The modENCODE (Model Organism Encyclopedia of DNA Elements) Consortium aimed to map functional elements-including transcripts, chromatin marks, regulatory factor binding sites, and origins of DNA replication-in the model organisms Drosophila melanogaster and Caenorhabditis elegans. During its five-year span, the consortium conducted more than 2,000 genome-wide assays in developmentally staged animals, dissected tissues, and homogeneous cell lines. Analysis of these data sets provided foundational insights into genome, epigenome, and transcriptome structure and the evolutionary turnover of regulatory pathways. These studies facilitated a comparative analysis with similar data types produced by the ENCODE Consortium for human cells. Genome organization differs drastically in these distant species, and yet quantitative relationships among chromatin state, transcription, and cotranscriptional RNA processing are deeply conserved. Of the many biological discoveries of the modENCODE Consortium, we highlight insights that emerged from integrative studies. We focus on operational and scientific lessons that may aid future projects of similar scale or aims in other, emerging model systems.

Annu Rev Genomics Hum Genet. 2015:16() | 15 Citations (from Europe PMC, 2020-02-08)
Navigating and mining modENCODE data. [PMID: 24636835]
Boley N, Wan KH, Bickel PJ, Celniker SE.

modENCODE was a 5year NHGRI funded project (2007-2012) to map the function of every base in the genomes of worms and flies characterizing positions of modified histones and other chromatin marks, origins of DNA replication, RNA transcripts and the transcription factor binding sites that control gene expression. Here we describe the Drosophila modENCODE datasets and how best to access and use them for genome wide and individual gene studies.

Methods. 2014:68(1) | 7 Citations (from Europe PMC, 2020-02-08)
Comparative validation of the D. melanogaster modENCODE transcriptome annotation. [PMID: 24985915]
Chen ZX, Sturgill D, Qu J, Jiang H, Park S, Boley N, Suzuki AM, Fletcher AR, Plachetzki DC, FitzGerald PC, Artieri CG, Atallah J, Barmina O, Brown JB, Blankenburg KP, Clough E, Dasgupta A, Gubbala S, Han Y, Jayaseelan JC, Kalra D, Kim YA, Kovar CL, Lee SL, Li M, Malley JD, Malone JH, Mathew T, Mattiuzzo NR, Munidasa M, Muzny DM, Ongeri F, Perales L, Przytycka TM, Pu LL, Robinson G, Thornton RL, Saada N, Scherer SE, Smith HE, Vinson C, Warner CB, Worley KC, Wu YQ, Zou X, Cherbas P, Kellis M, Eisen MB, Piano F, Kionte K, Fitch DH, Sternberg PW, Cutter AD, Duff MO, Hoskins RA, Graveley BR, Gibbs RA, Bickel PJ, Kopp A, Carninci P, Celniker SE, Oliver B, Richards S.

Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.

Genome Res. 2014:24(7) | 56 Citations (from Europe PMC, 2020-02-08)
modMine: flexible access to modENCODE data. [PMID: 22080565]
Contrino S, Smith RN, Butano D, Carr A, Hu F, Lyne R, Rutherford K, Kalderimis A, Sullivan J, Carbon S, Kephart ET, Lloyd P, Stinson EO, Washington NL, Perry MD, Ruzanov P, Zha Z, Lewis SE, Stein LD, Micklem G.

In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database ( described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.

Nucleic Acids Res. 2012:40(Database issue) | 69 Citations (from Europe PMC, 2020-02-08)
The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details. [PMID: 21856757]
Washington NL, Stinson EO, Perry MD, Ruzanov P, Contrino S, Smith R, Zha Z, Lyne R, Carr A, Lloyd P, Kephart E, McKay SJ, Micklem G, Stein LD, Lewis SE.

The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at

Database (Oxford). 2011:2011() | 18 Citations (from Europe PMC, 2020-02-15)
Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. [PMID: 21177976]
Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, Yip KY, Robilotto R, Rechtsteiner A, Ikegami K, Alves P, Chateigner A, Perry M, Morris M, Auerbach RK, Feng X, Leng J, Vielle A, Niu W, Rhrissorrakrai K, Agarwal A, Alexander RP, Barber G, Brdlik CM, Brennan J, Brouillet JJ, Carr A, Cheung MS, Clawson H, Contrino S, Dannenberg LO, Dernburg AF, Desai A, Dick L, Dosé AC, Du J, Egelhofer T, Ercan S, Euskirchen G, Ewing B, Feingold EA, Gassmann R, Good PJ, Green P, Gullier F, Gutwein M, Guyer MS, Habegger L, Han T, Henikoff JG, Henz SR, Hinrichs A, Holster H, Hyman T, Iniguez AL, Janette J, Jensen M, Kato M, Kent WJ, Kephart E, Khivansara V, Khurana E, Kim JK, Kolasinska-Zwierz P, Lai EC, Latorre I, Leahey A, Lewis S, Lloyd P, Lochovsky L, Lowdon RF, Lubling Y, Lyne R, MacCoss M, Mackowiak SD, Mangone M, McKay S, Mecenas D, Merrihew G, Miller DM, Muroyama A, Murray JI, Ooi SL, Pham H, Phippen T, Preston EA, Rajewsky N, Rätsch G, Rosenbaum H, Rozowsky J, Rutherford K, Ruzanov P, Sarov M, Sasidharan R, Sboner A, Scheid P, Segal E, Shin H, Shou C, Slack FJ, Slightam C, Smith R, Spencer WC, Stinson EO, Taing S, Takasaki T, Vafeados D, Voronina K, Wang G, Washington NL, Whittle CM, Wu B, Yan KK, Zeller G, Zha Z, Zhong M, Zhou X, modENCODE Consortium, Ahringer J, Strome S, Gunsalus KC, Micklem G, Liu XS, Reinke V, Kim SK, Hillier LW, Henikoff S, Piano F, Snyder M, Stein L, Lieb JD, Waterston RH.

We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.

Science. 2010:330(6012) | 529 Citations (from Europe PMC, 2020-02-08)
Identification of functional elements and regulatory circuits by Drosophila modENCODE. [PMID: 21177974]
modENCODE Consortium, Roy S, Ernst J, Kharchenko PV, Kheradpour P, Negre N, Eaton ML, Landolin JM, Bristow CA, Ma L, Lin MF, Washietl S, Arshinoff BI, Ay F, Meyer PE, Robine N, Washington NL, Di Stefano L, Berezikov E, Brown CD, Candeias R, Carlson JW, Carr A, Jungreis I, Marbach D, Sealfon R, Tolstorukov MY, Will S, Alekseyenko AA, Artieri C, Booth BW, Brooks AN, Dai Q, Davis CA, Duff MO, Feng X, Gorchakov AA, Gu T, Henikoff JG, Kapranov P, Li R, MacAlpine HK, Malone J, Minoda A, Nordman J, Okamura K, Perry M, Powell SK, Riddle NC, Sakai A, Samsonova A, Sandler JE, Schwartz YB, Sher N, Spokony R, Sturgill D, van Baren M, Wan KH, Yang L, Yu C, Feingold E, Good P, Guyer M, Lowdon R, Ahmad K, Andrews J, Berger B, Brenner SE, Brent MR, Cherbas L, Elgin SC, Gingeras TR, Grossman R, Hoskins RA, Kaufman TC, Kent W, Kuroda MI, Orr-Weaver T, Perrimon N, Pirrotta V, Posakony JW, Ren B, Russell S, Cherbas P, Graveley BR, Lewis S, Micklem G, Oliver B, Park PJ, Celniker SE, Henikoff S, Karpen GH, Lai EC, MacAlpine DM, Stein LD, White KP, Kellis M.

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

Science. 2010:330(6012) | 650 Citations (from Europe PMC, 2020-02-08)