Database Commons

a catalog of biological databases

e.g., animal; RNA; Methylation; China

Database information

Ensembl (Ensembl)

General information

Description: Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotate genes, computes multiple alignments, predicts regulatory function and collects disease data. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
Year founded: 1999
Last update: 2019-4
Version: v96
Accessibility:
Manual:
Accessible
Real time : Checking...
Country/Region: United Kingdom
Data type:
Data object:
Database category:
Major organism:
Keywords:

Contact information

University/Institution: European Bioinformatics Institute
Address: Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
City: Cambridge
Province/State: Cambridgeshire
Country/Region: United Kingdom
Contact name (PI/Team): Paul Flicek
Contact email (PI/Helpdesk): flicek@ebi.ac.uk

Record metadata

Created on: 2015-06-20
Curated by:
Lina Ma [2019-04-18]
Lina Ma [2019-04-16]
Dong Zou [2019-01-04]
[2018-11-27]
Lina Ma [2018-06-07]
Dong Zou [2018-02-07]
Shixiang Sun [2017-02-17]
Zhang Zhang [2016-05-25]
Mengwei Li [2016-02-21]
Zhang Zhang [2016-01-18]
Lina Ma [2016-01-18]
Lin Liu [2016-01-18]
Lin Liu [2016-01-12]
Zhang Zhang [2015-12-31]
Jian SA [2015-12-11]
Jian SA [2015-06-27]

Ranking

All databases:
16/4499 (99.667%)
Gene genome and annotation:
11/1199 (99.166%)
16
Total Rank
7,902
Citations
464.824
z-index

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation
System accessibility & reliability:

Word cloud

Publications

30407521
Ensembl 2019. [PMID: 30407521]
Fiona Cunningham, Premanand Achuthan, Wasiu Akanni, James Allen, M Ridwan Amode, Irina M Armean, Ruth Bennett, Jyothish Bhai, Konstantinos Billis, Sanjay Boddu, Carla Cummins, Claire Davidson, Kamalkumar Jayantilal Dodiya, Astrid Gall, Carlos García Girón, Laurent Gil, Tiago Grego, Leanne Haggerty, Erin Haskell, Thibaut Hourlier, Osagie G Izuogu, Sophie H Janacek, Thomas Juettemann, Mike Kay, Matthew R Laird, Ilias Lavidas, Zhicheng Liu, Jane E Loveland, José C Marugán, Thomas Maurel, Aoife C McMahon, Benjamin Moore, Joannella Morales, Jonathan M Mudge, Michael Nuhn, Denye Ogeh, Anne Parker, Andrew Parton, Mateus Patricio, Ahamed Imran Abdul Salam, Bianca M Schmitt, Helen Schuilenburg, Dan Sheppard, Helen Sparrow, Eloise Stapleton, Marek Szuba, Kieron Taylor, Glen Threadgold, Anja Thormann, Alessandro Vullo, Brandon Walts, Andrea Winterbottom, Amonida Zadissa, Marc Chakiachvili, Adam Frankish, Sarah E Hunt, Myrto Kostadima, Nick Langridge, Fergal J Martin, Matthieu Muffato, Emily Perry, Magali Ruffier, Daniel M Staines, Stephen J Trevanion, Bronwen L Aken, Andrew D Yates, Daniel R Zerbino, Paul Flicek

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as a through a RESTful webservice, Perl application programming interface and as data files for download.

Nucleic Acids Res. 2019:47(D1) | 24 Citations (from Europe PMC, 2019-12-14)
29155950
Ensembl 2018. [PMID: 29155950]
Daniel R Zerbino, Premanand Achuthan, Wasiu Akanni, M Ridwan Amode, Daniel Barrell, Jyothish Bhai, Konstantinos Billis, Carla Cummins, Astrid Gall, Carlos García Girón, Laurent Gil, Leo Gordon, Leanne Haggerty, Erin Haskell, Thibaut Hourlier, Osagie G Izuogu, Sophie H Janacek, Thomas Juettemann, Jimmy Kiang To, Matthew R Laird, Ilias Lavidas, Zhicheng Liu, Jane E Loveland, Thomas Maurel, William McLaren, Benjamin Moore, Jonathan Mudge, Daniel N Murphy, Victoria Newman, Michael Nuhn, Denye Ogeh, Chuang Kee Ong, Anne Parker, Mateus Patricio, Harpreet Singh Riat, Helen Schuilenburg, Dan Sheppard, Helen Sparrow, Kieron Taylor, Anja Thormann, Alessandro Vullo, Brandon Walts, Amonida Zadissa, Adam Frankish, Sarah E Hunt, Myrto Kostadima, Nicholas Langridge, Fergal J Martin, Matthieu Muffato, Emily Perry, Magali Ruffier, Dan M Staines, Stephen J Trevanion, Bronwen L Aken, Fiona Cunningham, Andrew Yates, Paul Flicek

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.

Nucleic Acids Res. 2018:46(D1) | 415 Citations (from Europe PMC, 2019-12-14)
27899575
Ensembl 2017. [PMID: 27899575]
Bronwen L Aken, Premanand Achuthan, Wasiu Akanni, M Ridwan Amode, Friederike Bernsdorff, Jyothish Bhai, Konstantinos Billis, Denise Carvalho-Silva, Carla Cummins, Peter Clapham, Laurent Gil, Carlos García Girón, Leo Gordon, Thibaut Hourlier, Sarah E Hunt, Sophie H Janacek, Thomas Juettemann, Stephen Keenan, Matthew R Laird, Ilias Lavidas, Thomas Maurel, William McLaren, Benjamin Moore, Daniel N Murphy, Rishi Nag, Victoria Newman, Michael Nuhn, Chuang Kee Ong, Anne Parker, Mateus Patricio, Harpreet Singh Riat, Daniel Sheppard, Helen Sparrow, Kieron Taylor, Anja Thormann, Alessandro Vullo, Brandon Walts, Steven P Wilder, Amonida Zadissa, Myrto Kostadima, Fergal J Martin, Matthieu Muffato, Emily Perry, Magali Ruffier, Daniel M Staines, Stephen J Trevanion, Fiona Cunningham, Andrew Yates, Daniel R Zerbino, Paul Flicek

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2017:45(D1) | 203 Citations (from Europe PMC, 2019-12-14)
26687719
Ensembl 2016. [PMID: 26687719]
Andrew Yates, Wasiu Akanni, M Ridwan Amode, Daniel Barrell, Konstantinos Billis, Denise Carvalho-Silva, Carla Cummins, Peter Clapham, Stephen Fitzgerald, Laurent Gil, Carlos García Girón, Leo Gordon, Thibaut Hourlier, Sarah E Hunt, Sophie H Janacek, Nathan Johnson, Thomas Juettemann, Stephen Keenan, Ilias Lavidas, Fergal J Martin, Thomas Maurel, William McLaren, Daniel N Murphy, Rishi Nag, Michael Nuhn, Anne Parker, Mateus Patricio, Miguel Pignatelli, Matthew Rahtz, Harpreet Singh Riat, Daniel Sheppard, Kieron Taylor, Anja Thormann, Alessandro Vullo, Steven P Wilder, Amonida Zadissa, Ewan Birney, Jennifer Harrow, Matthieu Muffato, Emily Perry, Magali Ruffier, Giulietta Spudich, Stephen J Trevanion, Fiona Cunningham, Bronwen L Aken, Daniel R Zerbino, Paul Flicek

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2016:44(D1) | 677 Citations (from Europe PMC, 2019-12-14)
26888907
Ensembl regulation resources. [PMID: 26888907]
Zerbino DR, Johnson N, Juetteman T, Sheppard D, Wilder SP, Lavidas I, Nuhn M, Perry E, Raffaillac-Desfosses Q, Sobral D, Keefe D, Gräf S, Ahmed I, Kinsella R, Pritchard B, Brent S, Amode R, Parker A, Trevanion S, Birney E, Dunham I, Flicek P.

New experimental techniques in epigenomics allow researchers to assay a diversity of highly dynamic features such as histone marks, DNA modifications or chromatin structure. The study of their fluctuations should provide insights into gene expression regulation, cell differentiation and disease. The Ensembl project collects and maintains the Ensembl regulation data resources on epigenetic marks, transcription factor binding and DNA methylation for human and mouse, as well as microarray probe mappings and annotations for a variety of chordate genomes. From this data, we produce a functional annotation of the regulatory elements along the human and mouse genomes with plans to expand to other species as data becomes available. Starting from well-studied cell lines, we will progressively expand our library of measurements to a greater variety of samples. Ensembl's regulation resources provide a central and easy-to-query repository for reference epigenomes. As with all Ensembl data, it is freely available at http://www.ensembl.org, from the Perl and REST APIs and from the public Ensembl MySQL database server at ensembldb.ensembl.org. Database URL: http://www.ensembl.org.

Database (Oxford). 2016:2016() | 13 Citations (from Europe PMC, 2019-12-14)
25352552
Ensembl 2015. [PMID: 25352552]
Fiona Cunningham, M Ridwan Amode, Daniel Barrell, Kathryn Beal, Konstantinos Billis, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Stephen Fitzgerald, Laurent Gil, Carlos García Girón, Leo Gordon, Thibaut Hourlier, Sarah E Hunt, Sophie H Janacek, Nathan Johnson, Thomas Juettemann, Andreas K Kähäri, Stephen Keenan, Fergal J Martin, Thomas Maurel, William McLaren, Daniel N Murphy, Rishi Nag, Bert Overduin, Anne Parker, Mateus Patricio, Emily Perry, Miguel Pignatelli, Harpreet Singh Riat, Daniel Sheppard, Kieron Taylor, Anja Thormann, Alessandro Vullo, Steven P Wilder, Amonida Zadissa, Bronwen L Aken, Ewan Birney, Jennifer Harrow, Rhoda Kinsella, Matthieu Muffato, Magali Ruffier, Stephen M J Searle, Giulietta Spudich, Stephen J Trevanion, Andy Yates, Daniel R Zerbino, Paul Flicek

Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2015:43(Database issue) | 723 Citations (from Europe PMC, 2019-12-14)
24316576
Ensembl 2014. [PMID: 24316576]
Paul Flicek, M Ridwan Amode, Daniel Barrell, Kathryn Beal, Konstantinos Billis, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Stephen Fitzgerald, Laurent Gil, Carlos García Girón, Leo Gordon, Thibaut Hourlier, Sarah Hunt, Nathan Johnson, Thomas Juettemann, Andreas K Kähäri, Stephen Keenan, Eugene Kulesha, Fergal J Martin, Thomas Maurel, William M McLaren, Daniel N Murphy, Rishi Nag, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Emily Pritchard, Harpreet S Riat, Magali Ruffier, Daniel Sheppard, Kieron Taylor, Anja Thormann, Stephen J Trevanion, Alessandro Vullo, Steven P Wilder, Mark Wilson, Amonida Zadissa, Bronwen L Aken, Ewan Birney, Fiona Cunningham, Jennifer Harrow, Javier Herrero, Tim J P Hubbard, Rhoda Kinsella, Matthieu Muffato, Anne Parker, Giulietta Spudich, Andy Yates, Daniel R Zerbino, Stephen M J Searle

Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for comparative genomics and ontology information. Finally, we provide updated information about our methods for data access and resources for user training.

Nucleic Acids Res. 2014:42(Database issue) | 800 Citations (from Europe PMC, 2019-12-14)
23203987
Ensembl 2013. [PMID: 23203987]
Paul Flicek, Ikhlak Ahmed, M Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Laurent Gil, Carlos García-Girón, Leo Gordon, Thibaut Hourlier, Sarah Hunt, Thomas Juettemann, Andreas K Kähäri, Stephen Keenan, Monika Komorowska, Eugene Kulesha, Ian Longden, Thomas Maurel, William M McLaren, Matthieu Muffato, Rishi Nag, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Emily Pritchard, Harpreet Singh Riat, Graham R S Ritchie, Magali Ruffier, Michael Schuster, Daniel Sheppard, Daniel Sobral, Kieron Taylor, Anja Thormann, Stephen Trevanion, Simon White, Steven P Wilder, Bronwen L Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Jennifer Harrow, Javier Herrero, Tim J P Hubbard, Nathan Johnson, Rhoda Kinsella, Anne Parker, Giulietta Spudich, Andy Yates, Amonida Zadissa, Stephen M J Searle

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidenced-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.

Nucleic Acids Res. 2013:41(Database issue) | 648 Citations (from Europe PMC, 2019-12-14)
22086963
Ensembl 2012. [PMID: 22086963]
Paul Flicek, M Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Laurent Gil, Leo Gordon, Maurice Hendrix, Thibaut Hourlier, Nathan Johnson, Andreas K Kähäri, Damian Keefe, Stephen Keenan, Rhoda Kinsella, Monika Komorowska, Gautier Koscielny, Eugene Kulesha, Pontus Larsson, Ian Longden, William McLaren, Matthieu Muffato, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Harpreet Singh Riat, Graham R S Ritchie, Magali Ruffier, Michael Schuster, Daniel Sobral, Y Amy Tang, Kieron Taylor, Stephen Trevanion, Jana Vandrovcova, Simon White, Mark Wilson, Steven P Wilder, Bronwen L Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Richard Durbin, Xosé M Fernández-Suarez, Jennifer Harrow, Javier Herrero, Tim J P Hubbard, Anne Parker, Glenn Proctor, Giulietta Spudich, Jan Vogel, Andy Yates, Amonida Zadissa, Stephen M J Searle

The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.

Nucleic Acids Res. 2012:40(Database issue) | 631 Citations (from Europe PMC, 2019-12-14)
21045057
Ensembl 2011. [PMID: 21045057]
Paul Flicek, M Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Yuan Chen, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Leo Gordon, Maurice Hendrix, Thibaut Hourlier, Nathan Johnson, Andreas Kähäri, Damian Keefe, Stephen Keenan, Rhoda Kinsella, Felix Kokocinski, Eugene Kulesha, Pontus Larsson, Ian Longden, William McLaren, Bert Overduin, Bethan Pritchard, Harpreet Singh Riat, Daniel Rios, Graham R S Ritchie, Magali Ruffier, Michael Schuster, Daniel Sobral, Giulietta Spudich, Y Amy Tang, Stephen Trevanion, Jana Vandrovcova, Albert J Vilella, Simon White, Steven P Wilder, Amonida Zadissa, Jorge Zamora, Bronwen L Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Richard Durbin, Xosé M Fernández-Suarez, Javier Herrero, Tim J P Hubbard, Anne Parker, Glenn Proctor, Jan Vogel, Stephen M J Searle

The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.

Nucleic Acids Res. 2011:39(Database issue) | 468 Citations (from Europe PMC, 2019-12-14)
19906699
Ensembl's 10th year. [PMID: 19906699]
Paul Flicek, Bronwen L Aken, Benoit Ballester, Kathryn Beal, Eugene Bragin, Simon Brent, Yuan Chen, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Julio Fernandez-Banet, Leo Gordon, Stefan Gräf, Syed Haider, Martin Hammond, Kerstin Howe, Andrew Jenkinson, Nathan Johnson, Andreas Kähäri, Damian Keefe, Stephen Keenan, Rhoda Kinsella, Felix Kokocinski, Gautier Koscielny, Eugene Kulesha, Daniel Lawson, Ian Longden, Tim Massingham, William McLaren, Karine Megy, Bert Overduin, Bethan Pritchard, Daniel Rios, Magali Ruffier, Michael Schuster, Guy Slater, Damian Smedley, Giulietta Spudich, Y Amy Tang, Stephen Trevanion, Albert Vilella, Jan Vogel, Simon White, Steven P Wilder, Amonida Zadissa, Ewan Birney, Fiona Cunningham, Ian Dunham, Richard Durbin, Xosé M Fernández-Suarez, Javier Herrero, Tim J P Hubbard, Anne Parker, Glenn Proctor, James Smith, Stephen M J Searle

Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.

Nucleic Acids Res. 2010:38(Database issue) | 193 Citations (from Europe PMC, 2019-12-14)
20459810
A database and API for variation, dense genotyping and resequencing data. [PMID: 20459810]
Daniel Rios, William M McLaren, Yuan Chen, Ewan Birney, Arne Stabenau, Paul Flicek, Fiona Cunningham,

BACKGROUND: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data necessitating the redesign of existing genome-wide bioinformatics resources.
RESULTS: Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments.
CONCLUSIONS: Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.

BMC Bioinformatics. 2010:11() | 24 Citations (from Europe PMC, 2019-12-14)
20459805
Ensembl variation resources. [PMID: 20459805]
Yuan Chen, Fiona Cunningham, Daniel Rios, William M McLaren, James Smith, Bethan Pritchard, Giulietta M Spudich, Simon Brent, Eugene Kulesha, Pablo Marin-Garcia, Damian Smedley, Ewan Birney, Paul Flicek,

BACKGROUND: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics.
DESCRIPTION: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl.
CONCLUSIONS: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.

BMC Genomics. 2010:11() | 58 Citations (from Europe PMC, 2019-12-14)
20459808
Touring Ensembl: a practical guide to genome browsing. [PMID: 20459808]
Spudich GM, Fernández-Suárez XM.

The number of databases in molecular biological fields has rapidly increased to provide a large-scale resource. Though valuable information is available, data can be difficult to access, compare and integrate due to different formats and presentations of web interfaces. This paper offers a practical guide to the integration of gene, comparative genomic, and functional genomics data using the Ensembl website at http://www.ensembl.org.The Ensembl genome browser and underlying databases focus on chordate organisms. More species such as plants and microorganisms can be investigated using our sister browser at http://www.ensemblgenomes.org.In this study, four examples are used that sample many pages and features of the Ensembl browser. We focus on comparative studies across over 50 mostly chordate organisms, variations linked to disease, functional genomics, and access of external information housed in databases outside the Ensembl project. Researchers will learn how to go beyond simply exporting one gene sequence, and explore how a genome browser can integrate data from various sources and databases to build a full and comprehensive biological picture.

BMC Genomics. 2010:11() | 16 Citations (from Europe PMC, 2019-12-14)
19033362
Ensembl 2009. [PMID: 19033362]
T J P Hubbard, B L Aken, S Ayling, B Ballester, K Beal, E Bragin, S Brent, Y Chen, P Clapham, L Clarke, G Coates, S Fairley, S Fitzgerald, J Fernandez-Banet, L Gordon, S Graf, S Haider, M Hammond, R Holland, K Howe, A Jenkinson, N Johnson, A Kahari, D Keefe, S Keenan, R Kinsella, F Kokocinski, E Kulesha, D Lawson, I Longden, K Megy, P Meidl, B Overduin, A Parker, B Pritchard, D Rios, M Schuster, G Slater, D Smedley, W Spooner, G Spudich, S Trevanion, A Vilella, J Vogel, S White, S Wilder, A Zadissa, E Birney, F Cunningham, V Curwen, R Durbin, X M Fernandez-Suarez, J Herrero, A Kasprzyk, G Proctor, J Smith, S Searle, P Flicek

The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases, and other information for chordate, selected model organism and disease vector genomes. As of release 51 (November 2008), Ensembl fully supports 45 species, and three additional species have preliminary support. New species in the past year include orangutan and six additional low coverage mammalian genomes. Major additions and improvements to Ensembl since our previous report include a major redesign of our website; generation of multiple genome alignments and ancestral sequences using the new Enredo-Pecan-Ortheus pipeline and development of our software infrastructure, particularly to support the Ensembl Genomes project (http://www.ensemblgenomes.org/).

Nucleic Acids Res. 2009:37(Database issue) | 561 Citations (from Europe PMC, 2019-12-14)
18000006
Ensembl 2008. [PMID: 18000006]
P Flicek, B L Aken, K Beal, B Ballester, M Caccamo, Y Chen, L Clarke, G Coates, F Cunningham, T Cutts, T Down, S C Dyer, T Eyre, S Fitzgerald, J Fernandez-Banet, S Gräf, S Haider, M Hammond, R Holland, K L Howe, K Howe, N Johnson, A Jenkinson, A Kähäri, D Keefe, F Kokocinski, E Kulesha, D Lawson, I Longden, K Megy, P Meidl, B Overduin, A Parker, B Pritchard, A Prlic, S Rice, D Rios, M Schuster, I Sealy, G Slater, D Smedley, G Spudich, S Trevanion, A J Vilella, J Vogel, S White, M Wood, E Birney, T Cox, V Curwen, R Durbin, X M Fernandez-Suarez, J Herrero, T J P Hubbard, A Kasprzyk, G Proctor, J Smith, A Ureta-Vidal, S Searle

The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes. As of release 47 (October 2007), Ensembl fully supports 35 species, with preliminary support for six additional species. New species in the past year include platypus and horse. Major additions and improvements to Ensembl since our previous report include extensive support for functional genomics data in the form of a specialized functional genomics database, genome-wide maps of protein-DNA interactions and the Ensembl regulatory build; support for customization of the Ensembl web interface through the addition of user accounts and user groups; and increased support for genome resequencing. We have also introduced new comparative genomics-based data mining options and report on the continued development of our software infrastructure.

Nucleic Acids Res. 2008:36(Database issue) | 336 Citations (from Europe PMC, 2019-12-14)
17148474
Ensembl 2007. [PMID: 17148474]
T J P Hubbard, B L Aken, K Beal, B Ballester, M Caccamo, Y Chen, L Clarke, G Coates, F Cunningham, T Cutts, T Down, S C Dyer, S Fitzgerald, J Fernandez-Banet, S Graf, S Haider, M Hammond, J Herrero, R Holland, K Howe, K Howe, N Johnson, A Kahari, D Keefe, F Kokocinski, E Kulesha, D Lawson, I Longden, C Melsopp, K Megy, P Meidl, B Ouverdin, A Parker, A Prlic, S Rice, D Rios, M Schuster, I Sealy, J Severin, G Slater, D Smedley, G Spudich, S Trevanion, A Vilella, J Vogel, S White, M Wood, T Cox, V Curwen, R Durbin, X M Fernandez-Suarez, P Flicek, A Kasprzyk, G Proctor, S Searle, J Smith, A Ureta-Vidal, E Birney

The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and european hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.

Nucleic Acids Res. 2007:35(Database issue) | 521 Citations (from Europe PMC, 2019-12-14)
16381931
Ensembl 2006. [PMID: 16381931]
E Birney, D Andrews, M Caccamo, Y Chen, L Clarke, G Coates, T Cox, F Cunningham, V Curwen, T Cutts, T Down, R Durbin, X M Fernandez-Suarez, P Flicek, S Gräf, M Hammond, J Herrero, K Howe, V Iyer, K Jekosch, A Kähäri, A Kasprzyk, D Keefe, F Kokocinski, E Kulesha, D London, I Longden, C Melsopp, P Meidl, B Overduin, A Parker, G Proctor, A Prlic, M Rae, D Rios, S Redmond, M Schuster, I Sealy, S Searle, J Severin, G Slater, D Smedley, J Smith, A Stabenau, J Stalker, S Trevanion, A Ureta-Vidal, J Vogel, S White, C Woodwark, T J P Hubbard

The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased from 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.

Nucleic Acids Res. 2006:34(Database issue) | 257 Citations (from Europe PMC, 2019-12-14)
15608235
Ensembl 2005. [PMID: 15608235]
T Hubbard, D Andrews, M Caccamo, G Cameron, Y Chen, M Clamp, L Clarke, G Coates, T Cox, F Cunningham, V Curwen, T Cutts, T Down, R Durbin, X M Fernandez-Suarez, J Gilbert, M Hammond, J Herrero, H Hotz, K Howe, V Iyer, K Jekosch, A Kahari, A Kasprzyk, D Keefe, S Keenan, F Kokocinsci, D London, I Longden, G McVicker, C Melsopp, P Meidl, S Potter, G Proctor, M Rae, D Rios, M Schuster, S Searle, J Severin, G Slater, D Smedley, J Smith, W Spooner, A Stabenau, J Stalker, R Storey, S Trevanion, A Ureta-Vidal, J Vogel, S White, C Woodwark, E Birney

The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 7 to 16, with the addition of the six vertebrate genomes of chimpanzee, dog, cow, chicken, tetraodon and frog and the insect genome of honeybee. The majority have been annotated automatically using the Ensembl gene build system, showing its flexibility to reliably annotate a wide variety of genomes. With the increased number of vertebrate genomes, the comparative analysis provided to users has been greatly improved, with new website interfaces allowing annotation of different genomes to be directly compared. The Ensembl software system is being increasingly widely reused in different projects showing the benefits of a completely open approach to software development and distribution.

Nucleic Acids Res. 2005:33(Database issue) | 271 Citations (from Europe PMC, 2019-12-14)
14681459
Ensembl 2004. [PMID: 14681459]
E Birney, D Andrews, P Bevan, M Caccamo, G Cameron, Y Chen, L Clarke, G Coates, T Cox, J Cuff, V Curwen, T Cutts, T Down, R Durbin, E Eyras, X M Fernandez-Suarez, P Gane, B Gibbins, J Gilbert, M Hammond, H Hotz, V Iyer, A Kahari, K Jekosch, A Kasprzyk, D Keefe, S Keenan, H Lehvaslaiho, G McVicker, C Melsopp, P Meidl, E Mongin, R Pettett, S Potter, G Proctor, M Rae, S Searle, G Slater, D Smedley, J Smith, W Spooner, A Stabenau, J Stalker, R Storey, A Ureta-Vidal, C Woodwark, M Clamp, T Hubbard

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage and visualization and installations exist around the world both in companies and at academic sites. With a total of nine genome sequences available from Ensembl and more genomes to follow, recent developments have focused mainly on closer integration between genomes and external data.

Nucleic Acids Res. 2004:32(Database issue) | 121 Citations (from Europe PMC, 2019-12-14)
15123591
The Ensembl Web site: mechanics of a genome browser. [PMID: 15123591]
James Stalker, Brian Gibbins, Patrick Meidl, James Smith, William Spooner, Hans-Rudolf Hotz, Antony V Cox,

The Ensembl Web site (http://www.ensembl.org/) is the principal user interface to the data of the Ensembl project, and currently serves >500,000 pages (approximately 2.5 million hits) per week, providing access to >80 GB (gigabyte) of data to users in more than 80 countries. Built atop an open-source platform comprising Apache/mod_perl and the MySQL relational database management system, it is modular, extensible, and freely available. It is being actively reused and extended in several different projects, and has been downloaded and installed in companies and academic institutions worldwide. Here, we describe some of the technical features of the site, with particular reference to its dynamic configuration that enables it to handle disparate data from multiple species.

Genome Res. 2004:14(5) | 59 Citations (from Europe PMC, 2019-12-14)
15338620
Ensembl: a genome infrastructure. [PMID: 15338620]
E Birney, Ensembl Team.
Cold Spring Harb Symp Quant Biol. 2003:68() | 5 Citations (from Europe PMC, 2019-12-14)
12519943
Ensembl 2002: accommodating comparative genomics. [PMID: 12519943]
M Clamp, D Andrews, D Barker, P Bevan, G Cameron, Y Chen, L Clark, T Cox, J Cuff, V Curwen, T Down, R Durbin, E Eyras, J Gilbert, M Hammond, T Hubbard, A Kasprzyk, D Keefe, H Lehvaslaiho, V Iyer, C Melsopp, E Mongin, R Pettett, S Potter, A Rust, E Schmidt, S Searle, G Slater, J Smith, W Spooner, A Stabenau, J Stalker, E Stupka, A Ureta-Vidal, I Vastrik, E Birney

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focusing on developing automatic comparative genome analysis and visualisation.

Nucleic Acids Res. 2003:31(1) | 145 Citations (from Europe PMC, 2019-12-14)
11752248
The Ensembl genome database project. [PMID: 11752248]
T Hubbard, D Barker, E Birney, G Cameron, Y Chen, L Clark, T Cox, J Cuff, V Curwen, T Down, R Durbin, E Eyras, J Gilbert, M Hammond, L Huminiecki, A Kasprzyk, H Lehvaslaiho, P Lijnzaad, C Melsopp, E Mongin, R Pettett, M Pocock, S Potter, A Rust, E Schmidt, S Searle, G Slater, J Smith, W Spooner, A Stabenau, J Stalker, E Stupka, A Ureta-Vidal, I Vastrik, M Clamp

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

Nucleic Acids Res. 2002:30(1) | 733 Citations (from Europe PMC, 2019-12-14)