Database Commons

a catalog of biological databases


Database information


General information

Description: EBI Metagenomics is a new resource for the analysis and archiving of metagenomic data. It allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.
Year founded: 2013
Last update: 2018
Version: 4.1
Country/Region: United Kingdom
Data type:
Data object:
Database category:
Major organism:

Contact information

University/Institution: European Bioinformatics Institute
Address: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
City: Cambridge
Country/Region: United Kingdom
Contact name (PI/Team): Sarah Hunter
Contact email (PI/Helpdesk):

Record metadata

Created on: 2015-06-20
Curated by:
Lina Ma [2019-04-17]
Huma Shireen [2018-08-28]
Lina Ma [2018-06-12]
Jian SA [2016-04-04]
Mengwei Li [2016-02-21]
Lin Liu [2016-01-29]
Lin Liu [2016-01-05]
Jian SA [2015-12-07]
Jian SA [2015-06-28]
Jian SA [2015-06-27]


All databases: 291/4549 (93.625%)
Raw bio-data: 20/451 (95.787%)
Gene genome and annotation: 117/1211 (90.421%)
Total Rank
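
The percentages shown with the ranking figures above are consistent with a percentile of the form (1 - (rank - 1) / total) x 100, rounded to three decimals. This formula is inferred from the three listed values and is not documented in this record, so the sketch below is an assumption; the function name is hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentile implied by a rank (1 = best) among `total` entries.

    Assumed formula, inferred from the listed figures:
    (1 - (rank - 1) / total) * 100, rounded to three decimals.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round((1 - (rank - 1) / total) * 100, 3)


# The three rank/total pairs listed in this record.
for label, rank, total in [
    ("All databases", 291, 4549),
    ("Raw bio-data", 20, 451),
    ("Gene genome and annotation", 117, 1211),
]:
    print(f"{label}: {rank}/{total} ({rank_percentile(rank, total)}%)")
```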

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

A new genomic blueprint of the human gut microbiota. [PMID: 30745586]
Almeida A, Mitchell AL, Boland M, Forster SC, Gloor GB, Tarkowska A, Lawley TD, Finn RD.

The composition of the human gut microbiota is linked to health and disease, but knowledge of individual microbial species is needed to decipher their biological roles. Despite extensive culturing and sequencing efforts, the complete bacterial repertoire of the human gut microbiota remains undefined. Here we identify 1,952 uncultured candidate bacterial species by reconstructing 92,143 metagenome-assembled genomes from 11,850 human gut microbiomes. These uncultured genomes substantially expand the known species repertoire of the collective human gut microbiota, with a 281% increase in phylogenetic diversity. Although the newly identified species are less prevalent in well-studied populations compared to reference isolate genomes, they improve classification of understudied African and South American samples by more than 200%. These candidate species encode hundreds of newly identified biosynthetic gene clusters and possess a distinctive functional capacity that might explain their elusive nature. Our work expands the known diversity of uncultured gut bacteria, which provides unprecedented resolution for taxonomic and functional characterization of the intestinal microbiota.

Nature. 2019:568(7753) | 20 Citations (from Europe PMC, 2020-02-08)

EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies. [PMID: 29069476]
Mitchell AL, Scheremetjew M, Denise H, Potter S, Tarkowska A, Qureshi M, Salazar GA, Pesseat S, Boland MA, Hunter FMI, Ten Hoopen P, Alako B, Amid C, Wilkinson DJ, Curtis TP, Cochrane G, Finn RD.

EBI metagenomics provides a free to use platform for the analysis and archiving of sequence data derived from the microbial populations found in a particular environment. Over the past two years, EBI metagenomics has increased the number of datasets analysed 10-fold. In addition to increased throughput, the underlying analysis pipeline has been overhauled to include both new or updated tools and reference databases. Of particular note is a new workflow for taxonomic assignments that has been extended to include assignments based on both the large and small subunit RNA marker genes and to encompass all cellular micro-organisms. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain. From these assemblies, we have produced a searchable, non-redundant protein database of over 50 million sequences. To provide improved access to the data stored within the resource, we have developed a programmatic interface that provides access to the analysis results and associated sample metadata. Finally, we have integrated the results of a series of statistical analyses that provide estimations of diversity and sample comparisons.

Nucleic Acids Res. 2018:46(D1) | 32 Citations (from Europe PMC, 2020-02-08)

Benchmarking taxonomic assignments based on 16S rRNA gene profiling of the microbiota from commonly sampled environments. [PMID: 29762668]
Almeida A, Mitchell AL, Tarkowska A, Finn RD.

Background: Taxonomic profiling of ribosomal RNA (rRNA) sequences has been the accepted norm for inferring the composition of complex microbial ecosystems. Quantitative Insights Into Microbial Ecology (QIIME) and mothur have been the most widely used taxonomic analysis tools for this purpose, with MAPseq and QIIME 2 being two recently released alternatives. However, no independent and direct comparison between these four main tools has been performed. Here, we compared the default classifiers of MAPseq, mothur, QIIME, and QIIME 2 using synthetic simulated datasets comprised of some of the most abundant genera found in the human gut, ocean, and soil environments. We evaluate their accuracy when paired with both different reference databases and variable sub-regions of the 16S rRNA gene. Findings: We show that QIIME 2 provided the best recall and F-scores at genus and family levels, together with the lowest distance estimates between the observed and simulated samples. However, MAPseq showed the highest precision, with miscall rates consistently <2%. Notably, QIIME 2 was the most computationally expensive tool, with CPU time and memory usage almost 2 and 30 times higher than MAPseq, respectively. Using the SILVA database generally yielded a higher recall than using Greengenes, while assignment results of different 16S rRNA variable sub-regions varied up to 40% between samples analysed with the same pipeline. Conclusions: Our results support the use of either QIIME 2 or MAPseq for optimal 16S rRNA gene profiling, and we suggest that the choice between the two should be based on the level of recall, precision, and/or computational performance required.
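
As an illustration of the metrics named in the abstract above, precision, recall and F-score can be computed from counts of correct, incorrect and missed assignments. The sketch below uses the standard textbook definitions; the function name and the counts are hypothetical and are not taken from the paper, whose exact scoring procedure may differ.

```python
def classification_scores(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall and F-score from assignment counts.

    tp: sequences assigned to the correct taxon
    fp: sequences assigned to a wrong taxon (miscalls)
    fn: sequences that should have been assigned but were not
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        # F-score as the harmonic mean of precision and recall.
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score}


# Hypothetical counts for two classifiers on the same simulated sample:
# one favours precision (few miscalls), the other favours recall.
high_precision = classification_scores(tp=700, fp=10, fn=290)
high_recall = classification_scores(tp=900, fp=80, fn=20)
print(high_precision)
print(high_recall)
```

With counts like these, the first classifier makes few miscalls but leaves many sequences unassigned, while the second recovers more at the cost of more errors, which is the kind of trade-off the abstract describes between the tools.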

Gigascience. 2018:7(5) | 4 Citations (from Europe PMC, 2020-02-08)

EBI metagenomics in 2016--an expanding and evolving resource for the analysis and archiving of metagenomic data. [PMID: 26582919]
Mitchell A, Bucchini F, Cochrane G, Denise H, ten Hoopen P, Fraser M, Pesseat S, Potter S, Scheremetjew M, Sterk P, Finn RD.

EBI metagenomics is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans.

Nucleic Acids Res. 2016:44(D1) | 42 Citations (from Europe PMC, 2020-02-15)

EBI metagenomics--a new resource for the analysis and archiving of metagenomic data. [PMID: 24165880]
Hunter S, Corbett M, Denise H, Fraser M, Gonzalez-Beltran A, Hunter C, Jones P, Leinonen R, McAnulla C, Maguire E, Maslen J, Mitchell A, Nuka G, Oisel A, Pesseat S, Radhakrishnan R, Rocca-Serra P, Scheremetjew M, Sterk P, Vaughan D, Cochrane G, Field D, Sansone SA.

Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

Nucleic Acids Res. 2014:42(Database issue) | 52 Citations (from Europe PMC, 2020-02-15)