National Genomics Data Center

BIG Search

BIG Search is a scalable text search engine built on Elasticsearch (a highly scalable open-source full-text search and analytics engine based on Apache Lucene). It supports cross-domain search and gives users access to a wide range of biomedical data, not only from BIGD databases but also from partner databases throughout the world.
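As a rough illustration of what a cross-domain search on Elasticsearch can look like, the sketch below builds a standard `multi_match` query body and a multi-index search path. It is not BIG Search's actual implementation: the index names (`bioproject`, `biosample`), the field names (`title`, `description`), and both helper functions are hypothetical and chosen only for the example.

```python
import json


def build_cross_domain_query(term, fields=("title", "description"), size=10):
    """Return an Elasticsearch query-DSL body matching `term` across several fields.

    The field names are assumptions; a real deployment defines its own mapping.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            "multi_match": {
                "query": term,
                "fields": list(fields),
            }
        },
    }


def search_path(indices):
    """Build the REST path that searches several indices in a single request.

    Elasticsearch accepts a comma-separated list of index names before `_search`.
    """
    return "/" + ",".join(indices) + "/_search"


if __name__ == "__main__":
    # Hypothetical index names, one per source database.
    print(search_path(["bioproject", "biosample"]))
    print(json.dumps(build_cross_domain_query("tp53"), indent=2))
```

Sending the printed body to the printed path on an Elasticsearch node would return hits from both indices in one ranked list, which is the general idea behind searching many databases through a single query box.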

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator; GenBank
Total: 85307532 records from 139 databases.
lncRNASNP2 4443771
circAtlas 610406 circAtlas 2.0
EWAS Data Hub 597253 A data hub of DNA methylation array data and metadata
EWAS Atlas 262089 A knowledgebase of epigenome-wide association studies
GenTree 63151 GenTree, the time tree of genes along the evolutionary history
Methbank SRMs 60499 Methbank, Single-base Resolution Methylomes (SRMs)
GVM 60088 Genome Variation Map
SEGreg 53156 Database of specifically expressed genes and regulation
vcg 43801 Virtual Chinese Genome Database is a dynamic genome database of the Chinese population.
DEG 28458 Database of Essential Genes
BioSample 25080 Biological Sample Library
Gene Expression Nebulas 19446 Gene Expression Nebulas (GEN) is a data portal of gene expression profiles under various conditions derived entirely from RNA-Seq data analysis in multiple species.
dbPAF 18792 database of Phospho-sites in Animals and Fungi
ZCURVE_CoVdb 7054 Database of Essential Genes
2019 Novel Coronavirus Resources 4340 2019nCoVR integrated the public sequences from GISAID, NCBI, CNGB and CNCB/NGDC
GSA 2437 Genome Sequence Archive
2019nCoVR Literature 2406 Literature about COVID-19 and SARS-CoV-2.
Database Commons 721 Database Commons is a curated catalogue of biological databases, providing people with easy access to a comprehensive collection of publicly available biological databases encompassing different data types and spanning diverse organisms.
BioCode 675 Archive of bioinformatics codes for open-source projects
PTMD 594 A database of human disease-associated post-translational modifications
CellMarker 467 CellMarker: a manually curated resource of cell markers in human and mouse.
GWH 226 Genome Warehouse
BioProject 146 Biological Project Library
RhesusBase Genes 126
EDK 110 Editome Disease Knowledgebase
GSA for Human 55 Genome Sequence Archive for Human
iEKPD 29 Integrated annotations for Eukaryotic protein Kinases, protein Phosphatases & phosphoprotein-binding Domains
PLMD 26 Protein Lysine Modifications Database
hTFtarget 15 In this hTFtarget database, we collected comprehensive human TF ChIP-Seq data and customized an analysis workflow to identify reliable TF targets while taking epigenomic states into account.
CGDB 6 Circadian Gene Database
DoriC 2 Database of Replication Origins
iUUCD 2 integrated annotations for Ubiquitin and Ubiquitin-like Conjugation Database
GSA 1 Genome Sequence Archive
GWAS Atlas 1 GWAS Atlas is a curated resource of genome-wide variant-trait associations

DGVa 43 Database of Genomic Variants Archive
EGA 2619 The European Genome-phenome Archive
HGNC 14 HUGO Gene Nomenclature Committee
MGnify (Analyses) 67351 MGnify is the study of all genomes present in any given environment without the need for prior individual identification or amplification.
MGnify (Projects) 369 MGnify is the study of all genomes present in any given environment without the need for prior individual identification or amplification.
MGnify (Samples) 84984 MGnify is the study of all genomes present in any given environment without the need for prior individual identification or amplification.
WormBase ParaSite 6898 WormBase ParaSite
Study 44243 INSDC Project records from the European Nucleotide Archive
Non-coding (Release) 2722170 Non-coding (Release) in ENA
Non-coding (Update) 1386164 Non-coding (Update) in ENA
SRA Study (Read/Analysis) 22146 Next generation sequencing raw data repository from the European Nucleotide Archive (study part)
SRA Sample 2587043 Next generation sequencing raw data repository from the European Nucleotide Archive (sample part)
SRA Read (Run) 378082 Next generation sequencing raw data repository from the European Nucleotide Archive (run part)
SRA Read (Experiment) 457686 Next generation sequencing raw data repository from the European Nucleotide Archive (experiment part)
SRA Analysis 6055 Next generation sequencing raw data repository from the European Nucleotide Archive (analysis part)
SRA Submission (Read/Analysis) 1120 Next generation sequencing raw data repository from the European Nucleotide Archive (submission part)
Assembly contig set 28305 European Nucleotide Archive (Whole Genome Shotgun Set)
Transcriptome Assembly contig set 2 European Nucleotide Archive (Transcriptome Assembly contig set)
Coding (Release) 28634903 Coding (Release) in ENA
Coding (Update) 13272683 Coding (Update) in ENA
Assembly 38070 Genome Assembly
IMGT/HLA 24218 The IMGT/HLA Database provides a specialist database for sequences of the human major histocompatibility complex (HLA) and includes the official sequences for the WHO Nomenclature Committee For Factors of the HLA System.
IPD-KIR 986 The IPD-KIR Database provides a centralised repository for human KIR sequences. Killer-cell Immunoglobulin-like Receptors (KIR) have been shown to be highly polymorphic at the allelic and haplotypic level. KIRs are members of the immunoglobulin superfamily (IgSF) formerly called Killer-cell Inhibitory Receptors.
Rfam 18669 The Rfam database is a collection of RNA families
RNAcentral 274246 The RNAcentral sequences are provided by a group of expert databases and supplemented by sequences from the INSDC.
UniProtKB 6250086 UniProt Knowledge Base of protein sequences.
UniRef100 1113892 UniProt Non-redundant Reference Databases - mutual sequence identity of 100%.
UniRef90 505807 UniProt Non-redundant Reference Databases - mutual sequence identity of >90%.
UniRef50 259641 UniProt Non-redundant Reference Databases - mutual sequence identity of >50%.
EPO 1255025 European Patent Office
JPO 943652 Japan Patent Office
KIPO 169033 Korean Intellectual Property Office
USPTO 272258 United States Patent and Trademark Office
EMDB 2670 The Electron Microscopy Data Bank (EMDB) is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, and electron (2D) crystallography.
PDBe 41341 Macromolecular structures database
ChEBI 5929 Chemical Entities of Biological Interest
ChEMBL Assay 414981 Assay details as reported in a scientific document in ChEMBL database
ChEMBL Document 3330 ChEMBL Document in ChEMBL database
ChEMBL Molecule 85 Curated compound set used in ChEMBL database.
ChEMBL Target 182 Curated target set used in ChEMBL database. Includes both protein targets and non-protein targets (e.g., organisms, tissues, cell lines)
ChEMBL Target Component 11 ChEMBL Target Component
ArrayExpress 38440 ArrayExpress Archive is a MIAME compliant public database for microarray data.
Expression Atlas Experiments 1221 Expression Atlas Experiments
Baseline Expression Atlas Genes 775 Large scale meta-analysis of public transcriptomics data
Differential Expression Atlas Genes 29758 Large scale meta-analysis of public transcriptomics data
dbGaP 742 The database of Genotypes and Phenotypes
GEO 24542 Gene Expression Omnibus. GEO is a public functional genomics data repository supporting MIAME-compliant data submissions
Human diseases 19 Human diseases
OMIM 17567 OMIM Online Mendelian Inheritance in Man
294
Complex Portal 773 Library of ligands, small molecules and monomers
IntAct Experiments 2477 Experimental procedures used to characterise molecular interactions
IntAct Interactions 1245 Descriptions of molecular interactions
IntAct Interactors 546 Proteins taking part in molecular interactions
BioModels 676 Database of Mathematical models of biological interest
21
30
MetaboLights 507 Database for Metabolomics experiments and derived information
MetabolomeExpress 1 MetabolomeExpress: a public place to process, interpret and share GC/MS metabolomics datasets.
Metabolomics Workbench 719 The Metabolomics Workbench will serve as a national and international repository for metabolomics data and metadata and will provide analysis tools and access to metabolite standards, protocols, tutorials, training, and more.
12
Reactome 8523 Database of core biochemical pathways and reactions
Rhea 8 Manually annotated database of chemical reactions created in collaboration with the Swiss Institute of Bioinformatics (SIB)
GPCRDB 399 Database of G Protein-Coupled Receptors
Interpro Active site 7 Database of protein families, domains and functional sites
Interpro Binding site 1 Database of protein families, domains and functional sites
Interpro Conserved site 199 Database of protein families, domains and functional sites
Interpro domain 3219 Database of protein families, domains and functional sites
Interpro family 2210 Database of protein families, domains and functional sites
Interpro Homologous super family 207 Database of protein families, domains and functional sites
Interpro PTM 1 Database of protein families, domains and functional sites
Interpro repeat 40 Database of protein families, domains and functional sites
Interpro unknown 7 Database of protein families, domains and functional sites
Pfam (Clans) 5 The clans contained within the database Pfam
Pfam 1151 The protein families contained within the database
TreeFam 6 TreeFam is a database of gene trees of animal protein families.
MEROPS Peptidases 1279 MEROPS Id Peptidase Database
MEROPS Peptidase Clans 1 MEROPS Clan Peptidase Database
MEROPS Peptidase Families 55 MEROPS Peptidase Families Database
GNPS 210 The Global Natural Product Social Molecular Networking (GNPS) site creates a community for natural product researchers working with mass spectrometry data.
GPMdb 198 The Global Proteome Machine
jPOST 67 The ProteomeXchange Consortium has been set up to provide a globally coordinated submission of mass spectrometry proteomics data to the main existing proteomics repositories, and to encourage optimal data dissemination.
LINCS 107 Library of Network-Based Cellular Signatures (LINCS)
MassIVE 939 The Mass spectrometry Interactive Virtual Environment (MassIVE) is a community resource developed by the NIH-funded Center for Computational Mass Spectrometry to promote the global, free exchange of mass spectrometry data.
41
Paxdb 13 PaxDB is a comprehensive absolute protein abundance database, which contains whole genome protein abundance information across organisms and tissues.
PeptideAtlas 1448 PeptideAtlas is a multi-organism, publicly accessible compendium of peptides identified in a large set of tandem mass spectrometry proteomics experiments.
PeptideAtlas 5200 PeptideAtlas is a multi-organism, publicly accessible compendium of peptides identified in a large set of tandem mass spectrometry proteomics experiments.
Enzyme Portal 1717 The Enzyme Portal integrates publicly available information about enzymes, such as small-molecule chemistry, biochemical pathways and drug compounds.
IntEnz 522 Integrated relational Enzyme database.
Europe PMC 3036262 Europe PMC is an archive of life sciences journal literature.
BioSamples 502569 The BioSamples database aggregates sample information for reference samples e.g. Coriell Cell lines and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive, or PRIDE. It provides links to assays for specific samples, and accepts direct submissions of samples.
1877
EFO 13 Experimental Factor Ontology (EFO)
GO 237 Gene Ontology
MESH 472 Medical Subject Headings (MeSH)
Ontology Lookup Service (OLS) 20514 The Ontology Lookup Service (OLS) is a repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions. The user can browse the ontologies through the website as well as programmatically via the OLS API.
SBO 1 Systems Biology Ontology
Taxonomy 5252 NCBI Taxonomy database of Organism names
bio.tools 797 Bioinformatics Tools and Services Discovery Portal
Identifiers.org registry 203 Identifiers.org is a system providing resolvable persistent URIs used to identify data for the scientific community, with a current focus on the Life Sciences domain.
ORCID data claims 177 ORCID is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors and contributors.
Resources 9 EBI resources
People in EBI 47 EBI people
Site 1105 EBI web corporate

Powered by EBISearch