National Genomics Data Center

BIG Search

BIG Search is a scalable text search engine built on Elasticsearch (a highly scalable, open-source full-text search and analytics engine based on Apache Lucene). It supports cross-domain search, giving users access to a wide range of biomedical data from NGDC databases and from partner databases throughout the world.

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator
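
As a rough illustration of what a cross-domain search over an Elasticsearch backend can look like, the sketch below builds a request body that matches one term across several text fields and counts hits per index. It is not the actual BIG Search implementation: the index names, the field names (`title`, `description`) and the helper functions are all hypothetical. Only the `multi_match` query, the `terms` aggregation on `_index` and the comma-separated multi-index `_search` path are standard Elasticsearch features.

```python
import json

# Hypothetical index names, one per source database. The real BIG Search
# index layout is not documented on this page.
INDICES = ["gvm", "lncbook", "biosample"]


def build_cross_domain_query(term, fields=("title", "description"), size=10):
    """Build an Elasticsearch request body for a cross-index text search.

    The field names are assumptions made for this sketch.
    """
    return {
        "size": size,
        # Match the term against every listed field.
        "query": {"multi_match": {"query": term, "fields": list(fields)}},
        # Count matching records per index, i.e. per source database.
        "aggs": {"hits_per_database": {"terms": {"field": "_index", "size": 50}}},
    }


def search_path(indices):
    """Return the REST path for searching several indices in one request."""
    return "/" + ",".join(indices) + "/_search"


body = build_cross_domain_query("tp53")
print(search_path(INDICES))
print(json.dumps(body, indent=2))
```

Grouping hits by `_index` is one way to produce per-database record counts like those listed on this page; sending the body to a cluster is left out so the sketch runs without a server.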

38,794,993 records from 46 NGDC & Partner databases.

Database | Records | Description
GVM | 16,739,594 | Genome Variation Map
OpenLB | 13,167,896 | Publication, education and data
lncRNASNP2 | 4,443,771 |
RMVar | 1,615,252 | RNA Modification associated variants database
circAtlas | 610,406 | circAtlas 2.0
EWAS Data Hub | 597,253 | A data hub of DNA methylation array data and metadata
LncBook | 409,204 | A curated knowledgebase of human long non-coding RNAs
EWAS Atlas | 262,089 | A knowledgebase of epigenome-wide association studies
2019 Novel Coronavirus Resources | 194,678 | 2019nCoVR integrated the public SARS-CoV-2 sequences from GISAID, NCBI, CNGB and CNCB/NGDC
BBCancer | 137,210 | BBCancer: an expression atlas of blood-based biomarkers in the early diagnosis of cancers
LncExpDB | 101,293 | Expression Database of Human Long non-coding RNAs
GenTree | 63,151 | GenTree, the time tree of genes along the evolutionary history
MethBank SRMs | 60,479 | MethBank, Single-base Resolution Methylomes (SRMs)
MethBank CRMs | 60,415 | MethBank, Consensus Reference Methylomes (CRMs)
SEGreg | 53,156 | Database of specifically expressed genes and regulation
BioSample | 44,659 | Biological Sample Library
vcg | 43,801 | Virtual Chinese Genome Database is a dynamic genome database of the Chinese population
CancerSEA | 34,227 | CancerSEA: a cancer single-cell state atlas
EPSD | 30,679 | Eukaryotic Phosphorylation Site Database
DEG | 28,458 | Database of Essential Genes
lnCAR | 28,420 | A comprehensive resource for lncRNAs from Cancer Arrays
Gene Expression Nebulas | 19,446 | Gene Expression Nebulas (GEN) is a data portal of gene expression profiles under various conditions derived entirely from RNA-Seq data analysis in multiple species
dbPAF | 18,792 | Database of Phospho-sites in Animals and Fungi
GSA | 13,189 | Genome Sequence Archive
ZCURVE_CoVdb | 7,054 | Database of Essential Genes
GWH | 6,933 | Genome Warehouse
Database Commons | 721 | Database Commons is a curated catalogue of biological databases, providing people with easy access to a comprehensive collection of publicly available biological databases encompassing different data types and spanning diverse organisms
BioCode | 675 | Archive Bioinformatics Codes for Open Source Projects
PTMD | 594 | A database of human disease-associated post-translational modifications
CellMarker | 467 | CellMarker: a manually curated resource of cell markers in human and mouse
BioProject | 262 | Biological Project Library
GSA for Human | 259 | Genome Sequence Archive for Human
RhesusBase Genes | 208 |
EDK | 110 | Editome Disease Knowledgebase
eLMSG | 67 | An eLibrary of Microbial Systematics and Genomics
NODE | 31 | The National Omics Data Encyclopedia
iEKPD | 29 | Integrated annotations for Eukaryotic protein Kinases, protein Phosphatases & phosphoprotein-binding Domains
PLMD | 26 | Protein Lysine Modifications Database
hTFtarget | 15 | hTFtarget collects comprehensive human TF ChIP-Seq data and applies a customized analysis workflow to identify reliable TF targets while taking epigenomic states into account
CGGA | 11 | Chinese Glioma Genome Atlas
CGDB | 6 | Circadian Gene Database
DoriC | 2 | Database of Replication Origins
iUUCD | 2 | Integrated annotations for Ubiquitin and Ubiquitin-like Conjugation Database
GTDB | 1 | Glycosyltransferases Database
GWAS Atlas | 1 | GWAS Atlas is a curated resource of genome-wide variant-trait associations
OMix | 1 | OMix

Powered by EBISearch

Powered by NCBI Entrez