National Genomics Data Center

BIG Search

BIG Search is a scalable text search engine built on Elasticsearch (a highly scalable open-source full-text search and analytics engine based on Apache Lucene). It supports cross-domain search, giving users access to a wide range of biomedical data from NGDC databases as well as partner databases throughout the world.
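
As a rough illustration of what a cross-domain query against an Elasticsearch backend can look like, the sketch below builds a search request body that matches one term across all fields and counts hits per index. It is not BIG Search's actual query: the one-index-per-database layout, the `["*"]` field wildcard, and the names `build_cross_domain_query` and `per_database` are assumptions made for the example. Only standard Elasticsearch query DSL elements (`multi_match`, a `terms` aggregation on `_index`) are used, and no request is sent.

```python
import json


def build_cross_domain_query(term, size=10):
    """Return an Elasticsearch search body for a single search term.

    Assumption: each database lives in its own index, so aggregating
    on the built-in `_index` field yields per-database hit counts.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": term,
                "fields": ["*"],   # hypothetical: search every field
                "lenient": True,   # ignore type mismatches on non-text fields
            }
        },
        "aggs": {
            # hypothetical aggregation name; buckets are keyed by index name
            "per_database": {"terms": {"field": "_index", "size": 50}}
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to a `_search` endpoint.
    print(json.dumps(build_cross_domain_query("tp53"), indent=2))
```

Aggregating on `_index` is one simple way to obtain per-database counts in a single request; a real deployment may use a dedicated source field instead.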

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator

25,478,445 records from 44 NGDC & Partner databases.

Database | Records | Description
GVM | 16,739,594 | Genome Variation Map
lncRNASNP2 | 4,443,771 |
RMVar | 1,615,252 | RNA Modification associated variants database
circAtlas | 610,406 | circAtlas 2.0
EWAS Data Hub | 597,253 | A data hub of DNA methylation array data and metadata
LncBook | 409,204 | A curated knowledgebase of human long non-coding RNAs
EWAS Atlas | 262,089 | A knowledgebase of epigenome-wide association studies
BBCancer | 137,210 | BBCancer: an expression atlas of blood-based biomarkers in the early diagnosis of cancers
LncExpDB | 101,293 | Expression Database of Human Long non-coding RNAs
GenTree | 63,151 | GenTree, the time tree of genes along the evolutionary history
MethBank SRMs | 60,479 | MethBank, Single-base Resolution Methylomes (SRMs)
MethBank CRMs | 60,415 | MethBank, Consensus Reference Methylomes (CRMs)
2019 Novel Coronavirus Resources | 58,773 | 2019nCoVR integrates public sequences from GISAID, NCBI, CNGB and CNCB/NGDC
SEGreg | 53,156 | Database of specifically expressed genes and regulation
vcg | 43,801 | Virtual Chinese Genome Database, a dynamic genome database of the Chinese population
BioSample | 37,586 | Biological Sample Library
CancerSEA | 34,227 | CancerSEA: a cancer single-cell state atlas
EPSD | 30,679 | Eukaryotic Phosphorylation Site Database
DEG | 28,458 | Database of Essential Genes
lnCAR | 28,420 | lnCAR: a comprehensive resource for lncRNAs from Cancer Arrays
Gene Expression Nebulas | 19,446 | Gene Expression Nebulas (GEN) is a data portal of gene expression profiles under various conditions derived entirely from RNA-Seq data analysis in multiple species
dbPAF | 18,792 | Database of Phospho-sites in Animals and Fungi
GSA | 12,536 | Genome Sequence Archive
ZCURVE_CoVdb | 7,054 | Database of Essential Genes
GWH | 2,054 | Genome Warehouse
Database Commons | 721 | Database Commons is a curated catalogue of biological databases, providing easy access to a comprehensive collection of publicly available biological databases encompassing different data types and spanning diverse organisms
BioCode | 675 | Archive Bioinformatics Codes for Open Source Projects
PTMD | 594 | A database of human disease-associated post-translational modifications
CellMarker | 467 | CellMarker: a manually curated resource of cell markers in human and mouse
BioProject | 226 | Biological Project Library
RhesusBase Genes | 208 |
GSA for Human | 184 | Genome Sequence Archive for Human
EDK | 110 | Editome Disease Knowledgebase
eLMSG | 67 | An eLibrary of Microbial Systematics and Genomics
iEKPD | 29 | Integrated annotations for Eukaryotic protein Kinases, protein Phosphatases & phosphoprotein-binding Domains
PLMD | 26 | Protein Lysine Modifications Database
hTFtarget | 15 | hTFtarget collects comprehensive human TF ChIP-Seq data and uses a customized analysis workflow to identify reliable TF targets while taking epigenomic states into account
CGGA | 11 | Chinese Glioma Genome Atlas
CGDB | 6 | Circadian Gene Database
DoriC | 2 | Database of Replication Origins
iUUCD | 2 | Integrated annotations for Ubiquitin and Ubiquitin-like Conjugation Database
GTDB | 1 | Glycosyltransferases Database
GWAS Atlas | 1 | GWAS Atlas is a curated resource of genome-wide variant-trait associations
OMix | 1 | OMix
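
The headline figure of 25,478,445 records from 44 databases can be checked against the per-database counts in the table above. The short sketch below does that arithmetic; the counts are copied from the table in order, while the names `RECORD_COUNTS` and `summarize` are made up for this example.

```python
# Per-database record counts, in the same order as the table rows.
RECORD_COUNTS = [
    16_739_594, 4_443_771, 1_615_252, 610_406, 597_253, 409_204,
    262_089, 137_210, 101_293, 63_151, 60_479, 60_415,
    58_773, 53_156, 43_801, 37_586, 34_227, 30_679,
    28_458, 28_420, 19_446, 18_792, 12_536, 7_054,
    2_054, 721, 675, 594, 467, 226,
    208, 184, 110, 67, 29, 26,
    15, 11, 6, 2, 2, 1, 1, 1,
]


def summarize(counts):
    """Return (number of databases, total number of records)."""
    return len(counts), sum(counts)


if __name__ == "__main__":
    n_databases, total = summarize(RECORD_COUNTS)
    print(f"{total:,} records from {n_databases} databases")
```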

Powered by EBISearch

Powered by NCBI Entrez