All Data

It is highly recommended that you download the dataset using a dedicated FTP tool, such as FileZilla Client.
Host address: ftp://download.big.ac.cn/ewas/datahub/EWAS_db/

Baseline Data

It is highly recommended that you download the dataset using a dedicated FTP tool, such as FileZilla Client.
Host address: ftp://download.big.ac.cn/ewas/datahub/download/

| Description | Download (.txt & .RData) | File Size |
|---|---|---|
| DNA methylation profiles of 31 organism parts | tissue_methylation.zip | 7.7 GB |
| Sample information of DNA methylation profiles of 31 organism parts | sample_tissue_methylation.zip | 62.2 KB |
| DNA methylation profiles of 25 brain parts | brain_methylation.zip | 2.77 GB |
| Sample information of DNA methylation profiles of 25 brain parts | sample_brain_methylation.zip | 27.2 KB |
| DNA methylation profiles of 25 blood cell types | blood_methylation.zip | 4.86 GB |
| Sample information of DNA methylation profiles of 25 blood cell types | sample_blood_methylation.zip | 41.8 KB |
| DNA methylation profiles of male and female in 24 tissues | sex_methylation.zip | 4.33 GB |
| Sample information of DNA methylation profiles of male and female in 24 tissues | sample_sex_methylation.zip | 38 KB |
| DNA methylation changes with age | age_methylation.zip | 11.73 GB |
| Sample information of DNA methylation changes with age | sample_age_methylation.zip | 89.3 KB |
| DNA methylation profiles of 6 ancestry categories | ancestry_category_methylation.zip | 1.96 GB |
| Sample information of DNA methylation profiles of 6 ancestry categories | sample_ancestry_category_methylation.zip | 20.9 KB |
| DNA methylation changes with BMI | bmi_methylation.zip | 3.06 GB |
| Sample information of DNA methylation changes with BMI | sample_bmi_methylation.zip | 28.4 KB |
| DNA methylation profiles of 39 cancers | cancer_methylation.zip | 16.07 GB |
| Sample information of DNA methylation profiles of 39 cancers | sample_cancer_methylation.zip | 117.3 KB |
| DNA methylation profiles of 28 diseases | disease_methylation.zip | 20.11 GB |
| Sample information of DNA methylation profiles of 28 diseases | sample_disease_methylation.zip | 154.1 KB |
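
For scripted downloads, the files listed above could also be fetched with Python's standard `ftplib` instead of a graphical client. The following is a minimal sketch rather than an official download tool: it assumes the server accepts anonymous logins and supports the FTP `REST` command for resuming partial transfers, neither of which is stated on this page. The helper names `split_ftp_url` and `download` are hypothetical.

```python
import os
from ftplib import FTP
from urllib.parse import urlparse

# Host address of the baseline data, as given on this page.
BASELINE_URL = "ftp://download.big.ac.cn/ewas/datahub/download/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path or "/"


def download(url, filename, dest_dir=".", timeout=60):
    """Download one file, resuming from a partial local copy if one exists."""
    host, directory = split_ftp_url(url)
    dest = os.path.join(dest_dir, filename)
    # Bytes already on disk; used as the restart offset (assumes REST support).
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login: an assumption about this server
        ftp.cwd(directory)
        with open(dest, "ab") as fh:
            ftp.retrbinary(f"RETR {filename}", fh.write,
                           blocksize=1 << 16, rest=offset or None)
    return dest


# Example call (performs a network transfer, so it is left commented out):
# download(BASELINE_URL, "sample_tissue_methylation.zip")
```

Because several archives exceed 10 GB, resuming from the local file size avoids restarting a dropped transfer from zero. Delete a fully downloaded file before calling `download` again, since the sketch does not check whether the local copy is already complete.
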
Gaussian Mixture Quantile Normalization (GMQN)

Methods

To remove batch effects and other unwanted noise, we developed Gaussian Mixture Quantile Normalization (GMQN), a reference-based method that removes unwanted technical variation at the signal-intensity level. GMQN adjusts for batch effects as well as the bias associated with type II probe values in 450K and EPIC/850K studies. The principle behind the method is that the signal intensity of each channel follows a two-component Gaussian mixture distribution. The first component is the background signal, which has a mean slightly greater than 0. The second component is the signal from probes that have hybridized successfully to the input DNA. The variance of the second component is much larger than that of the first because the degree of hybridization differs among probes.
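
The mixture principle described above can be illustrated with a small simulation. This is a minimal sketch, not the GMQN implementation: it draws intensities from a narrow background component and a broad signal component, then recovers both with a plain expectation-maximization (EM) fit. The simulated means, standard deviations, and mixing proportion are invented for illustration, and the function name `fit_two_gaussians` is hypothetical.

```python
import math
import random


def fit_two_gaussians(values, n_iter=200, tol=1e-9):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds), ordered so component 0 has the lower mean.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least four values")
    half = n // 2
    # Crude initialisation: split the sorted data at the median.
    groups = [xs[:half], xs[half:]]
    weights = [0.5, 0.5]
    means = [sum(g) / len(g) for g in groups]
    sds = [max(math.sqrt(sum((x - m) ** 2 for x in g) / len(g)), 1e-6)
           for g, m in zip(groups, means)]
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        ll = 0.0
        for x in xs:
            dens = [
                weights[k]
                * math.exp(-0.5 * ((x - means[k]) / sds[k]) ** 2)
                / (sds[k] * math.sqrt(2.0 * math.pi))
                for k in range(2)
            ]
            total = max(dens[0] + dens[1], 1e-300)  # guard against underflow
            resp.append((dens[0] / total, dens[1] / total))
            ll += math.log(total)
        # M-step: update weights, means, and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
        # Stop once the log-likelihood no longer changes appreciably.
        if abs(ll - prev_ll) <= tol * abs(ll):
            break
        prev_ll = ll
    order = sorted(range(2), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])


# Simulated single-channel intensities (all parameters are illustrative).
rng = random.Random(0)
background = [rng.gauss(300.0, 100.0) for _ in range(600)]   # narrow, near 0
signal = [rng.gauss(6000.0, 2000.0) for _ in range(1400)]    # broad, hybridized
weights, means, sds = fit_two_gaussians(background + signal)
print("weights:", [round(w, 3) for w in weights])
print("means:  ", [round(m, 1) for m in means])
print("sds:    ", [round(s, 1) for s in sds])
```

In this simulation the fitted background component should come out much narrower than the signal component, mirroring the variance argument above.
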

The goal of GMQN is to rescale the signal intensities so that the two Gaussian components have the same mean and variance across arrays. GMQN proceeds in four steps.

  1. Fit a two-component Gaussian mixture model to the median signal intensity of each type I probe in a large single study (GEO accession: GSE105018). The means and variances of the two components serve as the reference for rescaling type I probes.
  2. Fit a two-component Gaussian mixture model to the type I probe signal intensities of the input data.
  3. For the type I probes in each component of the input data, transform their probabilities to quantiles using the inverse cumulative Gaussian distribution with the mean and variance estimated from the corresponding reference component.
  4. Calculate beta values and normalize type II probes on the basis of type I probes using BMIQ.
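
Step 3 and the beta-value part of step 4 can be sketched as follows. This is an illustration under stated assumptions, not the published GMQN code. The mixture parameters are taken as already fitted, and the numbers in the example are invented. Each probe is assigned to its most probable input component (a hard assignment, which is an assumption). The beta-value offset of 100 is the conventional Illumina default rather than a value given in this text. BMIQ itself is not implemented here. The names `rescale_to_reference` and `beta_value` are hypothetical.

```python
import math
from statistics import NormalDist


def rescale_to_reference(values, input_fit, reference_fit, eps=1e-12):
    """Map intensities onto the reference mixture, component by component.

    input_fit and reference_fit are lists of (weight, mean, sd) tuples in the
    same order, e.g. [background, signal].
    """
    in_comps = [NormalDist(m, s) for _, m, s in input_fit]
    ref_comps = [NormalDist(m, s) for _, m, s in reference_fit]
    rescaled = []
    for x in values:
        # Log-posterior (up to a constant) of each input component;
        # working in logs avoids underflow for values far from a component.
        log_post = [
            math.log(w) - math.log(s) - 0.5 * ((x - m) / s) ** 2
            for w, m, s in input_fit
        ]
        k = 0 if log_post[0] >= log_post[1] else 1  # hard assignment (assumed)
        # Probability of x under its input component ...
        p = in_comps[k].cdf(x)
        p = min(max(p, eps), 1.0 - eps)  # inv_cdf requires 0 < p < 1
        # ... converted to a quantile of the matching reference component.
        rescaled.append(ref_comps[k].inv_cdf(p))
    return rescaled


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value M / (M + U + offset); offset=100 is the usual convention."""
    return methylated / (methylated + unmethylated + offset)


# Illustrative parameters only: (weight, mean, sd) for background and signal.
input_fit = [(0.3, 300.0, 100.0), (0.7, 6000.0, 2000.0)]
reference_fit = [(0.3, 250.0, 80.0), (0.7, 5000.0, 1500.0)]
print(rescale_to_reference([300.0, 6000.0, 8000.0], input_fit, reference_fit))
print(beta_value(3000.0, 1000.0))
```

Because both the input and the reference components are Gaussian, this probability-to-quantile mapping is equivalent, away from the clamped tails, to standardizing each value within its input component and rescaling it with the reference mean and standard deviation.
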

Citation

GMQN: A Reference-Based Method for Correcting Batch Effects and Probe Bias in HumanMethylation BeadChip. Front. Genet. 2022. PMID: 35069703