The 4th Big Data Forum for Life and Health Sciences (October 13-16, 2019)

Biological research has entered the era of big data, which spans a wide variety of omics data and a broad range of health data. Such data are generated at ever-growing rates and distributed throughout the world under heterogeneous standards and with varied, often limited, access. However, the promise of translating big data into big knowledge can be realized only if the data are publicly shared. Open access to omics and health big data is therefore essential for expediting this translation and is becoming increasingly vital for advancing scientific research, human healthcare, and precision medicine.

Open Biodiversity & Health Big Data

It is our great pleasure to announce that the 2019 Big Data Forum for Life and Health Sciences will be held on October 13-16, 2019. Several renowned biomedical data scientists have agreed to give talks. You are likewise cordially invited to share your work and participate in this exciting event.

We look forward to seeing you in Beijing, China! We will work hard to ensure that your stay is not only fruitful but also enjoyable!

Organizing Committee

Previous Conferences

Invited Speakers

Peer Bork

Professor
Structural and Computational Biology Unit
EMBL
Germany

Janusz M. Bujnicki

Professor of Biological Sciences
Laboratory of Bioinformatics and Protein Engineering
International Institute of Molecular and Cell Biology in Warsaw
Poland

Chao Chen

Professor
Faculty of Life Science
Central South University
China

Weihua Chen

Professor
College of Life Science & Technology
Huazhong University of Science and Technology
China

Xiangjun Du

Professor
School of Public Health (Shenzhen)
Sun Yat-sen University
China

Lizhi Gao

Professor
Kunming Institute of Botany
Chinese Academy of Sciences
China

Dianjing Guo

Associate Professor
School of Life Sciences
The Chinese University of Hong Kong
China

Heui-Soo Kim

Dean
College of Natural Sciences
Pusan National University
Korea

Chwan-Chuen King

Professor
Epidemiology and Preventive Medicine
Taiwan University
China

Sofia Kossida

Professor
University of Montpellier
IMGT®, the international ImMunoGeneTics information system®
France

Shelan Liu

Professor
Zhejiang Provincial Center for Disease Control and Prevention
China

Yuwen Liu

Professor
Agricultural Genomics Institute
Chinese Academy of Agricultural Sciences
China

Jian Lu

Professor
Center for Bioinformatics
School of Life Sciences, Peking University
China

Lijia Ma

Professor
Institute of Biology
Westlake University
China

Jiao Li

Professor
Institute of Medical Information and Library
Chinese Academy of Medical Sciences
China

Jeffrey Townsend

Professor
Center for Biomedical Data Science
Yale University
USA

Ana Tereza Ribeiro de Vasconcelos

Head of the Bioinformatics Laboratory
National Laboratory of Scientific Computation
Bioinformatics Laboratory
Brazil

Chen Wu

Professor
Cancer Hospital
Chinese Academy of Medical Sciences
China

Wen Wang

Professor
Research Center For Ecology and Environmental Science
Northwestern Polytechnical University
China

Xingming Zhao

Professor
Institute of Science and Technology for Brain-Inspired Intelligence
Fudan University
China

Zhaolei Zhang

Professor
Donnelly Centre for Cellular and Biomolecular Research
University of Toronto
Canada

Agenda (tentative)

The conference features selected talks (15 min) and lightning talks (5 min). Please submit your abstract to be considered for an oral presentation; junior researchers, postdocs, and graduate students are particularly encouraged to submit.
Online registration and abstract submission: https://bigd.big.ac.cn/cms/

October 13: Pick-up & Registration
October 14 Monday: Talks
09:00 - 10:20 Session 1, chaired by Zhang Zhang, BIG, CAS
09:00 - 09:10 Welcome and Opening Remarks
Zhang Zhang, On Behalf of the Organizing Committee
09:10 - 09:50 TBD
Peer Bork, EMBL, Germany
09:50 - 10:20 Single Cell Functional Genomics and Endometrium Cell Atlas
Lijia Ma, Westlake University, China
[Abstract]

Widely applied single-cell genomic technology, which labels individual cells with cellular barcodes, is accelerating research on cell heterogeneity at the levels of DNA, gene expression profiles, and chromatin topology. We applied this technology to endometrial tissue from patients with recurrent implantation failure (RIF) and from healthy controls, and characterized the distinct cell compositions of the two groups. We demonstrated that stromal cell diversity is significantly lower in RIF patients than in healthy controls, which is consistent with their differences in endometrial lining thickness. We also detected a gene expression pattern matching the mid-to-late luteal phase in RIF patients but not in healthy controls, which indicates an advanced or accelerated menstrual cycle in these patients. This study provides insight into applying single-cell technology to decipher cell composition during the endometrial cycle, and directly connects lower cellular diversity and advanced gene expression profiles to clinical symptoms.

10:20 - 10:40 Group Photo and Tea & Coffee Break
10:40 - 12:10 Session 2, chaired by Lijia Ma, Westlake University, China
10:40 - 11:10 Database Resources of the National Genomics Data Center
Yiming Bao, BIG, CAS
11:10 - 11:40 Epigenetic feature in gastric cancer
Dianjing Guo, The Chinese University of Hong Kong
11:40 - 12:10 TBD
Weihua Chen, Huazhong University of Science and Technology, China
12:10 - 13:30 Lunch and BIG tour
13:30 - 15:10 Session 3, chaired by Yiming Bao, BIG, CAS
13:30 - 14:10 IMGT®, the international ImMunoGeneTics information system®: 30 years of immunoinformatics, present endeavors and perspectives
Sofia Kossida, University of Montpellier, France
14:10 - 14:40 TBD
Xingming Zhao, Fudan University, China
14:40 - 15:10 TBD
Chao Chen, Central South University
15:10 - 15:30 Tea & Coffee Break
15:30 - 17:20 Session 4, chaired by Wenming Zhao, BIG, CAS
15:30 - 16:10 TBD
Wen Wang, Northwestern Polytechnical University, China
16:10 - 16:40 TBD (Slots available, selected from submitted abstracts)
16:40 - 17:00 Genomics, Proteomics & Bioinformatics (GPB) — a rising journal in the field
Yuxia Jiao, Editor of Genomics Proteomics Bioinformatics
18:00 - 20:00 Welcome Dinner
October 15 Tuesday: Talks
09:00 - 10:10 Session 5, chaired by Yiming Bao, BIG, CAS
09:00 - 09:40 Application of next generation sequencing in leukemia – from bench to bedside
Zhaolei Zhang, University of Toronto, Canada
09:40 - 10:10 Data-Driven Medical Studies: Efforts from National Scientific Data Center for Population and Health
Jiao Li, Institute of Medical Information and Library, Chinese Academy of Medical Sciences
[Abstract]

The medical research paradigm is changing, from symptom-based medicine to evidence-based medicine to precision medicine. Nowadays, at the very beginning of a medical study, the principal investigator needs to prepare a detailed data management plan (DMP) covering scientific data storage, data licensing, data sharing, data preservation, etc. In this talk, I will introduce the efforts made by the National Scientific Data Center for Population and Health in scientific data collection, data curation, data sharing, and the development of data management tools. Furthermore, I would like to discuss the technical and management challenges in data-driven medical studies.

10:10 - 10:30 Tea & Coffee Break
10:30 - 12:10 Session 6, chaired by Jingfa Xiao, BIG, CAS
10:30 - 11:10 Genomic Impact of Transposable Elements and Its Biological Function
Heui-Soo Kim, Pusan National University, Korea
11:10 - 11:30 TBD
Shuhui Song, BIG, CAS
11:30 - 12:10 TBD
Lizhi Gao, Kunming Institute of Botany, Chinese Academy of Sciences
12:10 - 13:30 Lunch
13:30 - 15:10 Session 7, chaired by Lina Ma, BIG, CAS
13:30 - 14:10 Structural bioinformatics of RNA molecules: integration of data from various sources at different stages of computational 3D structure modeling
Janusz Bujnicki, International Institute of Molecular and Cell Biology in Warsaw, Poland
14:10 - 14:40 Integrated epidemiological informatic system for dengue to influenza - From Surveillance to Epidemiological Investigation and Public Health Policies
Chwan-Chuen King, Taiwan University, China
14:40 - 15:20 TBD
Chen Wu, Cancer Hospital Chinese Academy of Medical Sciences
15:20 - 15:40 Tea & Coffee Break
15:40 - 17:00 Session 8, chaired by Shuhui Song, BIG, CAS
15:40 - 16:10 TBD
Jiang Liu, Beijing Institute of Genomics, Chinese Academy of Sciences
16:10 - 17:00 TBD (Slots available, selected from submitted abstracts)
October 16 Wednesday: Talks
09:00 - 10:10 Session 9, chaired by Zhang Zhang, BIG, CAS
09:00 - 09:40 Mutation, selection, and the somatic evolution of cancer
Jeffrey Townsend, Yale University, United States
09:40 - 10:10 Developing Experimental and Computational Methods to Study the Genetic Basis of Complex Diseases and Traits in the Big Data Era
Yuwen Liu, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences
[Abstract]

With the accumulation of genotype and phenotype data from an increasingly large number of individuals, genome-wide association studies have identified many genetic variants associated with complex diseases and traits. However, we lack high-throughput experimental tools to understand the biological functions of these variants. We also lack computational methods to integrate multi-omics data, or different types of “big biology data”, to explore the genetic basis of complex diseases and traits. In my talk, I will present two experimental methods and one computational method that we have built or extended in the field of human genetics. I will also briefly discuss my future work, which focuses on using big data in animal genetics to accelerate livestock genetic improvement.

10:10 - 10:30 Tea & Coffee Break
10:30 - 12:10 Session 10, chaired by Lili Hao, BIG, CAS
10:30 - 11:10 From virus to human genome through bioinformatics
Ana Tereza Ribeiro de Vasconcelos, National Laboratory of Scientific Computation, Brazil
[Abstract]

Bringing together the expertise of bioinformaticians and geneticists is crucial, since very specific and fundamental computational approaches are required for virus, microorganism, and human genome research, particularly in an era of big data. Improving existing analytical tools, computational resources, and data sharing approaches, developing new diagnostic tools, and providing bioinformatics training are all essential. In this talk I will present results, obtained in collaboration with several research groups in Brazil, related to the Zika virus, neglected and genetic diseases, and resistance to antibiotics.

11:10 - 11:40 Effects of long-term exposure to air pollution on the resurgence of scarlet fever in China: a 15-year surveillance study
Shelan Liu, Zhejiang Provincial Center for Disease Control and Prevention, China
[Abstract]

Background: Some studies have suggested a relationship between scarlet fever and meteorology and air pollution in specific regions or cities, but the mechanism behind the resurgence of scarlet fever in China after 2011 has not been interpreted from a nationwide perspective.
Objectives: To investigate the association between the resurgence of scarlet fever and long-term exposure to air pollutants and weather conditions in China.
Methods: Data on scarlet fever were obtained from the National Notifiable Disease Reporting System. Air pollutant data were from the National Environmental Protection Department, and weather data were from the National Meteorological Information Center. A distributed lag non-linear model (DLNM) was used to estimate the excess risk of scarlet fever associated with air pollutants and weather conditions.
Results: A total of 655,039 scarlet fever cases were reported during 2004-2018. Incidence started to surge in 2011 (4.7638 per 100 000) and peaked in 2018 (5.6736 per 100 000). The average incidence in 2011-2018 was more than twice that in 2004-2010 (rate ratio=2.302; 95% CI: 2.289-2.314; p<0.001). There was a low-to-moderate correlation between scarlet fever and monthly NO₂ concentration (r=0.21), sunlight (r=0.28), and wind speed (r=0.24), whereas the correlations with other pollutants were weak [PM10 (r=0.13), ozone (r=0.11), and PM2.5 (r=0.05)]. By contrast, scarlet fever was inversely correlated with monthly relative humidity (RH, r=-0.37), precipitation (r=-0.24), and mean temperature (r=-0.2). A one-unit increment in NO₂ concentration was associated with an increased risk of scarlet fever (RR=1.5, 95% CI: 1.35-1.66). At lag 0, the RR was 1.15 at an NO₂ concentration of 42 μg/m³. Pooled NO₂ estimates varied substantially across China (RR=1.02-4.16) but were higher in northern parts.
Conclusions: Long-term exposure to ambient NO₂ was associated with the resurgence of scarlet fever, triggered by low RH, temperature, and precipitation, and by high wind speed and sunshine. Further interventions to reduce NO₂ emissions might suppress the resurgence of scarlet fever.

11:40 - 12:10 Big Data Integration Based on Mechanistic Model
Xiangjun Du, School of Public Health (Shenzhen), Sun Yat-sen University, China
[Abstract]

Taking the transmission dynamics of seasonal influenza virus as an example, I will show how theoretical modelling can address critical questions in life and health sciences by integrating big data from multiple sources. I will present recent successes in long-term incidence forecasting of seasonal influenza for the United States, as well as the challenges posed by the complexity of pathogen evolution. Data integration based on mechanistic models is a way forward for studying complex diseases in life and health sciences.

12:10 - 13:30 Lunch
13:30 - 15:10 Session 11, chaired by Zhenglin Du, BIG, CAS
13:30 - 14:10 Less is more: Cancer cells evolve to use amino acids more economically
Jian Lu, Peking University, China
[Abstract]

Rapidly proliferating cancer cells have a much higher demand for proteinogenic amino acids than normal cells. The use of amino acids in human proteomes is largely affected by their bioavailability, which is constrained by the biosynthetic energy cost in living organisms. Conceptually distinct from gene-based analyses, we introduce the energy cost per amino acid (ECPA) to quantitatively characterize the use of the 20 amino acids during protein synthesis in human cells. By analyzing gene expression data from The Cancer Genome Atlas, we find that cancer cells evolve to use amino acids more economically by optimizing gene expression profiles. We further validate this pattern in an experimental evolution of xenograft tumors. ECPA not only shows robust prognostic power across many cancer types, but also improves the prediction of tumor response to checkpoint inhibitor immunotherapy. Our ECPA analysis reveals a common principle during cancer evolution.

14:10 - 15:25 TBD (Slots available, selected from submitted abstracts)
15:25 - 15:45 TBD
Zhenglin Du, BIG, CAS
15:45 - 16:00 Closing Remarks
Zhang Zhang, BIG, CAS

Hotel