Database Commons

a catalog of worldwide biological databases

Database Profile

General information

URL: https://portal.gdc.cancer.gov
Full name: The Cancer Genome Atlas
Description: A landmark cancer genomics program that molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types.
Year founded: 2013
Last update:
Version:
Accessibility (manual check): Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object: NA
Database category:
Major species:
Keywords:

Contact information

University/Institution: National Cancer Institute
Address: Center for Cancer Genomics, 31 Center Drive, Room 10A11, Bethesda, MD 20892
City: Bethesda
Province/State:
Country/Region: United States
Contact name (PI/Team): Harold Varmus
Contact email (PI/Helpdesk): tcga@mail.nih.gov

Publications

Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons' Data. [PMID: 31344359]
Gao GF, Parker JS, Reynolds SM, Silva TC, Wang LB, Zhou W, Akbani R, Bailey M, Balu S, Berman BP, Brooks D, Chen H, Cherniack AD, Demchok JA, Ding L, Felau I, Gaheen S, Gerhard DS, Heiman DI, Hernandez KM, Hoadley KA, Jayasinghe R, Kemal A, Knijnenburg TA, Laird PW, Mensah MKA, Mungall AJ, Robertson AG, Shen H, Tarnuzzer R, Wang Z, Wyczalkowski M, Yang L, Zenklusen JC, Zhang Z, Genomic Data Analysis Network, Liang H, Noble MS.

We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA), namely mRNA and miRNA expression, single nucleotide variants, DNA methylation, and copy number alterations, comprehensive sample-, gene-, and probe-level studies were performed, towards quantifying the degree of similarity between the 'legacy' GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as 'harmonized' by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve.

Cell Syst. 2019:9(1) | 72 Citations (from Europe PMC, 2024-04-20)
The Cancer Genome Atlas: Creating Lasting Value beyond Its Data. [PMID: 29625045]
Hutter C, Zenklusen JC.

The Cancer Genome Atlas (TCGA) team now presents the Pan-Cancer Atlas, investigating different aspects of cancer biology by analyzing the data generated during the 10+ years of the TCGA project.

Cell. 2018:173(2) | 331 Citations (from Europe PMC, 2024-04-20)
An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. [PMID: 29625055]
Liu J, Lichtenberg T, Hoadley KA, Poisson LM, Lazar AJ, Cherniack AD, Kovatich AJ, Benz CC, Levine DA, Lee AV, Omberg L, Wolf DM, Shriver CD, Thorsson V, Cancer Genome Atlas Research Network, Hu H.

For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale.

Cell. 2018:173(2) | 1595 Citations (from Europe PMC, 2024-04-20)
The NCI Genomic Data Commons as an engine for precision medicine. [PMID: 28600341]
Jensen MA, Ferretti V, Grossman RL, Staudt LM.

The National Cancer Institute Genomic Data Commons (GDC) is an information system for storing, analyzing, and sharing genomic and clinical data from patients with cancer. The recent high-throughput sequencing of cancer genomes and transcriptomes has produced a big data problem that precludes many cancer biologists and oncologists from gleaning knowledge from these data regarding the nature of malignant processes and the relationship between tumor genomic profiles and treatment response. The GDC aims to democratize access to cancer genomic data and to foster the sharing of these data to promote precision medicine approaches to the diagnosis and treatment of cancer.

Blood. 2017:130(4) | 125 Citations (from Europe PMC, 2024-04-20)
Toward a Shared Vision for Cancer Genomic Data. [PMID: 27653561]
Grossman RL, Heath AP, Ferretti V, Varmus HE, Lowy DR, Kibbe WA, Staudt LM.
N Engl J Med. 2016:375(12) | 756 Citations (from Europe PMC, 2024-04-20)
The Cancer Genome Atlas Pan-Cancer analysis project. [PMID: 24071849]
Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM.

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

Nat Genet. 2013:45(10) | 3709 Citations (from Europe PMC, 2024-04-20)

Ranking

All databases: 16/6000 (99.75%)
Genotype phenotype and variation: 5/852 (99.531%)
Expression: 4/1143 (99.738%)
Structure: 5/841 (99.524%)
Health and medicine: 2/1394 (99.928%)
Modification: 2/287 (99.652%)
Total rank: 16
Citations: 6,448
z-index: 586.182
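
The percentages and the z-index listed above are not explained on this page. The figures are consistent with two simple formulas, sketched below as an inference from the numbers rather than as the catalog's documented method: a percentile of (1 - (rank - 1) / total) × 100, and a z-index equal to total citations divided by the database's age in years. The function names and the reference year 2024 (taken from the citation snapshot date) are assumptions made for illustration.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Share of databases ranked at or below this one, in percent.

    Assumed formula, inferred from the figures on this page.
    """
    return round((1 - (rank - 1) / total) * 100, 3)


def citations_per_year(citations: int, founded: int, as_of: int) -> float:
    """Citations divided by database age in years (assumed z-index definition)."""
    return round(citations / (as_of - founded), 3)


# (rank, total) pairs copied from the Ranking section.
rankings = {
    "All databases": (16, 6000),
    "Genotype phenotype and variation": (5, 852),
    "Expression": (4, 1143),
    "Structure": (5, 841),
    "Health and medicine": (2, 1394),
    "Modification": (2, 287),
}

for name, (rank, total) in rankings.items():
    print(f"{name}: {rank}/{total} ({rank_percentile(rank, total)}%)")

# 6,448 citations, founded 2013, snapshot year 2024 (assumed).
print("z-index:", citations_per_year(6448, 2013, 2024))
```

Under these assumptions the script reproduces every percentage in the list and the z-index of 586.182, which suggests, but does not prove, that the catalog computes them this way.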

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2019-08-01
Curated by:
Zhang Zhang [2021-10-08]
Lina Ma [2019-12-26]
Dong Zou [2019-08-01]