Most of the human genome – 98 percent – is made up of DNA that doesn’t actually encode genes, the recipes cells use to build proteins. The vast majority of genetic mutations associated with cancer occur in these noncoding regions of the genome, yet it’s unclear how they might influence tumor development or growth.
Scientists in the U.S. have identified nearly 200 mutations in noncoding DNA that play functional roles in different cancers. Most of these loci, termed somatic expression quantitative trait loci (eQTLs), appear to affect a set of core pathways, which makes it possible to classify tumors into pathway-based subtypes, suggest the researchers at the University of California, San Diego (UCSD) and Stanford University School of Medicine.
“Most cancer-related mutations occur in regions of the genome outside of genes, but there are so incredibly many of them that it’s hard to know which are actually relevant and which are merely noise,” says senior author Trey Ideker, Ph.D., professor at UC San Diego School of Medicine and Moores Cancer Center. “Here for the first time we found about 200 mutations in noncoding DNA that are functional in cancer – and that is about 199 more than we knew before.”
The findings indicate that the identified network of eQTLs is disrupted in 88% of tumors, “suggesting widespread impact of noncoding mutations in cancer,” the team writes in their published paper in Nature Genetics, which is entitled “A Global Transcriptional Network Connecting Noncoding Mutations to Changes in Tumor Gene Expression.”
Human tumors are hugely complex, and there are many subtypes that display different molecular, cellular and clinical characteristics, the authors explain. However, protein-coding regions account for less than 2% of the human genome and, as the UCSD-led team points out, “attention is now shifting to the greater number of somatic mutations in noncoding regions.” Cancer genomes are “replete” with noncoding mutations, they write, but we still don’t really understand which noncoding mutations are relevant to cancer development and progression.
While initiatives such as The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) have tried to gain new insights into the complexity of different tumor types, their initial focus was on protein-coding regions, although TCGA did identify a cancer-related noncoding mutation in the promoter of the telomerase reverse transcriptase (TERT) gene. And while recurrent somatic mutations have been found at other noncoding loci, as the UCSD team points out, “assessing the function of these mutations, if any, has been challenging.”
To identify noncoding mutations that are associated with functional effects in cancer, Dr. Ideker’s team turned back to TCGA data and carried out a systematic analysis of 930 tumors, integrating whole genome sequences, matched mRNA expression profiles, and reference transcriptional interaction maps. “The secret sauce was to look for changes in gene expression,” notes Dr. Ideker, who is also founder of the UC San Diego Center for Computational Biology and Bioinformatics and co-director of the Cancer Cell Map Initiative.
From their resulting datasets and further analyses, the team identified 193 noncoding loci in which mutations were found to disrupt target gene expression. Most of these identified loci were validated in a second, independent cohort of 3,382 tumors from the ICGC.
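For readers who want a concrete picture of what “looking for changes in gene expression” can mean, the idea can be sketched as a simple comparison: split tumors by whether they carry a mutation at a given noncoding locus, then ask whether expression of a nearby gene differs between the two groups more than chance would allow. The Python sketch below is an illustration only and is not the authors’ pipeline, which also drew on reference transcriptional interaction maps. The function name, the permutation test, and the toy numbers are all assumptions made for this example.

```python
import random
from statistics import mean


def somatic_eqtl_permutation_test(expression, mutated, n_permutations=2000, seed=0):
    """Toy somatic eQTL test (hypothetical helper, not from the study).

    expression: expression level of one target gene, one value per tumor
    mutated:    True if that tumor carries a mutation at the noncoding locus
    Returns (difference in mean expression, two-sided permutation p-value).
    """
    mut = [e for e, m in zip(expression, mutated) if m]
    wt = [e for e, m in zip(expression, mutated) if not m]
    if not mut or not wt:
        raise ValueError("need at least one mutated and one unmutated tumor")

    observed = mean(mut) - mean(wt)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = list(expression)
    n_mut = len(mut)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle expression values to break any link with mutation status
        rng.shuffle(values)
        diff = mean(values[:n_mut]) - mean(values[n_mut:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented toy data: six unmutated tumors, four tumors mutated at the locus
expression = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 8.9, 9.4, 9.1, 8.7]
mutated = [False] * 6 + [True] * 4

effect, p = somatic_eqtl_permutation_test(expression, mutated)
print(f"mean expression shift: {effect:.3f}, permutation p-value: {p:.4f}")
```

In a real analysis a test of this kind would be repeated across many locus and gene pairs, so the resulting p-values would need correction for multiple testing, and confounders such as tissue type and copy number would have to be accounted for.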
Interestingly, while somatic eQTLs linked noncoding mutations to the expression levels of 13 known tumor-suppressor genes or oncogenes, most of the noncoding mutations were not associated with already known cancer-associated genes, the authors note. Rather, many of the genes affected by the eQTL mutations “are not yet widely appreciated as cancer drivers,” the team notes, “motivating further studies on the mechanistic basis of noncoding mutations in cancer.”
Many of the identified somatic eQTLs were found to be commonly mutated in specific cancer tissues. “Beyond the promoter of TERT, which is highly mutated in several tissues…we found recurrently mutated loci associated with expression of DHX34 (mutated in 43% of diffuse large B cell lymphoma), TUBBP5 (29% of lymphomas and 17% of liver cancers), HYI (21% of melanoma), and PCDH1 (19% of acute myeloid leukemia), among others,” the team writes. “Of the approximately 200 noncoding mutations that have previously been identified as recurrent in cancer, one-third were also identified here as recurrently mutated loci, including well known mutations in the promoters of PLEKHS1 and DPH3.”
And while most of the somatic eQTLs were mutated in more than one tissue, 12 were mutated almost exclusively in melanoma, with 80% or more of their mutations occurring in that cancer. “Such enrichment for a single tissue was not seen for any other tissue type,” the researchers point out.
Although it isn’t possible to be absolutely certain that the associations between the noncoding mutations and gene expression changes are all causal, the team did test a number of the noncoding mutations in cultured cells, to see their effects on the target genes and cellular function. “One example that stood out was a noncoding mutation affecting a gene called DAAM1,” adds first study author Wei Zhang, Ph.D., a postdoctoral researcher in Ideker’s lab. The team’s studies had identified a somatic eQTL upstream of DAAM1, which the results from both patient cohorts showed was recurrently mutated in patients with metastatic melanoma.
Subsequent in vitro studies in which mutations were introduced directly into cultured cells indicated that the mutation causes upregulation of DAAM1 expression, which increases cells’ ability to invade the local microenvironment, “thereby linking this noncoding mutation to DAAM1 overexpression and cell invasion,” they write. “DAAM1 activation makes tumor cells more aggressive, and better able to invade surrounding tissues,” Dr. Zhang states. The team tested out another two of the identified noncoding mutations in cells in vitro to support a causal link between the mutations and gene expression changes.
The researchers also investigated the relationship between the 196 genes that were transcriptionally regulated by somatic eQTLs, and the 138 genes that have previously been documented to have recurrent coding mutations in cancer. “This approach identified a collection of network regions…that stratified tumors into a hierarchy of increasingly specific subtypes,” the authors write. Among these, four of the subtypes contained a large proportion of patients with noncoding mutations.
One, the “CDKN2A–EGFR–TERT subtype,” was characterized by disruption of the CDKN2A coding sequence, sometimes in combination with noncoding mutations to the TERT promoter, EGFR activation, or BRAF activation. A second tumor subtype, “TERT–BRAF–IDH1,” included tumors with TERT noncoding mutations or amplifications, which in some patients were also combined with coding alterations to functionally related genes such as BRAF and SKP2. The third subtype, “PIK3CA–PEX26–GATA3,” combined coding alterations activating PIK3CA and inactivating GATA3 with noncoding alterations downregulating PEX26. The fourth tumor subtype, “APOBEC2–ARID1A–CTNNB1,” combined noncoding mutations within an enhancer of APOBEC2 and coding alterations in ARID1A and CTNNB1.
The researchers plan to investigate whether there are different cancer subtypes that have a common pattern of both noncoding and coding mutations. One goal will be to find whether such mutational patterns can provide diagnostic or prognostic insights, or help clinicians to prescribe the most effective therapy for individual patients, based on their tumor’s noncoding and coding mutational profile.