New model helps identify mutations that drive cancer

Cancer cells can have thousands of mutations in their DNA. However, only a handful of those actually drive the progression of cancer; the rest are just along for the ride.

Distinguishing these harmful driver mutations from neutral passengers could help researchers identify better drug targets. To boost those efforts, an MIT-led team has built a new computer model that can rapidly scan the entire genome of cancer cells and identify mutations that occur more frequently than expected, suggesting that they are driving tumor growth. This type of prediction has been challenging because some genomic regions have an extremely high frequency of passenger mutations, drowning out the signal from real drivers.

“We created a probabilistic, deep-learning method that allowed us to get a really accurate model of the number of passenger mutations that should exist anywhere in the genome,” says Maxwell Sherman, an MIT graduate student. “Then we can look all over the genome for regions where there is an unexpected accumulation of mutations, which suggests that these are driver mutations.”

In their new study, the researchers found additional mutations across the genome that appear to contribute to tumor growth in 5 to 10 percent of cancer patients. The researchers say the findings could help doctors identify the drugs most likely to treat these patients successfully. Currently, at least 30 percent of cancer patients have no detectable driver mutation that can be used to guide treatment.

Sherman, MIT graduate student Adam Yaari, and former MIT research assistant Oliver Priebe are the lead authors of the study, which appears today in Nature Biotechnology. Bonnie Berger, MIT’s Simons Professor of Mathematics and head of the Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory (CSAIL), is a senior author of the study, along with Po-Ru Loh, an assistant professor at Harvard Medical School and associate member of the Broad Institute of MIT and Harvard. Felix Dietlein, an associate professor at Harvard Medical School and Boston Children’s Hospital, is also an author of the paper.

A new tool

Since the human genome was sequenced two decades ago, researchers have been scanning the genome to try to find mutations that contribute to cancer by causing cells to grow uncontrollably or to evade the immune system. This has successfully yielded targets such as the epidermal growth factor receptor (EGFR), which is commonly mutated in lung tumors, and BRAF, a common driver of melanoma. Both of these mutations can now be targeted by specific drugs.

Although these targets have proven useful, protein-coding genes make up only about 2 percent of the genome. The other 98 percent also contains mutations that can occur in cancer cells, but it has been much harder to determine whether any of those mutations contribute to the development of cancer.

“There’s really been a lack of computational tools that allow us to look for these driver mutations outside of the regions that encode proteins,” says Berger. “That’s what we were trying to do here: design a computational method that allows us to look at not just the 2 percent of the genome that encodes proteins, but 100 percent of it.”

To do this, the researchers trained a type of computational model known as a deep neural network to look for mutations in cancer genomes that occur more frequently than expected. As a first step, they trained the model on genomic data from 37 different types of cancer, which allowed the model to determine the background mutation rates for each of those types.

“The best thing about our model is that you train it once for a certain type of cancer and it learns the mutation rate everywhere across the genome simultaneously for that particular type of cancer,” says Sherman. “Then you can compare the mutations you see in a cohort of patients against the number of mutations you should expect to see.”

The data used to train the models came from the Roadmap Epigenomics Project and an international collection of data called the Pan-Cancer Analysis of Whole Genomes (PCAWG). The model’s analysis of this data gave the researchers a map of the expected passenger mutation rate across the genome, so that the expected rate in any set of regions (down to a single base pair) can be compared with the observed mutation count anywhere in the genome.
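The comparison of observed and expected counts can be illustrated with a minimal sketch. The article does not say which statistical test the team used, so the Poisson model here is a simplifying assumption, and the region names, counts, and threshold are all hypothetical values chosen for illustration.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Assumption: passenger mutation counts in a region are treated as
    Poisson-distributed. This is a stand-in, not the study's actual test.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0
    total = 0.0
    k = observed
    while True:
        # Each term is computed in log space to avoid overflow and underflow.
        term = math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
        total += term
        # Past the mean the terms shrink; stop once they no longer matter.
        if k > expected and term <= total * 1e-16:
            break
        k += 1
    return min(total, 1.0)


def flag_excess(regions, alpha=1e-3):
    """Return names of regions whose observed count is unexpectedly high."""
    return [
        name
        for name, expected, observed in regions
        if poisson_upper_tail(observed, expected) < alpha
    ]


# Hypothetical (name, expected passenger count, observed count) triples.
regions = [
    ("region_A", 4.0, 5),    # close to expectation
    ("region_B", 4.0, 20),   # far above a low background rate
    ("region_C", 60.0, 70),  # many mutations, but the background is high too
]

print(flag_excess(regions))
```

Under these assumptions only region_B is flagged. region_C carries more mutations in absolute terms, but its high background rate explains them, which is why an accurate estimate of the passenger rate matters.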

Changing the landscape

Using this model, the MIT team was able to add to the known landscape of mutations that can drive cancer. Currently, when tumors from cancer patients are screened for cancer-causing mutations, a known driver turns up about two-thirds of the time. The new results from the MIT study offer possible driver mutations for an additional 5 to 10 percent of the patient pool.

One type of non-coding mutation that the researchers focused on is called a “cryptic splicing mutation.” Most genes consist of exon sequences, which encode protein-building instructions, and introns, which are spacer elements that are normally removed from messenger RNA before it is translated into protein. Cryptic splicing mutations are found in introns, where they can confuse the cellular machinery that splices them out. This causes introns to be included when they should not be.
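A toy sketch can make the effect of intron retention concrete. The sequences below are invented for illustration, and real splicing depends on splice-site signals that this sketch does not model; it only shows how the mature transcript changes when an intron is kept.

```python
# Hypothetical RNA fragments, not taken from any real gene.
EXON_1 = "AUGGCC"
INTRON = "GUAAAG"
EXON_2 = "GGUUAA"

PRE_MRNA = EXON_1 + INTRON + EXON_2


def splice(pre_mrna: str, kept_segments: list[tuple[int, int]]) -> str:
    """Join the kept (start, end) segments of a transcript, in order."""
    return "".join(pre_mrna[start:end] for start, end in kept_segments)


# Normal splicing keeps the two exons and drops the intron between them.
exon_1_span = (0, len(EXON_1))
exon_2_span = (len(EXON_1) + len(INTRON), len(PRE_MRNA))
normal = splice(PRE_MRNA, [exon_1_span, exon_2_span])

# Simplified mis-splicing: the intron is retained, so the whole
# transcript is kept as one segment.
retained = splice(PRE_MRNA, [(0, len(PRE_MRNA))])

print(normal)
print(retained)
```

The retained transcript carries extra sequence between the exons, which changes the protein-building instructions that follow.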

Using their model, the researchers found that many cryptic splicing mutations appear to disrupt tumor suppressor genes. When these mutations are present, the tumor suppressors are spliced incorrectly and stop working, and the cell loses one of its defenses against cancer. The cryptic splicing sites that the researchers found in this study account for about 5 percent of the driver mutations found in tumor suppressor genes.

Targeting these mutations could offer a new way to treat these patients, the researchers say. One possible approach, which is still under development, uses short strands of RNA called antisense oligonucleotides (ASOs) to patch over a mutated piece of DNA with the correct sequence.

“If you could make the mutation go away in some way, you would solve the problem. These tumor suppressor genes could continue to work and maybe fight cancer,” says Yaari. “ASO technology is being actively developed, and this could be a very good application for that.”

Another region where the researchers found a high concentration of non-coding driver mutations is in the untranslated regions of some tumor suppressor genes. The tumor suppressor gene TP53, which is defective in many types of cancer, was already known to accumulate many deletions in these sequences, known as 5' untranslated regions. The MIT team found the same pattern in a tumor suppressor called ELF3.

The researchers also used their model to investigate whether common mutations that were already known might also be driving other types of cancer. As one example, they found that BRAF, previously linked to melanoma, also contributes to cancer progression in smaller percentages of other cancers, including pancreatic, liver, and gastroesophageal cancers.

“That says there’s a lot of overlap between the landscape of common drivers and the landscape of rare drivers. That offers an opportunity for therapeutic repurposing,” Sherman says. “These results could help guide the clinical trials we should set up to expand these drugs from being approved in one cancer to being approved in many cancers and being able to help more patients.”

The research was funded in part by the National Institutes of Health and the National Cancer Institute.
