New model reveals mutations that cause cancer

In this interview, News-Medical talks to Maxwell Sherman, an MIT graduate student and one of the lead authors of a study that used a new method to investigate cancer genomes.

Can you please introduce yourself, tell us about your scientific training and what inspired your latest research?

We are a multidisciplinary team of computer scientists, mathematicians and biologists lucky enough to work in the MIT and Harvard ecosystems. Most previous work that has attempted to identify mutations that drive the onset and progression of cancer has focused on the roughly 2% of the genome that encodes proteins. We wanted to empower the cancer research community to look at 100% of the genome for mutations that could cause cancer.

Cancer cells can have thousands of mutations in their DNA. What is the difference between a mutation that drives cancer progression and a relatively neutral mutation?

Cancer can be understood through the lens of Darwinian evolution. Driver mutations allow a cell to grow and divide more quickly, thus producing more cells as offspring. Cancer results from this cellular race: once a cell accumulates enough of these driver mutations, it can divide without limits, escape the immune system, and eventually spread to other tissues, all of which are hallmarks of cancer. On the other hand, neutral "passenger" mutations do not affect a cell's ability to grow or reproduce and therefore play no role in the Darwinian evolution of cells. The vast majority of somatic mutations in our cells appear to be neutral.

What do we currently know and do not know about the mutations that cause cancer?

This is a great question that is difficult to answer succinctly and accurately. Suffice it to say that decades of research have revealed the major drivers of numerous types of cancer, which has led to many advances in medicine's ability to treat patients in the clinic. However, there is still an immense amount that we do not know. We do not yet know the full spectrum of driver mutations in the non-coding genome, all the complexities of copy number variation (although recent Nature articles have made great strides here), or the role of repeat expansions. And there are certainly many things we do not know that are yet to be discovered.

A new model allowed you to scan the genome of cancer cells. Could you describe the model and what new ideas it provided?

Our model uses deep learning to map mutation rates across the entire genome for a cancer of interest. It then uses a custom probabilistic model to query these maps almost instantly and estimate the number of passenger mutations expected in any region of the genome.

Our approach has several key features: 1) a mutation rate map needs to be trained only once for a given type of cancer (and we have already trained and made publicly available maps for 37 types of cancer), after which it can be applied to any cohort of patients with that type of tumor; 2) users have the flexibility to specify regions anywhere in the genome, down to the resolution of a single base pair; 3) our model is fast and efficient enough for users to complete an analysis of the entire genome in a matter of minutes on a personal computer.
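To make the idea concrete, here is a minimal sketch of how a mutation rate map might be queried for a region and compared with what is observed. It is an illustration under stated assumptions, not the model described in the interview: the per-base rates are invented, the function names `expected_passengers` and `poisson_upper_tail` are hypothetical, and a plain Poisson test stands in for the custom probabilistic model, whose details are not given here.

```python
import math


def expected_passengers(rate_map, start, end, n_samples):
    """Expected passenger mutations in [start, end) across a cohort.

    rate_map holds an assumed per-base, per-sample mutation rate.
    """
    return n_samples * sum(rate_map[start:end])


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected).

    A simple stand-in for a real passenger-mutation model.
    """
    if observed <= 0:
        return 1.0
    if expected <= 0:
        return 0.0
    # Sum the Poisson probabilities for 0..observed-1 in log space.
    cdf_below = sum(
        math.exp(-expected + i * math.log(expected) - math.lgamma(i + 1))
        for i in range(observed)
    )
    return max(0.0, 1.0 - cdf_below)


# Hypothetical rates for a 10 bp stretch (values invented for illustration).
rate_map = [1e-6] * 10

# Query a 5 bp region for a cohort of 1,000 tumors.
expected = expected_passengers(rate_map, 2, 7, n_samples=1000)

# Suppose 3 mutations were observed in that region across the cohort.
p_value = poisson_upper_tail(observed=3, expected=expected)
print(f"expected={expected:.4g}, p_value={p_value:.3g}")
```

A small p-value means the region carries more mutations than the passenger rate alone would predict, which is the kind of excess that flags a candidate driver. Because the region is just a slice of the map, the same query works for a single base pair or a whole gene.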

One type of non-coding mutation you focused on was cryptic splicing mutations. What are cryptic splicing mutations and how do they cause cancer?

Cryptic splicing mutations are mutations that occur far from the boundaries of a gene's exons but nevertheless confuse the cellular machinery responsible for splicing out introns and joining the exons back together. These mutations cause the gene to be spliced incorrectly, which often yields either a nonsensical mRNA transcript that the cell simply recycles or a non-functional protein. Either way, the correct protein product of the gene is not being made. Tumor suppressor genes often slow down cell division, preventing a cell from dividing uncontrollably. Cryptic splicing mutations can stop these genes from working, eliminating the cell's defenses against cancer.

This new model also allowed you to look at known mutations that cause cancer. What did you learn about these mutations within the 37 different types of cancer you studied?

We found that genes that often drive one type of cancer can also occasionally drive other types of cancer. The design of Dig, our model, was key to this insight. Because our model can be trained on one set of patients and applied to another set, we were able to pool thousands of patient samples from heterogeneous sequencing studies, providing the statistical power needed to examine these rare events.

Given that your model uses a deep neural network, a type of deep learning, how do you see machine learning influencing cancer research in the future?

As the field generates data of ever greater size and complexity, the need for tools that can automatically analyze and extract meaning from these data sets will only increase. Machine learning algorithms can provide powerful approaches to this challenge. They can be especially powerful for generating and prioritizing data-driven hypotheses about molecular mechanisms, which can then be explored experimentally, an approach that the field (including our laboratory) is increasingly adopting.

How can the results of your study and the model itself influence the future development of cancer therapy?

We hope the cancer community will make important discoveries about cancer biology by exploring the non-coding genome. Each discovery has the potential to open new avenues for therapeutics. Our model is a tool to help the cancer community do just that.

What’s next for you and your research?

We are working on some new and exciting things that we hope to share soon.

Where can readers find more information?

About Maxwell Sherman

I am currently a fourth-year PhD candidate jointly supervised by Professor Bonnie Berger and Professor Po-Ru Loh. My research focuses on developing algorithms to discover the role of somatic mutations in human health and disease.