March 2, 2017
By Mark Terry, BioSpace.com Breaking News Staff
Bioinformatics and computational science are playing a growing role in cancer research.
For cancer researchers, data science and computational analysis are increasingly vital skills. For data scientists, biostatisticians, data engineers, epidemiologists, mathematicians and IT experts, cancer and life science research is an exploding area of career opportunities.
Data, Data, Data
It has been pointed out that cancer is not a single disease. In a 2013 interview, Harold Varmus, then director of the National Cancer Institute and co-winner of a Nobel Prize for studies on the genetic basis of cancer, said that what had been learned about cancer over the last 40 years fell into three categories.
First, he said, “we’ve learned that cancer is not simply a single disease that affects many parts of the body. It is not, for example, a ‘war on cancer’ as a single enemy. It is many different diseases with common themes that can cause different kinds of disorders in many of our organs.”
Second, it’s become clear that almost every type of scientist is involved in understanding and controlling cancer.
And third, he said, “we’ve made progress using a variety of new techniques and science of many kinds. Progress in the control of cancer has required new knowledge from the many fields of research that the NCI supports.”
Each human genome is made up of three billion base pairs, which are organized into 20,000 to 25,000 genes. That totals around three gigabytes of data for a single person. In addition, those genes are not isolated entities; they interact with each other and with the environment. How they are turned on and off is another complicated variable.
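As a rough check on the three-gigabyte figure, the short sketch below assumes one byte per base (plain text, one character per nucleotide) and, for comparison, a packed encoding that stores each of the four bases in 2 bits. The constant and function names are illustrative assumptions, not something taken from the sources quoted here.

```python
# Back-of-the-envelope genome storage estimate (illustrative assumptions only).
BASE_PAIRS = 3_000_000_000  # approximate genome length cited in the text


def genome_size_gb(base_pairs: int, bits_per_base: int) -> float:
    """Return the storage size in gigabytes, taking 1 GB as 10**9 bytes."""
    return base_pairs * bits_per_base / 8 / 1e9


text_gb = genome_size_gb(BASE_PAIRS, 8)    # one ASCII character per base
packed_gb = genome_size_gb(BASE_PAIRS, 2)  # A, C, G, T packed into 2 bits each

print(f"Plain text:   {text_gb:.2f} GB")
print(f"2-bit packed: {packed_gb:.2f} GB")
```

Under these assumptions the plain-text estimate matches the roughly three gigabytes cited above. Raw sequencing files are usually much larger, because they also store quality scores and many overlapping reads of the same region.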
That environment also includes microbes (bacteria, viruses and fungi) that live in or on the body. The NIH’s Human Microbiome Project has found more than 10,000 microbial species in the human body, which together carry more than 100 times as many genes as the human genome. Researchers at the Harvard School of Public Health used data science to identify about 350 of the most important organisms and, using DNA sequencing, analyzed 3.5 terabytes of genomic data to pinpoint genetic “name tags” that can identify where and how those markers behave in a healthy population.
The point is, even in healthy patients, there’s an enormous amount of genomic data to process, even before you get into abnormal genetics and the variability of the immune system, tumor markers, and proteins, both normal and abnormal.
Basically, there’s so much information involved in cancer research now that data scientists, data engineers, statisticians, epidemiologists, bioinformaticians and others are increasingly finding roles in the field.
And at the very least, everyone else involved in cancer research, from biochemists to physicians to molecular geneticists, is increasingly finding value in having computational skills.
Computational Skills
A recent article in Science stated, “The success of such drugs [targeted therapies] has fueled a push toward studying basic molecular mechanisms of cancer growth, which has brought with it ‘a crush of data so large that no human brain alone would be able to make heads or tails of it,’ says Levi Garraway, assistant professor of medicine at the Dana-Farber Cancer Institute in Boston, Massachusetts. The complexity of the technology and tasks required to design targeted drugs and study their efficacy has grown so great that groups of scientists with varied expertise are required to continue to move the field forward, he adds.”
He predicts a huge increase in demand for translational researchers who have computational, analytical, and clinical trial expertise, but who can also turn data into knowledge.
A Few Examples
Based in Cambridge, Mass., Foundation Medicine uses genetics to select cancer drugs. Science writes, “One project involves probing Foundation Medicine’s growing database of tumor profiles for specific mutations and using that information to either design drugs or to parse patients into clinical trials of the company’s drugs.”
2. Assay Development
Wendy Winckler is the executive director of Next Generation Diagnostics at the Novartis Institutes for BioMedical Research, in Cambridge. Prior to that, she worked at the Broad Institute. Her first job there was helping to build The Cancer Genome Atlas; she then became director of Broad’s Genetic Analysis Platform, a multi-disciplinary group of scientists that generated and analyzed diverse types of genomic data. At Novartis, one of her earlier projects was helping Genoptix, a Novartis company in Carlsbad, Calif., develop a diagnostic test for lung cancer patients that looked for “actionable mutations.”
Science notes, “In Winckler’s lab, about half of the scientists have computational expertise, and the other half have extensive wet lab skills. The most successful people, however, engage in both realms, she says. ‘Exposure to lab environments helps computational biologists have a more intuitive understanding of the data and an easier time planning sequencing experiments; lab scientists familiar with data analysis approaches can provide important insights while interpreting results.’”
3. Next-Generation Sequencing (NGS)
Sequencing technology has quickly reached the point where a human genome can be sequenced for less than $1,000. It’s getting smaller, faster, and cheaper. It’s also revolutionizing drug development and cancer research. Science writes, “Where once it was possible to test tumor samples for only one mutation or genomic rearrangement at a time, NGS technology now enables testing for multiple gene mutations in multiple samples simultaneously.”
4. Drug Recycling
Atul Butte, a researcher at Stanford University, recently helped found NuMedii, which uses data science to sift through molecular data to discover new uses for old drugs.
5. Crowdsourcing
In 2011, players of an online game called Foldit created an accurate 3D model of the M-PMV retroviral protease enzyme in the game. Prior to that, researchers had spent 15 years unsuccessfully trying to ascertain its structure. The players followed that up the next year by redesigning a protein so that its activity increased more than 18-fold.
Launched in September 2016 by Priscilla Chan, a physician, and her husband, Facebook chief executive officer Mark Zuckerberg, the Chan Zuckerberg (CZ) Initiative has the lofty goal of curing, preventing or managing all diseases by the end of the century. And they plan to plow $3 billion into it over the next 10 years. One of the first things the CZ Initiative did was award five-year grants of up to $1.5 million each to 47 researchers.
A look through the bios of the investigators and their projects shows just how much data computation is a part of many of these projects. For example, Carlos Bustamante, of Stanford, “is making the transition from population genetics to a new area, the integration and analysis of massive data coming from consumer, health care, and financial sources. He is focusing on bringing together direct-to-consumer genetics and phenotype data in a secure space that can be explored by academic, industry, and citizen scientists.”
And Adam De La Zerda, also at Stanford, is planning to “image 100 million cells in living tissues at single-cell resolution by using optical coherence tomography. One of the potential uses of this technique will be to visualize cancer markers to delineate the margins of tumors.”