The Precise–and Wild–Genomics Revolution


Figure 1: This 2001 photo of DNA sequencing equipment at the NIH’s Intramural Research Center, Advanced Technology Center, in Gaithersburg, Maryland, suggests the “hundreds, hundreds, hundreds” of machines required at the time. (Photo courtesy of Eric Green, NHGRI, NIH.)

In the epic endeavor to sequence the human genome, it was as though the size of the equipment and amount of effort required were inversely proportional to the scale of the microscopic materials being parsed (Figure 1). “Imagine thousands of people, many laboratories … hundreds, hundreds, hundreds of these machines just to generate that first human genome sequence over a six to eight year period,” says Dr. Eric Green, director of the National Human Genome Research Institute (NHGRI) at the National Institutes of Health (NIH) (Figure 2).
Now visualize a turbo-electromagnetic humdinger of a “shrink gun” zapping one of those genome centers. In place of hundreds of people, sprawling labs, and washing-machine-sized deoxyribonucleic acid (DNA) sequencers, there is instead a single person working a desktop instrument, able to sequence a human genome in a day or two. While the shrink gun is pure fantasy, the researcher single-handedly generating a genomic report at her desk is not. This is the scene of genome sequencing today, over a decade after the Human Genome Project was declared complete. The reduction in effort, space, and time required to sequence DNA has been as precipitous as it is astonishing; the cost has plummeted roughly a millionfold (Figure 3).

Figure 2: Dr. Eric Green, director of the NHGRI at the NIH. (Photo courtesy of Ernesto Del Aguila.)

Figure 3: A graph showing the cost of sequencing a human-sized genome, as tracked by the NHGRI, at the sequencing centers funded by the institute. The data from 2001 through October 2007 represent the costs of generating DNA sequence using Sanger-based chemistries and capillary-based instruments (first-generation sequencing platforms). Beginning in January 2008, the data represent the costs of generating DNA sequence using second-generation (or next-generation) sequencing platforms. The change in instruments represents the rapid evolution of DNA sequencing technologies that has occurred in recent years. (Image courtesy of K.A. Wetterstrand. DNA sequencing costs based on data from the NHGRI Genome Sequencing Program: www.genome.gov/sequencingcostsdata.)

“What’s happened over the last 13 years has been really a tour de force in technology development because it’s not like it just delivered one new technology,” explains Green. “It actually delivered several—with even more [expected] over the next five to ten years,” he predicts. This revolution has included new biochemical and biophysical ways to deduce the order of DNA’s letters, as well as innovations in miniaturization and optics for reading out products of the biochemical steps. It’s also been accelerated through parallelism—the capacity to have any given machine producing millions of readings of DNA molecules at a time rather than just one.
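For readers who want a concrete picture of that parallelism, the sketch below is a loose software analogy only: it processes a batch of simulated reads concurrently rather than one at a time. The reads and the per-read GC-content calculation are invented stand-ins, not any instrument vendor’s actual pipeline.

```python
# A loose software analogy to sequencing parallelism: work on many reads
# at once rather than one at a time. The reads and the per-read metric
# are hypothetical stand-ins, not a real sequencer's pipeline.
from concurrent.futures import ProcessPoolExecutor

def gc_content(read: str) -> float:
    """Fraction of G/C bases in one simulated read."""
    return (read.count("G") + read.count("C")) / len(read)

if __name__ == "__main__":
    # Stand-in for the millions of short reads a modern instrument yields.
    reads = ["ACGTGGCA", "TTACGCGT", "GGCATTAC"] * 100_000
    with ProcessPoolExecutor() as pool:
        # chunksize keeps interprocess overhead manageable for tiny tasks
        mean_gc = sum(pool.map(gc_content, reads, chunksize=5_000)) / len(reads)
    print(f"Mean GC content across {len(reads):,} reads: {mean_gc:.3f}")
```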
With this astronomical drop in cost and turnaround time, genetic testing is becoming more common as part of some patient treatment plans. A patient with advanced cancer whose physician orders the ICG100 test from Intermountain Healthcare, for example, receives a comprehensive genomic profile of biopsied tumor cells, which may identify actionable mutations in those cells that can be targeted with a specific drug. The time between extracting the DNA from the tumor cells and returning a personalized cancer treatment report to the patient is 10–14 days. Advances in genomics that have made such tests possible are moving us into a new era of precision medicine.

Revolutionizing Our Understanding of Disease

Genomics has been applied to studying diseases ranging from depression to diabetes to high cholesterol. As Dr. Joel Diamond, chief medical officer for Genomics and Precision Medicine at Allscripts, says, “In the area of cardiology, we know that there are syndromes that cause heart arrhythmias or heart abnormalities that have a genomic basis. We know that there are variants of diabetes now—outside the typical Type I and Type II diabetes—that respond very, very [differently to treatments], and their complication rates are very different than what’s been traditionally thought of in diabetes that have the genetic variants of that.” Genomics, in many cases, provides the ability to see a condition through a new lens.
Often, the effect is a substantively revamped way of classifying and, therefore, treating the disease. “Leukemia used to be thought of as one or two diseases, but now that we’re able to understand the molecular aspects of those tumors, it’s about 75 different diseases, all of which have a very different treatment,” explains Dr. Michael Hultner, chief scientist for Health and Life Sciences at Lockheed Martin. “Because of our ability to identify subsets of that disease and tackle each one of those separately and differently, the survival rates for leukemia have gone way up.”
Perhaps the most high-profile example of genomics shifting medicine’s orientation to a disease is that of cancer. Traditionally, cancer patients have been identified and treated according to their anatomical cancer type (bone, lung, breast, etc.)—the idea is so ingrained, we might not even think of it as a choice. Now, doctors seek to identify the genomic basis of cancer, practicing precision oncology to obtain genome sequences of both the tumor and the individual. Comparing these sequences, they can ascertain which mutations are present in the tumor but not in the individual’s healthy genome, and then seek out drugs specific to the affected gene(s).
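In computational terms, this tumor-versus-normal comparison reduces to a set difference: variants present in the tumor but absent from the patient’s germline are candidate somatic mutations. The minimal Python sketch below illustrates the idea only; the coordinates are invented, and real pipelines work from VCF files with dedicated somatic-calling tools.

```python
# Minimal sketch of the tumor-vs-germline comparison: variants found in
# the tumor but not in the healthy genome are candidate somatic
# mutations. Coordinates are invented for illustration.

def somatic_variants(tumor: set, germline: set) -> set:
    """Variants present in the tumor sample but not the germline sample."""
    return tumor - germline

# Each variant: (chromosome, position, reference base, alternate base)
germline_calls = {("chr7", 1_000_001, "T", "G"), ("chr17", 2_000_002, "G", "C")}
tumor_calls = germline_calls | {("chr7", 1_500_000, "C", "T")}  # tumor-only change

for chrom, pos, ref, alt in sorted(somatic_variants(tumor_calls, germline_calls)):
    print(f"Candidate somatic mutation: {chrom}:{pos} {ref}>{alt}")
```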
At the National Cancer Institute (NCI), novel ways of treating cancer are being pioneered. The NCI’s Cancer Therapy Evaluation Program runs an Investigational Drug Branch (IDB) specializing in clinical trials of cancer treatments. The IDB’s chief, Dr. James Zwiebel, explains that having a more precise understanding of the biological drivers behind tumors translates into more exact treatments for patients, including combining more effective drugs with other drugs or with traditional treatments like chemotherapy and radiation. “We are able to incorporate treatments that are targeted against these molecular aberrations that exist in tumors,” says Zwiebel. “That involves combining these so-called molecularly targeted agents, these targeted treatments—be they small molecule chemicals or … biologic agents like antibodies.”
Combining is an operative word here. Even as therapies become more targeted, today a silver-bullet treatment for a cancer is not the norm. “It’s unlikely that any single agent is going to really cure patients of cancer,” Zwiebel points out, while acknowledging that there have been some notable exceptions. As such, drug trials that combine drugs and/or agents are especially important—and it’s how the NCI has been able to make inroads, facilitating collaborations among companies through their contracts and processes so that the trials can feature combinations of treatments. “In the last five years, thousands of patients [have participated in] trials that involved combinations of agents that otherwise may not have been able to be carried out,” Zwiebel notes.
Meanwhile, NHGRI’s partnership with the NCI, ongoing for much of the past decade, has contributed to the mounting quantities of cancer-related genomic data. “Massive numbers of tumors’ genomes have now been sequenced. … There are huge data sets for almost every kind of cancer … catalogs of the genomic changes that have taken place associated with the development of that kind of cancer,” Green says.
NHGRI’s partnership with the cancer community may also serve as a model for collaborative genomics-driven research of other diseases. With genomics–cancer research well underway, NHGRI is turning its attention to other areas, including cardiovascular disease and Alzheimer’s. “What we want to hopefully do is to help bring other institutes, each responsible for different disease areas, along. Then have them just take over. … We want to eventually become less relevant,” Green explains.

Catching Up with Our Ability to Sequence DNA

It’s difficult to envision NHGRI successfully working its way to irrelevance. When talking about advances in genomics (both in terms of translational science and clinical application), many in the field speak of what has not yet been figured out, even while acknowledging the vast amount that has been decoded. As with many technological revolutions, not everything has progressed at the same pace. “Our knowledge is nowhere near as effective as our ability to sequence DNA,” Green says. While we can now generate a genome sequence easily for any patient, we can’t necessarily interpret it.
“Genetic testing is the wild, wild West,” Allscripts’ Diamond points out. Like any truly pioneering act, decoding the human genome acted as a land bridge to a vast, unsettled territory. In many ways, the field represents a thrilling but still inscrutable expanse where tumbleweeds serve an unclear function, wagons have yet to cut ruts in the ground, and smoke may signify a wildfire, a coded message, or something else entirely. Genomic information rolls in at a remarkable clip, but what to make of this information, how to act on it, share it, and store it for later—these gaps loom like grand canyons.

Figure 4: DNA sequencing is a laboratory technique used to determine the exact sequence of bases (A, C, G, and T) in a DNA molecule. The DNA base sequence carries the information a cell needs to assemble protein and ribonucleic acid molecules. DNA sequence information is important to scientists investigating the functions of genes. The technology of DNA sequencing was made faster and less expensive as a part of the Human Genome Project. (Image courtesy of Darryl Leja, NHGRI, NIH.)

Genome readouts are akin to the Rosetta Stone—cleared to the point of legibility but still being puzzled over in an attempt to understand what exactly those characters mean (Figure 4). While reading out the genome sequence of a normal cell or a tumor cell is fairly straightforward, “you don’t necessarily know when you find a spelling difference [in the DNA] what it means,” according to Green. The human genome, after all, is not organized by illness—you wouldn’t, for example, find all the heart-disease-related code grouped together. NHGRI has programs set up to begin interpreting all functional parts of the human genome—an effort Green predicts will take decades. And again, once we determine how a part of the genome works, we may still not know how a given spelling difference alters that function.
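A toy lookup makes that interpretation gap concrete: a variant is easy to read out, but unless it appears in a curated catalog, its meaning defaults to “uncertain.” The catalog entries below are invented; real annotation draws on curated resources such as ClinVar.

```python
# Toy illustration of the interpretation gap: a "spelling difference" is
# easy to detect but hard to interpret. Catalog entries are invented;
# real annotation draws on curated databases such as ClinVar.
known_significance = {
    ("chr17", 4_300_000, "A", "G"): "pathogenic (invented entry)",
    ("chr12", 2_520_000, "C", "T"): "benign (invented entry)",
}

def interpret(variant: tuple) -> str:
    """Return catalogued significance, or flag the variant as uncertain."""
    return known_significance.get(variant, "variant of uncertain significance")

print(interpret(("chr17", 4_300_000, "A", "G")))  # catalogued
print(interpret(("chr2", 1_701_000, "G", "A")))   # the far more common case
```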

From Test Results to Patient Care

The challenges of genomic literacy extend from the lab to the examining room. “Doctors have very little knowledge about what tests to order,” says Diamond, himself a family doctor. “Then when I go ahead and order that test … which tests? And how do I do it? And then when that test result comes back, then what do I do with those results?” Diamond asks, expressing personally what thousands of physicians must be thinking.
Besides a deficit of knowledge about which tests to order, once a test is ordered, the results are not necessarily clear-cut. It’s not a matter of simply finding out whether the patient has a mutation or not. “Those test results,” Diamond says, “… if you ever saw the results, they’re … gobbledygook. It doesn’t just come back and say, ‘Yes, [the] patient has ovarian cancer risk.’”

Figure 5: Technicians at the Intermountain Precision Genomics core laboratory extract DNA from tumor cells to identify faulty DNA. (Photo courtesy of Nick Short.)

Even in cases where a test has been ordered for a patient with a clear diagnosis, physicians need training and support in interpreting the data. Companies are aware of this. Intermountain Healthcare, for example, offers the Molecular Tumor Board as part of its ICG100 test, which means the test includes the interpretive services of a multi-institutional group of scientists, physicians, geneticists, and radiologists. For each patient who gets tested, the Molecular Tumor Board reviews reports generated at Intermountain Precision Genomics; considers the patient’s history, the mutation identified by the test, and treatments taken to date; and then recommends a drug to target the mutation (Figure 5).

Point-of-Care Intelligence and Big Data

Solutions like Intermountain’s Molecular Tumor Board are smart ones, but more scalable, long-term solutions will be needed to serve larger populations by providing meaningful data to the physician at the point of care. “So when a patient is sitting in front of you,” says Diamond, “and their family history or some labs or medication they’ve been on in the past suggest that they might have a genomic basis of that disease … can the system at the point of care say to the physician, ‘This is a patient that might benefit from genetic testing, and here’s the test that’s reasonable for them to do’?”
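A point-of-care prompt like the one Diamond imagines could begin as a simple rule evaluated against the patient record, as in the hypothetical sketch below. The field names, the cholesterol threshold, and the suggested panel are all invented for illustration, not any vendor’s logic.

```python
# Hypothetical sketch of a point-of-care decision-support rule: flag a
# patient whose record hints at a genomic basis for disease. The field
# names, LDL threshold, and panel name are all invented.
from typing import Optional

def suggest_genetic_testing(patient: dict) -> Optional[str]:
    """Return a suggested test if simple risk rules fire, else None."""
    history = set(patient.get("family_history", []))
    if "premature_coronary_disease" in history and patient.get("ldl_mg_dl", 0) >= 190:
        return "familial hypercholesterolemia panel (hypothetical)"
    return None

record = {"family_history": ["premature_coronary_disease"], "ldl_mg_dl": 210}
suggestion = suggest_genetic_testing(record)
if suggestion:
    print(f"Consider genetic testing: {suggestion}")
```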
Such visions for the future of genomics’ clinical applications are bound up in the promise of big data. “In my mind, besides astronomy and weather simulations, genomics is the real big data in science,” proposes Hultner of Lockheed Martin. Genomics data are already proliferating. What the field needs is informatics combined with an extensive electronic infrastructure to, first, enable the sharing of genomics data among institutions, physicians, and individuals so that better clinical care can be determined and offered and, second, drive research further by linking it with larger swathes of data (Figure 6).

Figure 6: A researcher monitors a DNA sequencing machine in 2010 at Intramural Laboratories, NHGRI, NIH, Bethesda, Maryland. (Photo courtesy of Maggie Bartlett, NHGRI.)

“We need to do the research at scale to get the answers,” Hultner says, “but we also need to have the infrastructure to be able to deliver it back to everyday medicine.” A robust genomics research infrastructure would offset the current data silos. It would, emphasizes Hultner, create more opportunities for collaboration “but also allow these clinical decision support systems of the future to go to an authoritative database and get the right answer.”
Many companies have already accumulated large reservoirs of data as a result of the tests they offer. Intermountain, for example, has a significant tissue specimen biorepository. “It is possibly the largest in the United States,” notes Dr. Pravin J. Mishra, lead scientist and director of research and development at Intermountain Precision Genomics (Figure 7). “Whole genome sequencing, which will [generate] approximately 18,000 genomes per year [for our database]… will not only allow us to discover new biomarkers and cancer targets but also [enable] other top leaders in the field to mine our database,” Mishra says. In early June of this year, Intermountain Healthcare, the Stanford Cancer Institute, Providence Health and Services, and Syapse announced that they are forming a consortium, the Oncology Precision Network, to advance cancer care through data sharing and “increased access to clinical trials.” The collaborating institutions include two of the nation’s largest nonprofit health systems, a leading academic research center, and a precision medicine software company. Their stated goal is to facilitate genomics data sharing that will extend the reach of high-volume analytics and bring precision oncology to underserved cancer patients who have not yet benefited from these advances.

Figure 7: Dr. Pravin J. Mishra, lead scientist and director of research and development at Intermountain Precision Genomics. (Photo courtesy of Nick Short.)

Allscripts, meanwhile, is working on a software platform, 2bPrecise, that would map genomic information to phenotypic information (the expression of the disease). The platform, according to Allscripts, would facilitate “connectivity access to previously inaccessible data sources (such as genomic labs),” thereby enabling clients to arrive at “more accurate diagnoses and treatments.”
Diamond says the platform, due to launch in early 2017, will allow a doctor to “slice and dice the data” in lots of different ways. For example, a doctor could consider a patient with heart disease and ask, as Diamond suggests, “Is it due to a genetic variant or not? A known variant or an unknown variant?” Or, Diamond continues, “I can do it in the other way, which is to say, ‘Here’s a person with this mutation, doesn’t look like it expressed itself in any way, but perhaps it is associated with something we didn’t know at all—an ulcer, skin rash, or heart disease or stroke or Alzheimer’s disease, etc.’ That’s going to be a very important part of discovery, from a research standpoint, and in the ability to care for patients on a[n] individual and population level.”
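That two-way “slice and dice” amounts to querying the same genotype-phenotype data from either direction, as the minimal sketch below shows. The records, field names, and variant label are hypothetical and are not the 2bPrecise schema.

```python
# Minimal sketch of two-way genotype-phenotype queries. Records, field
# names, and the variant label are hypothetical, not the 2bPrecise schema.
patients = [
    {"id": "p1", "variants": {"GENE1:c.100C>T"}, "phenotypes": {"heart disease"}},
    {"id": "p2", "variants": {"GENE1:c.100C>T"}, "phenotypes": {"skin rash"}},
    {"id": "p3", "variants": set(), "phenotypes": {"heart disease"}},
]

def variants_for_phenotype(phenotype: str) -> set:
    """Phenotype -> variants: which variants appear in affected patients?"""
    return {v for p in patients if phenotype in p["phenotypes"] for v in p["variants"]}

def phenotypes_for_variant(variant: str) -> set:
    """Variant -> phenotypes: what conditions co-occur with a mutation?"""
    return {ph for p in patients if variant in p["variants"] for ph in p["phenotypes"]}

print(variants_for_phenotype("heart disease"))   # {'GENE1:c.100C>T'}
print(phenotypes_for_variant("GENE1:c.100C>T"))  # may surface unexpected links
```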

Engineering and Sharing the Data

The need for an information infrastructure for genomics data is becoming more pressing. Lockheed Martin, known for its work in global security and aerospace, launched a health care technology alliance in 2015 that includes several health information technology providers, medical technology companies, and an academic institution. Together with one of those partners, Illumina, which specializes in genomic sequencing and analysis, Lockheed Martin is working on solutions for national-scale genomics programs.
“We have a pretty rich history of applying systems and engineering systems integration practices to bring very complex sets of information together to drive a mission,” notes Lockheed Martin’s Hultner, who leads the company’s precision medicine and genomics practice.
Hultner says that this is the direction in which health care itself was already headed; Lockheed Martin didn’t run after it. “Where [in the past] a lot of it was intuition and personal knowledge, the amount of information that doctors, and even payers and hospital administrators, have to deal with [now] is beyond an individual’s comprehension,” he marvels. “Information systems are now being brought online to be able to support their mission, and they’re hitting problems that we’ve been dealing with for decades. Systems, upfront engineering, integration, data management, security—all that.”
At the moment, Hultner explains, much genomics research is being done at individual universities on a small scale: “It’s really difficult to tie that all together, except in research publications, into a comprehensive database that can give doctors that really wide view of genetic disease.”
That really wide view may well be the promise of the Precision Medicine Initiative (PMI), announced by President Barack Obama in his 2015 State of the Union address. This national effort seeks to bring precision medicine to more citizens, both in treating disease and in seeking to identify ways that individuals can maintain their good health. The PMI hopes to enlist a million or more U.S. volunteers from diverse backgrounds to contribute a range of health information to its database. From there, it will apply statistics and analytics to “detect associations between genetic and/or environmental exposures and a wide variety of health outcomes.” In effect, Hultner explains, the PMI could establish national libraries of the full spectrum of genetic disease to be used for clinical decision support for physicians.
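At its simplest, “detecting associations” could look like the contingency-table test sketched below, repeated across the cohort for each variant and outcome of interest. The counts are invented, and a real PMI-scale analysis would have to adjust for ancestry, environment, and multiple testing.

```python
# Minimal sketch of an association test: does carrying a given variant
# co-occur with a disease outcome more often than chance would predict?
# Counts are invented; real analyses require far more careful modeling.
from scipy.stats import chi2_contingency

#          disease  no disease
table = [[120,  880],     # variant carriers (hypothetical counts)
         [400, 8600]]     # non-carriers (hypothetical counts)

chi2, p_value, dof, _expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, degrees of freedom = {dof}, p = {p_value:.2e}")
```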

A Disease-Agnostic Approach to Genomics

“I think that collaboration is key here,” says Intermountain’s Pravin Mishra. “We cannot just sit on our data and not share it.” Nation-scale sharing of genomics data could also fuel what many see as integral to advancing translational genomics: a disease-agnostic approach to research.
“We’re studying cardiovascular disease on this project, we’re studying asthma over here, and over here we’re studying …” says NHGRI’s Green, trailing off to indicate that the list of siloed research studies goes on. “We need to solve these things. Our responsibility is to facilitate this in what I refer to as a disease-agnostic way,” he says. “The genome sequence gets laid out, and you don’t really know what variants play a role in which kind of disease. We need to have a holistic way of generating the data and interpreting the data, and then seeing which different parts are relevant to which different disease processes.”
Add to data sharing another growing piece of technology entering the picture: devices such as Fitbit that measure individuals’ physiology, environment, and lifestyle, adding daily to the biodata pool. (As one recent example, in July Fitbit announced a partnership with the Dana-Farber Cancer Institute to track the exercise habits of 3,200 women with early-stage breast cancer to research any linkages between exercise and cancer recurrence.) As genetics and environment come together, we will have a more detailed, data-driven account of the complex relationship between nature and nurture and how they express themselves as both health and disease.