What Lies Within

IEEE Pulse

In 1991, a group of Italian researchers announced that they had isolated a new antibiotic from a chemical soup brewed with a soil-dwelling bacterium called Planobispora rosea. The drug was a type of thiopeptide, effective against gram-positive bacteria like Staphylococcus aureus, P. acnes, and C. difficile but difficult to harness for human medicines. Little came of that work until around 2012, when pharma giant Novartis reported that it had begun to experiment with the original drug’s structure, ultimately creating a semisynthetic version with enough solubility that it could be effectively administered to human patients. In 2015, that antibiotic successfully made it through a multicenter phase II clinical trial, where it proved to be both safe and reasonably effective in the 30 C. difficile patients who completed the study.
The trial may have been those patients’ first official exposure to such a drug, but according to a paper that appeared the year before Novartis published its results, it’s entirely possible that some had already encountered something very similar. That study, authored by a team of researchers led by scientist Michael Fischbach from the University of California, San Francisco, found that the body is already teeming with microbes capable of producing thiopeptide antibiotics. And not just thiopeptides: as it turns out, the human body’s microbial inhabitants, collectively known as the microbiome, are positively rife with potential drug-coding genes. Some of these genes resemble those found in marine bacteria known for antimicrobial and anticancer effects; some code for nonribosomal peptides, a class that also includes penicillin. Investigating further, the researchers even discovered a new thiopeptide antibiotic called lactocillin, strikingly similar to Novartis’s latest, painstakingly constructed candidate drug, in a bacterial species common to the healthy human vagina.

Human Nature

Turning to nature is one of the oldest tricks in the book when it comes to medicine. Historically, plants have dominated in this field, but with the discovery of penicillin in 1928, researchers recognized the potential in microorganisms like bacteria and fungi. These are the most abundant and diverse of the world’s creatures. Over the course of millions of years of making war and peace with one another and their hosts, they have learned to produce an incredible variety of small molecules, many of which have proven therapeutically useful to us. Over the decades following the discovery of penicillin, scientists scoured the earth for exotic microbial samples and were rewarded with a host of new medicines, including the antibiotic vancomycin, the anticancer drug doxorubicin, and even the very first statin.
Despite these successes, the practice began to dry up in the 1990s. That classical style of drug discovery was hard work and incredibly inefficient. “We used to just grab them and put them into a broth and then hope for the best,” says Nathan Magarvey, a biochemist at McMaster University in Ontario, Canada. It worked for a while, Magarvey continues, but once the low-hanging fruit had been picked, researchers found themselves increasingly isolating and characterizing microbial output, with the only result being something they’d already found. Nature’s blockbusters seemed to be all used up.
Today, new technologies like metagenomics, metabolomics, and informatics have radically changed that picture. Now, with the ability to quickly scan and analyze genomes and their metabolic outputs in bulk, researchers have come to realize that nature is far from exhausted: fewer than 10% of her potential microbial natural products have been tapped. And researchers now have the tools to begin to zero in on specific molecules, estimate their potential, and rule out those chemicals already in the books. “It’s important to realize just how significant these developments have been,” Magarvey says. “We’re no longer just grabbing stuff and randomly growing them.”
The human body wasn’t on the agenda during the first round of natural product discovery, mostly because no one really knew it had anything worth pursuing. That changed after publication of the National Institutes of Health’s Human Microbiome Project in 2012, with its revelation that the human body is home to literally trillions of bacteria, archaea, fungi, and viruses (Figure 1). The protein-coding genes of these bacteria alone have been estimated to outnumber those of humans by a factor of 360.

Figure 1: Bacteria growing on agar in the laboratory. (Photo courtesy of Bill Branson, National Institutes of Health.)

Research into how the microbiome interacts with us and itself is still in its earliest stages, but preliminary work has already turned up a complex selection of tantalizing elements, like antimicrobial ribosomal peptides made by bacteria in the gut and mouth to fend off other bacteria, immunomodulatory sugars, and amino acid metabolites with yet-to-be-discovered effects. The idea of exploring these human-borne communities for potential drugs is very new but may offer certain advantages.
“We evolved with these bacteria and wrapped ourselves around them,” says Karim Dabbagh, chief scientific officer at the San Francisco-based company Second Genome. This means that both we and these microbes have learned to communicate with and benefit from one another. Science has already begun to show how populations of these organisms help us digest our food, manufacture some of our vitamins, moderate our moods, and tune our immune systems. They are intimately involved in our states of health and disease, and they may be particularly relevant when it comes to potential therapeutics.

Proofs of Concept

There are some early proofs of concept testifying to these notions. In 2010, a team in Cork, Ireland, isolated a new antibacterial agent from human fecal samples that was narrowly effective against C. difficile, the bacterium responsible for a serious gastrointestinal infection, but not against other, friendlier gut bacteria. And one human-microbe-associated molecule called linaclotide, a variant of a molecule made by a diarrhea-associated E. coli, was approved by the U.S. Food and Drug Administration for constipation in 2012.
As another benefit, researchers can use preexisting connections to their advantage. Instead of diving blindly into soil or water, they can narrow their searches early on by digging into the genomes and metabolic activity of bacteria already shown to have some sort of positive clinical effect in humans or comparing microbial samples from healthy and diseased cohorts and winnowing down to the relevant molecules from there. This is the philosophy of Second Genome, which has placed its bet on being able to produce therapeutics based on small molecules drawn from the body’s microbial communities.
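The cohort comparison described above can be illustrated with a toy ranking of molecules by how differently they turn up in healthy and diseased samples. This is only a sketch under stated assumptions: the function names, the set-based sample format, and the `min_gap` cutoff are hypothetical, and a real study would rely on proper statistical testing rather than a raw difference in prevalence.

```python
def prevalence(samples, molecule):
    """Fraction of samples (each a set of detected molecules) containing molecule."""
    if not samples:
        return 0.0
    return sum(molecule in sample for sample in samples) / len(samples)


def rank_candidates(healthy, diseased, min_gap=0.5):
    """Rank molecules by the gap in prevalence between two cohorts.

    healthy, diseased: lists of sets of molecule names (hypothetical format).
    Returns (molecule, gap) pairs where gap = healthy prevalence minus
    diseased prevalence, keeping only gaps of at least min_gap in size.
    """
    molecules = set().union(*healthy, *diseased)
    gaps = [(m, prevalence(healthy, m) - prevalence(diseased, m)) for m in molecules]
    kept = [(m, g) for m, g in gaps if abs(g) >= min_gap]
    # Largest gaps first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda item: (-abs(item[1]), item[0]))


# Toy data: three samples per cohort, invented for illustration only.
healthy = [{"m1", "m2"}, {"m1"}, {"m1", "m3"}]
diseased = [{"m2"}, {"m2", "m3"}, {"m3"}]
ranked = rank_candidates(healthy, diseased, min_gap=0.5)
```

Here only the molecule found in every healthy sample and no diseased sample survives the cutoff, which mirrors the winnowing idea: the comparison shrinks the list, and the hard biological follow-up begins from there.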
“What drives our mining activity is a phenotypic association of a certain bacterium with a specific biological activity,” Dabbagh explains. “So what we try to do then is say, what are the secreted components (the metabolites or proteins, peptides) that those bacteria use to mediate that effect? If we can identify it, then there’s a medicine there.” One of their first small molecules derived from this line of work is SGM-1019, intended for the treatment of inflammatory bowel disease, which has already completed a phase I clinical safety trial.
Dabbagh points out one more advantage to the notion of mining the body’s hangers-on for its molecules. Most of the focus on human microbiome medicine still revolves around the idea of using bacteria as therapeutics—for example, sophisticated probiotics. “But the pharmaceutical community has not really embraced the live bacteria as something that is going to be transformative yet,” he says. It’s still unclear how much those kinds of interventions will be relevant beyond C. difficile and maybe infectious disease contexts, Dabbagh explains—not to mention the fact that existing drug development pipelines and regulatory pathways are all built around molecules, not living, mutable bacteria.

Microbial Dark Matter

Fischbach’s 2014 study, the one that revealed the thiopeptide-producing microbes, marked a reawakening of interest in exploring the human microbiome for natural products, using the kinds of computational tools that reopened the field in environmental research. To find their results, the team created a machine-learning software program and trained it to recognize the types of gene clusters responsible for making small molecules and predict roughly what they might be. Then they let the program loose on the metagenomic data from the Human Microbiome Project. It turned up over 14,000 potential drug-making gene clusters, which the team then narrowed down to 3,118 clusters common across different individuals.
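The narrowing step, from every predicted cluster down to those shared across individuals, can be pictured as a simple prevalence filter. What follows is a minimal sketch, not the team’s actual software: the function name, the input format, the sample data, and the `min_fraction` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def common_clusters(detections, min_fraction=0.5):
    """Return cluster IDs found in at least min_fraction of individuals.

    detections: iterable of (individual_id, cluster_id) pairs
    (a hypothetical format, not the Human Microbiome Project's own).
    """
    carriers = defaultdict(set)   # cluster_id -> individuals carrying it
    individuals = set()
    for person, cluster in detections:
        individuals.add(person)
        carriers[cluster].add(person)
    if not individuals:
        return []
    cutoff = min_fraction * len(individuals)
    return sorted(c for c, people in carriers.items() if len(people) >= cutoff)


# Toy data: three subjects and three predicted clusters (names are invented).
detections = [
    ("subj1", "thiopeptide_A"), ("subj2", "thiopeptide_A"), ("subj3", "thiopeptide_A"),
    ("subj1", "nrps_B"), ("subj2", "nrps_B"),
    ("subj3", "rare_C"),
]
shared = common_clusters(detections, min_fraction=0.6)
```

With the toy data, the cluster seen in only one subject drops out while the two seen in most subjects remain. The filter itself is trivial; the prediction of the clusters in the first place is where the machine learning does its work.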
“The fact that it worked so well is really striking,” says Curtis Huttenhower, a computational biologist at the Harvard School of Public Health. “There’s a difference between saying, here’s two or three things that might be interesting—and, here’s several thousand.”
The work blew the doors off the field, but it also highlighted the challenge ahead. Of the 3,118 candidates the team found, the majority have yet to be studied: as they point out, “almost nothing is known about their small-molecule products or biological activities.” This is a problem everywhere in natural product discovery: microbial genomes and metagenomes are stacking up, but information on the functions and products of these genes is far behind.
Right now, the general flow for product discovery, human or otherwise, follows a basic pattern: catalog microbial samples, usually via metagenomics or metametabolomics; then generate predictions about what those gene sequences or metabolites might be and do. This often involves comparing the resulting data to existing databases to see what they resemble. Multiple approaches have emerged for this second step over the last handful of years. Some are genetically oriented algorithms like antiSMASH 3.0 (the latest version of one of the tools Fischbach used in his 2014 research), along with HUMAnN2 and PRISM (PRediction Informatics for Secondary Metabolomes) out of Huttenhower and Magarvey’s labs, respectively. Additional methods include molecular networking, which uses mass spectrometry to analyze microbes’ metabolic output in bulk and model the results into networks based on similarity, as well as traditional functional metagenomics, which entails culturing a living library of thousands of bacterial clones, each containing a different snippet from a metagenome.
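The idea behind molecular networking, linking molecules whose mass spectra look alike, can be sketched in a few lines. This is a simplified illustration under stated assumptions: spectra are represented as hypothetical dictionaries of binned m/z values and intensities, similarity is a plain cosine score, and the `threshold` value is arbitrary. Production tools use more elaborate scoring and preprocessing.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two spectra given as {m/z bin: intensity} dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(spectra, threshold=0.7):
    """Return (name1, name2, score) edges for spectrum pairs scoring >= threshold."""
    edges = []
    for n1, n2 in combinations(sorted(spectra), 2):
        score = cosine(spectra[n1], spectra[n2])
        if score >= threshold:
            edges.append((n1, n2, round(score, 3)))
    return edges


# Toy spectra, invented for illustration: two similar molecules and one outlier.
spectra = {
    "mol_a": {100: 1.0, 150: 0.5},
    "mol_b": {100: 0.9, 150: 0.6},
    "mol_c": {300: 1.0},
}
edges = build_network(spectra)
```

In this toy case the two spectra that share peaks are joined by an edge and the outlier stays unconnected. Clusters of connected nodes are what let researchers guess that an unknown molecule belongs to a known chemical family.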
The catch is that such tools work best with a solid foundation of preexisting knowledge against which all these tools can compare and contrast new findings. Unfortunately, that’s what the field is lacking. No one yet knows what roughly half of the genes found in the human microbiome actually do. Another quarter or so are marked with some function, but only generically. Think of “hydrolase” as a descriptor. “What’s a hydrolase?” asks Jason Crawford, a chemical biologist at Yale University, whose lab focuses on finding small molecules from a variety of microorganisms and routinely encounters this problem. “That doesn’t really tell us what they do aside from adding water to some various substrate.”

Moving Forward

What would help are comprehensive, sharable databases that include not just a biosynthetic gene or metabolite with minor annotated predictions about its function, but also the context of any and all experimentation that might validate those predictions and some indication of the strength of that evidence. Fortunately, these are beginning to emerge. Last fall, the Genomic Standards Consortium announced the Minimal Information about a Biosynthetic Gene cluster (MIBiG) data standard. This aims to create a consistent practice of gene cluster description that would include information like genetic locus, references to any experimental data, and associated chemicals and molecular targets, compatible across different databases and tools, including antiSMASH. Another project to create a reference database for molecular networking and metabolic data, the Global Natural Products Social Molecular Networking, is under way at the San Diego lab of Pieter Dorrestein.
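To make the idea of such a record concrete, here is a hypothetical sketch of what one entry might hold, based only on the kinds of information named above. It is not the actual MIBiG schema: the class name, field names, example values, and the three evidence levels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical ordering of evidence strength, weakest to strongest.
EVIDENCE_RANK = {"predicted": 0, "correlative": 1, "experimental": 2}


@dataclass
class GeneClusterRecord:
    """Illustrative gene cluster entry; field names are assumptions, not MIBiG's."""
    cluster_id: str
    locus: str                                              # genetic locus
    compounds: list = field(default_factory=list)           # associated chemicals
    molecular_targets: list = field(default_factory=list)
    evidence: list = field(default_factory=list)            # (reference, strength) pairs

    def strongest_evidence(self):
        """Return the (reference, strength) pair with the highest strength, or None."""
        if not self.evidence:
            return None
        return max(self.evidence, key=lambda item: EVIDENCE_RANK[item[1]])


# Example entry with placeholder values.
record = GeneClusterRecord(
    cluster_id="example-0001",
    locus="contig_12:3400-21900",
    compounds=["example thiopeptide"],
    evidence=[("in silico annotation", "predicted"), ("lab assay", "experimental")],
)
best = record.strongest_evidence()
```

Keeping the evidence alongside the annotation, rather than the annotation alone, is the point the paragraph above makes: a tool comparing new findings against the record can tell a guess from a validated result.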

Figure 2: Making something useful with these bacterial molecules requires iterative cycles of imaging, cultures, assays, and so forth. (Photo courtesy of Bill Branson, National Institutes of Health.)

Prediction and even data sharing aren’t the hard part in this field, however. That falls to the final step: making something useful with these molecules. This is the biochemistry and slog science, with iterative cycles of molecular imaging, cultures, assays, animal work, and basic drug development (Figure 2). It’s much less sexy than metagenomic surveys but just as important, if not more so.
Huttenhower has collaborated with the multinational diabetes initiative DIABIMMUNE, which reported earlier this year on how it was able to winnow down (using Huttenhower’s HUMAnN2 for part of its analysis) to a specific molecule made by a specific bacterium that is rare in infants born in Russia (where the risk of diabetes is very low), yet very prevalent in infants born right across the border in Finland (where the incidence of type 1 diabetes is five- to sixfold higher). It’s a tantalizing find, but if anything is ever to come of it, enormous work remains. The researchers would have to establish whether that molecule is really a primary cause, how exactly it has that effect, and how the infants acquire it in the first place, not to mention all the drug development that would have to go into determining whether it could ever translate into medical use. “This type of chemical sleuthing reminds me of debugging a really complicated computer program,” Huttenhower says. “It’s a series of increasingly detailed steps, so that by the time you actually fix it, you’ve had to unravel a story around a mechanism that’s quirky and specific.”
Similarly, Caltech microbiologist Sarkis Mazmanian isolated an equally persuasive molecule from the mouse gut, where it seems to be tied to and to exacerbate autistic-like behaviors. But now, as he notes, it’s his responsibility to connect that to the human gut and find out how it works and which receptors and cell types it interacts with. “I phrase it in a couple of sentences, but that’s a decade of work,” says Mazmanian. And most people don’t want to do that: “The number of papers that come out and catalogue whatever it is far outpace the number of papers that show biological function for a microbe or a molecule.”
All of these metagenomic surveys, correlations to different states of disease, and algorithmic predictions lay the brickwork for the path forward, according to Mazmanian. But at the end of the day, somebody has to decide to move forward. “I just don’t see enough people in the microbiome world doing this. I think they’re content to generate lists and move on to a different cohort of patients. I’d like to get this message out to the trainees, to the junior scientists: your job isn’t done once your DNA sequencer spits out its information and analyzes it in your algorithm or bioinformatics. I would argue that your job is just beginning.”

For Further Reading