Authors
B. J. Levin, Yolanda Y. Huang, Spencer C. Peck, Y. Wei, A. Martínez-del Campo, Jonathan A. Marks, Eric A. Franzosa, Curtis Huttenhower, Emily P. Balskus
Abstract
INTRODUCTION The microbes that live in and on our bodies (the human microbiome) profoundly affect human health and disease. For example, within the lower gastrointestinal tract, microbes employ powerful enzymatic chemistry to access recalcitrant nutrients and generate metabolites that mediate interactions with host cells. Despite the vast amounts of available sequencing data from human microbiomes, we know surprisingly little about the precise mechanisms by which these activities influence human biology. This knowledge gap arises in part from our poor understanding of microbial enzymes and metabolic processes. Collectively, the genes present in microbiomes (metagenomes) encode millions of uncharacterized enzymes, and approaches are needed to connect these genes to biochemical functions.
RATIONALE Efforts to identify the microbial activities encoded within metagenomes (functional profiling) have largely focused on assigning protein sequences found in these data sets to overarching processes (e.g., “vitamin biosynthesis”) or large enzyme superfamilies whose members carry out many different chemical reactions. These methods therefore provide limited information about specific enzymes of interest and cannot easily differentiate superfamily members with known and unknown functions. Addressing this problem requires incorporating a mechanistic understanding of how amino acid sequence influences enzymatic activity into metagenomic analyses. We envisioned developing a “chemically guided” functional profiling strategy that would use protein sequence similarity network (SSN) analysis to distinguish functionally distinct members of large enzyme superfamilies and integrate this information into quantitative metagenomics. This method would not only quantify different types of enzymes in metagenomic and metatranscriptomic data sets, but also pinpoint enzymes of unknown function in communities, prioritizing them for further study on the basis of their abundance and distribution.
We initially applied this workflow to profile the glycyl radical enzyme (GRE) superfamily, which is one of the most enriched protein families in the human gut microbiome. GREs are O₂-sensitive enzymes that catalyze key transformations in anaerobic microbial metabolism, including carbohydrate utilization and DNA synthesis. Although the activities of certain gut microbial GREs have been connected to heart, liver, and kidney diseases, as well as autism, numerous members of this superfamily have not yet been biochemically characterized.
RESULTS We determined the abundance of individual types of GREs in 378 metagenomes from healthy humans, including two aerobic body sites (vagina and skin), three microaerobic body sites (tongue, inner cheek, and dental plaque), and one anaerobic body site (gut). The human gut microbiome contained the largest number of distinct GREs, many of which have unknown functions. Our analysis provided new information about known GRE-mediated activities, including production of the disease-associated metabolites trimethylamine and p-cresol. In vitro studies of abundant, uncharacterized GREs from the human gut revealed that radical-based dehydration chemistry is widespread in this environment and led to the discovery of trans-4-hydroxy-L-proline (Hyp) dehydratase. This enzyme enables gut commensals and human pathogens like Clostridium difficile to metabolize Hyp, a nonproteinogenic amino acid that is rare in bacteria but is an abundant posttranslational modification in eukaryotes. The universal distribution of this activity in human gut microbiomes suggests that it plays an important role in this habitat, setting the stage for future hypothesis-driven research.
CONCLUSION By accurately identifying enzymes present in microbial communities, this workflow allows ecological context to inform enzyme characterization, uncovering widespread but previously unappreciated metabolic activities.
We are now poised to apply this strategy to examine various patient populations, additional protein superfamilies, and other microbiomes.