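Computational biology
Coding
Genomics
Actinobacteria
Genome
Natural product
Biology
Gene
Drug discovery
Metabolomics
Comparative genomics
Functional genomics
Genetics
Bioinformatics
Biochemistry
16S ribosomal RNA
Authors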
James R. Doroghazi,Jessica C. Albright,Anthony W. Goering,Kou‐San Ju,Robert R. Haines,Konstantin Tchalukov,David P. Labeda,Neil L. Kelleher,William W. Metcalf
Identifier
DOI:10.1038/nchembio.1659
Abstract
A global bioinformatic classification of >11,000 biosynthetic gene clusters from >800 bacterial genomes and cross-correlation with metabolomics data from nearly 200 strains sets the stage for targeted natural product discovery. Actinobacteria encode a wealth of natural product biosynthetic gene clusters, whose systematic study is complicated by numerous repetitive motifs. By combining several metrics, we developed a method for the global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic capacity of Actinobacteria in 830 genome sequences, including 344 obtained for this project. The GCF network, comprising 11,422 gene clusters grouped into 4,122 GCFs, was validated in hundreds of strains by correlating confident mass spectrometric detection of known small molecules with the presence or absence of their established biosynthetic gene clusters. The method also linked previously unassigned GCFs to known natural products, an approach that will enable de novo, bioassay-free discovery of new natural products using large data sets. Extrapolation from the 830-genome data set reveals that Actinobacteria encode hundreds of thousands of future drug leads, and the strong correlation between phylogeny and GCFs frames a roadmap to efficiently access them.
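The two steps the abstract describes, grouping gene clusters into families and checking families against metabolite detection, can be illustrated with a toy sketch. The abstract says only that the method combines "several metrics", so everything here is an assumption for illustration: a single Jaccard similarity over protein-domain sets stands in for the combined score, the 0.5 threshold is arbitrary, the grouping is plain single-linkage, and the cluster, domain, and strain names are hypothetical. It is not the authors' implementation.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (defined as 1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_into_families(clusters, threshold=0.5):
    """Group clusters into families by single linkage.

    `clusters` maps a cluster name to its set of domain labels.
    Two clusters share a family when a chain of pairwise similarities
    >= threshold connects them (connected components via union-find).
    The similarity metric and threshold are illustrative assumptions.
    """
    names = list(clusters)
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard(clusters[a], clusters[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for n in names:
        families.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in families.values())


def presence_absence_agreement(has_family, has_compound, all_strains):
    """Fraction of strains where family presence matches compound detection."""
    matches = sum((s in has_family) == (s in has_compound) for s in all_strains)
    return matches / len(all_strains)


# Hypothetical toy data: domain labels are placeholders, not real annotations.
toy_clusters = {
    "c1": {"KS", "AT", "ACP"},
    "c2": {"KS", "AT", "ACP", "TE"},
    "c3": {"A", "C", "PCP"},
    "c4": {"A", "C", "PCP", "TE"},
    "c5": {"terpene_synthase"},
}
toy_families = group_into_families(toy_clusters)

strains = {"s1", "s2", "s3", "s4"}
agreement = presence_absence_agreement({"s1", "s2"}, {"s1"}, strains)
```

With this toy input, `c1`/`c2` and `c3`/`c4` each share three of four domains and fall into the same family, while `c5` remains a singleton. The agreement score is a deliberately simple stand-in for the presence/absence correlation mentioned in the abstract; a real analysis would need a proper statistical test across many strains.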