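Computational biology
Workflow
Phylogenetic tree
Genome
Cluster analysis
Identification (biology)
Similarity (geometry)
Biology
Gene
Computer science
Data mining
Genetics
Ecology
Machine learning
Artificial intelligence
Database
Image (mathematics)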
Authors
Jorge C. Navarro-Muñoz,Nelly Sélem‐Mójica,Michael W. Mullowney,Satria A. Kautsar,James H. Tryon,Elizabeth I. Parkinson,Emmanuel L. C. de los Santos,Marley Yeong,Pablo Cruz‐Morales,Sahar Abubucker,Arne Roeters,Wouter Lokhorst,Antonio Fernàndez-Guerra,Luciana Teresa Dias Cappelini,Anthony W. Goering,Regan J. Thomson,William W. Metcalf,Neil L. Kelleher,Francisco Barona‐Gómez,Marnix H. Medema
Identifier
DOI:10.1038/s41589-019-0400-9
Abstract
Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.
Two bioinformatic tools, BiG-SCAPE and CORASON, enable sequence similarity network and phylogenetic analysis of gene clusters and their families across hundreds of strains and in large datasets, leading to the discovery of new natural products.