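Retrosynthetic analysis
Computer science
Artificial intelligence
Set (abstract data type)
Machine learning
Computational biology
Chemistry
Biology
Programming language
Organic chemistry
Total synthesis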
Authors
Shuangjia Zheng,Tao Zeng,Chengtao Li,Binghong Chen,Connor W. Coley,Yuedong Yang,Ruibo Wu
Identifier
DOI:10.1038/s41467-022-30970-9
Abstract
The complete biosynthetic pathways are unknown for most natural products (NPs), so computer-aided bio-retrosynthesis prediction is valuable. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways of both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained on both general organic and biosynthetic reactions using end-to-end transformer neural networks. Based on this model, an AND-OR tree-based planning algorithm efficiently samples plausible biosynthetic pathways by iteratively building multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP identifies biosynthetic pathways for 90.2% of 368 test compounds and recovers the reported building blocks for 72.8% of them, which is 1.7 times more accurate than conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit, the curated datasets, and the learned models are freely available to facilitate the elucidation and reconstruction of biosynthetic pathways for NPs.
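
The AND-OR tree idea mentioned in the abstract can be illustrated with a minimal sketch. This is not BioNavi-NP's planner or API: it is a plain depth-limited search over a hand-written lookup table, and `TOY_RULES`, `BUILDING_BLOCKS`, `solve`, and all molecule names are hypothetical placeholders standing in for the trained single-step model and the real building-block set. A molecule acts as an OR node (any one candidate reaction is enough), and a reaction acts as an AND node (every precursor must itself be solved).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a single-step model: product -> candidate precursor sets.
# A real system would query a trained model instead of this table.
TOY_RULES: Dict[str, List[Tuple[str, ...]]] = {
    "target": [("unknown_x",), ("inter_a", "inter_b")],
    "inter_a": [("block_1",)],
    "inter_b": [("block_2", "block_3")],
}

# Hypothetical set of available building blocks (the leaves of the tree).
BUILDING_BLOCKS = {"block_1", "block_2", "block_3"}

Route = List[Tuple[str, Tuple[str, ...]]]


def solve(mol: str, depth: int = 5) -> Optional[Route]:
    """Return a list of (product, precursors) steps, or None if no route is found."""
    if mol in BUILDING_BLOCKS:
        return []  # already available, no reaction needed
    if depth == 0:
        return None  # depth budget exhausted
    # OR node: try each candidate reaction until one works.
    for precursors in TOY_RULES.get(mol, []):
        steps: Route = [(mol, precursors)]
        # AND node: every precursor of this reaction must be solved.
        for precursor in precursors:
            sub_route = solve(precursor, depth - 1)
            if sub_route is None:
                break  # this reaction fails, move to the next candidate
            steps.extend(sub_route)
        else:
            return steps
    return None


route = solve("target")
```

In this toy table the first candidate for `target` fails because `unknown_x` has no known precursors, so the search falls back to the second candidate and resolves both intermediates down to building blocks. A practical planner would also rank candidates by model score rather than trying them in listed order.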