Natural products
Classifier (UML)
Carbon-13 NMR
Artificial intelligence
Data set
Computer science
Chemistry
Machine learning
Stereochemistry
Authors
Saúl H. Martínez-Treviño, Víctor Uc-Cetina, María A. Fernández-Herrera, Gabriel Merino
Identifier
DOI:10.1021/acs.jcim.0c00293
Abstract
Structure elucidation of chemical compounds is a complex and challenging activity that requires expertise and well-suited tools. To assign the molecular structure of a given compound, 13C NMR is one of the most widely used techniques because of the broad range of structural information it provides. Because molecules found in nature can be grouped into natural product (NP) classes based on structural similarities, we explore the possibility of predicting the NP class from 13C NMR data. Using freely available 13C NMR data of NPs, we trained four classifiers to predict eight common NP classes. The best performance was obtained with the XGBoost classifier, which reached F1-scores above 0.82. We also performed experiments with different percentages of positive samples, including the presence of glycosides. Furthermore, we tested cases outside the data set, yielding performances above 80% for most classes. For chromans, we restricted the test examples to the coumarin subclass, and the prediction accuracy increased to 100%.
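The abstract does not say how the 13C NMR data were encoded or how the F1-score was computed, so the following is only a minimal sketch under stated assumptions. It shows one common way to turn a list of 13C chemical shifts into a fixed-length feature vector (counting peaks per ppm bin) and how a binary F1-score is derived from precision and recall. The function names `bin_shifts` and `f1_score`, the 0 to 240 ppm range, and the 2 ppm bin width are illustrative choices, not details taken from the paper.

```python
from typing import List, Sequence


def bin_shifts(
    shifts: Sequence[float],
    lo: float = 0.0,      # assumed lower bound of the 13C window (ppm)
    hi: float = 240.0,    # assumed upper bound of the 13C window (ppm)
    width: float = 2.0,   # assumed bin width (ppm)
) -> List[int]:
    """Count 13C peaks per ppm bin to get a fixed-length feature vector."""
    n_bins = int((hi - lo) / width)
    vec = [0] * n_bins
    for s in shifts:
        # Peaks outside the assumed window are ignored.
        if lo <= s < hi:
            vec[int((s - lo) // width)] += 1
    return vec


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical peak list, used only to show the shape of the output.
    features = bin_shifts([10.5, 11.0, 160.2])
    print(len(features), sum(features))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

A vector produced this way could be passed to any standard classifier, with one binary target per NP class. Whether the authors used binning, a different bin width, or another representation is not stated in the abstract.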