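Chemistry
Natural product
Workflow
Pipeline (software)
Similarity (geometry)
Computational biology
Nanotechnology
Stereochemistry
Artificial intelligence
Biology
Database
Computer science
Materials science
Image (mathematics)
Programming language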
Authors
Yanghao Sheng,Jue Wang,Shao Liu,Yueping Jiang
Identifier
DOI:10.1021/acs.analchem.3c04746
Abstract
Molecular networking has emerged as a standard approach for natural product (NP) discovery. However, the current pipeline based on molecular networks tends to prioritize larger clusters comprising multiple nodes. To address this issue, we present the integrated molecular networking workflow for NP dereplication (IMN4NPD). This approach not only expedites the dereplication of extensive clusters within the molecular network but also places specific emphasis on self-looped nodes and pairs of nodes, which are often overlooked by current methods. By combining the outputs of various computational tools, we efficiently dereplicate compounds falling into specific categories and provide annotations for both large-cluster nodes and self-looped or paired nodes within the molecular network. Furthermore, we have incorporated several fundamentally distinct similarity algorithms, namely, Spec2Vec and MS2DeepScore, for constructing the t-SNE network. Through comparison with modified cosine similarity, we observed that integrating additional diverse spectral similarity measures into the t-SNE network enhanced the ability to dereplicate NPs. Using an ethanol extract of Plumula nelumbinis as a use case, we illustrate that integrating multiple computational solutions with IMN4NPD aids dereplication, especially of self-looped nodes, and the discovery of novel compounds in NPs.
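The distinction the abstract draws between large clusters, pairs of nodes, and self-looped nodes can be illustrated with a minimal sketch. This is not the IMN4NPD implementation: the function name `classify_components`, the 0.7 similarity threshold, and the toy scores are all assumptions made for illustration. The sketch links spectra whose pairwise similarity meets the threshold and then labels each connected component by its size.

```python
from collections import defaultdict


def classify_components(nodes, scored_pairs, threshold=0.7):
    """Label connected components of a similarity network by size.

    nodes: iterable of node identifiers (one per spectrum).
    scored_pairs: iterable of (node_a, node_b, similarity) tuples.
    threshold: assumed cutoff for drawing an edge (hypothetical value).
    """
    # Build an undirected adjacency map from pairs that pass the threshold.
    adjacency = defaultdict(set)
    for a, b, score in scored_pairs:
        if a != b and score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    result = {"self_looped": [], "pair": [], "cluster": []}
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collects every node reachable from `start`.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        component.sort()
        if len(component) == 1:
            result["self_looped"].append(component)  # node with no neighbors
        elif len(component) == 2:
            result["pair"].append(component)
        else:
            result["cluster"].append(component)
    return result


# Toy data (invented): six spectra with a few pairwise similarity scores.
toy_nodes = ["A", "B", "C", "D", "E", "F"]
toy_scores = [("A", "B", 0.9), ("B", "C", 0.8), ("D", "E", 0.75), ("E", "F", 0.3)]
groups = classify_components(toy_nodes, toy_scores)
```

With the toy data, A, B, and C form one cluster, D and E form a pair, and F remains a self-looped node because its only score falls below the assumed threshold. In practice the similarity scores would come from a spectral similarity measure such as modified cosine, Spec2Vec, or MS2DeepScore rather than hand-written values.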