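Quantitative structure-activity relationship
Molecular descriptor
Artificial intelligence
Computer science
Machine learning
Training set
Graph
Chemistry
Theoretical computer science
Authors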
Dmitry Zankov,Mariia Matveieva,Aleksandra Nikonenko,Ramil Nugmanov,Igor I. Baskin,Alexandre Varnek,Pavel Polishchuk,Timur Madzhidov
Identifier
DOI:10.1021/acs.jcim.1c00692
Abstract
Modern QSAR approaches have wide practical applications in drug discovery for designing potentially bioactive molecules. If such models are based on 2D descriptors, important information contained in the spatial structures of molecules is lost. The major problem in constructing models using 3D descriptors is the choice of a putative bioactive conformation, which affects the predictive performance. The multi-instance (MI) learning approach, which considers multiple conformations during model training, could be a reasonable solution to this problem. In this study, we implemented several multi-instance algorithms, both conventional and based on deep learning, and investigated their performance. We compared the performance of MI-QSAR models with those based on the classical single-instance QSAR (SI-QSAR) approach, in which each molecule is encoded by either 2D descriptors computed for the corresponding molecular graph or 3D descriptors computed for a single lowest-energy conformation. The calculations were carried out on 175 data sets extracted from the ChEMBL23 database. It is demonstrated that (i) MI-QSAR outperforms SI-QSAR in numerous cases and (ii) MI algorithms can automatically identify plausible bioactive conformations.
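The difference between the single-instance and multi-instance encodings described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one simple conventional MI strategy (averaging the descriptor vectors of all conformers in a molecule's "bag") and a closed-form ridge regression as the learner. The helper names (`bag_mean_pool`, `lowest_energy_instance`, `fit_ridge`, `predict`), the random toy descriptors, and the toy energies are all hypothetical stand-ins for real 3D descriptors and conformer energies.

```python
import numpy as np

def bag_mean_pool(bag):
    """MI encoding (assumed strategy): average all conformer descriptor vectors."""
    return np.asarray(bag, dtype=float).mean(axis=0)

def lowest_energy_instance(bag, energies):
    """SI encoding: keep only the descriptors of the lowest-energy conformer."""
    return np.asarray(bag, dtype=float)[int(np.argmin(energies))]

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def predict(X, w, b):
    """Apply the fitted linear model to a descriptor matrix."""
    return np.asarray(X, dtype=float) @ w + b

# Toy data (hypothetical): each molecule is a bag of 2-5 conformers,
# each conformer a vector of 5 made-up descriptors with a made-up energy.
rng = np.random.default_rng(0)
n_molecules, n_descriptors = 40, 5
bags, energies = [], []
for _ in range(n_molecules):
    n_conf = int(rng.integers(2, 6))
    bags.append(rng.normal(size=(n_conf, n_descriptors)))
    energies.append(rng.uniform(0.0, 10.0, size=n_conf))
y = rng.normal(size=n_molecules)  # placeholder activity values

# One row per molecule in both encodings; only the source of the row differs.
X_mi = np.stack([bag_mean_pool(b) for b in bags])
X_si = np.stack([lowest_energy_instance(b, e) for b, e in zip(bags, energies)])

w_mi, b_mi = fit_ridge(X_mi, y)
w_si, b_si = fit_ridge(X_si, y)
pred_mi = predict(X_mi, w_mi, b_mi)
pred_si = predict(X_si, w_si, b_si)
```

Mean pooling is only one way to aggregate a bag. Because it weights every conformer equally, it cannot by itself point to a likely bioactive conformation; that would require a model that learns per-conformer weights, which this sketch does not attempt.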