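Representation (politics)
Point cloud
Point (geometry)
Computer science
Cloud computing
Materials science
Artificial intelligence
Nanotechnology
Machine learning
Mathematics
Geometry
Politics
Political science
Law
Operating system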
Authors
Soroush Ahmadi, Mohammad Amin Ghanavati, Sohrab Rohani
Identifier
DOI:10.1021/acs.chemmater.3c01437
Abstract
The design and synthesis of cocrystals have emerged as promising crystal engineering strategies for enhancing the physicochemical properties of a diverse range of target molecules. A prediction strategy to identify whether a pair of target and auxiliary molecules would form a cocrystal can greatly accelerate the process of cocrystal discovery. In this study, we compiled and performed DFT calculations for 12,776 molecules (6,388 cocrystals). All entries in the database were obtained from experimental attempts reported in the literature. Electrostatic potential (ESP) surfaces were then extracted from the DFT results and used for the development of four machine learning models (PointNet, ANN, RF, Ensemble). The Ensemble model, leveraging the complementary strengths of the PointNet, ANN, and RF models, demonstrated superior discriminatory performance with a BACC (0.942) and an AUC (0.986) on the unseen test data subset. To assess the performance of the models on individual molecules, we separated the cocrystals of caffeine, fumaric acid, and salicylic acid from the overall database. The Ensemble model exhibited remarkable robustness, classifying the 312 cocrystals in this subset into their respective classes, with an average BACC of 98%. Furthermore, through conducting data analysis, 132 batches of cocrystal instances were gathered. After three batches were excluded, our proposed models were tested with these previously unseen molecules both before and after implementation of a batchwise retraining method.
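The two ideas the abstract leans on, balanced accuracy (BACC) and an ensemble that combines several classifiers, can be illustrated with a minimal sketch. The abstract does not say how the Ensemble model combines PointNet, ANN, and RF, so the plain probability averaging below is an assumption, and the function names, labels, and probabilities are hypothetical toy values, not data from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = cocrystal forms)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of true cocrystals recovered
    specificity = tn / (tn + fp)  # fraction of failed pairs recovered
    return (sensitivity + specificity) / 2


def ensemble_probability(*model_probs):
    """Average per-sample probabilities across models (assumed combination rule)."""
    return [sum(ps) / len(ps) for ps in zip(*model_probs)]


# Hypothetical per-model probabilities for four molecule pairs.
pointnet = [0.9, 0.4, 0.2, 0.1]
ann = [0.7, 0.6, 0.4, 0.2]
rf = [0.8, 0.3, 0.3, 0.3]

averaged = ensemble_probability(pointnet, ann, rf)
predicted = [1 if p >= 0.5 else 0 for p in averaged]  # 0.5 threshold is an assumption
score = balanced_accuracy([1, 1, 0, 0], predicted)
```

Balanced accuracy weights the positive and negative classes equally, which matters when successful and failed cocrystallization attempts are not equally represented in a dataset.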