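Mean squared error
Random forest
Mathematics
Group (periodic table)
Set (abstract data type)
Test set
Molecular descriptor
Root mean square
Mean absolute error
Correlation coefficient
Atom (system on chip)
Training set
Quantitative structure-activity relationship
Chemistry
Statistics
Artificial intelligence
Computer science
Stereochemistry
Physics
Organic chemistry
Embedded system
Programming language
Quantum mechanics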
Authors
Ding-ling Kong, Yue Luan, Xiaowei Zhao, Yanhua Lü, Wei Li, Qingyou Zhang, Aimin Pang
Identifier
DOI:10.1016/j.chemolab.2023.105021
Abstract
A total of 17,817 compounds were collected from the Bradley open melting point data set; they contain the elements C, H, O, N, F, S, Cl, Br, and I. An extended atom-based and bond-based group contribution descriptor is proposed to represent these compounds. It consists of a one-dimensional descriptor based on the molecular formula, a two-dimensional group contribution descriptor based on atoms and bonds, and a structural feature descriptor. Random forest (RF), partial least squares (PLS), and deep learning (DL) methods were used to build models for predicting melting points, and the models were evaluated by the correlation coefficient (R), mean absolute error (MAE), and root-mean-square error (RMSE). The random forest model gave the best results: out-of-bag (OOB) cross-validation on the training set gave R = 0.8977, MAE = 29.57 °C, and RMSE = 40.34 °C; prediction on the test set gave R = 0.8982, MAE = 29.68 °C, and RMSE = 40.63 °C. These results are better than those reported in the literature for a subset of this data set. The model was also applied to an external data set of 74 compounds taken from another publication, giving R = 0.8946, MAE = 24.51 °C, and RMSE = 34.19 °C, which is significantly better than the corresponding results in that publication. Combining the proposed descriptor with RDKit descriptors, which carry information such as charge and electronegativity, improved the results further: OOB cross-validation on the training set gave R = 0.9013, MAE = 29.25 °C, and RMSE = 39.76 °C; the test set gave R = 0.9017, MAE = 29.34 °C, and RMSE = 40.07 °C.
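The three evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: it assumes that R is the Pearson correlation coefficient between observed and predicted melting points (the abstract does not state which definition was used), the function name `evaluate` is hypothetical, and the numbers are made up for demonstration.

```python
import math


def evaluate(y_true, y_pred):
    """Return (R, MAE, RMSE) for paired observed and predicted values.

    R is computed as the Pearson correlation coefficient, which is an
    assumption here; the abstract only says "correlation coefficient".
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Pearson R: covariance divided by the product of the standard deviations
    # (the 1/n factors cancel, so plain sums are used).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(var_t * var_p)

    # MAE: mean of the absolute errors.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return r, mae, rmse


# Made-up melting points in degrees Celsius, for illustration only
# (these are not data from the paper).
observed = [10.0, 50.0, 120.0, 200.0]
predicted = [20.0, 40.0, 130.0, 190.0]

r, mae, rmse = evaluate(observed, predicted)
print(f"R = {r:.4f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

MAE and RMSE carry the unit of the predicted quantity (°C here), whereas R is dimensionless. RMSE weights large errors more heavily than MAE, which is why the reported RMSE values are consistently higher than the MAE values.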