On Some Novel Similarity-Based Functions Used in the ML-Based q-RASAR Approach for Efficient Quantitative Predictions of Selected Toxicity End Points

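Quantitative structure–activity relationship; Similarity (geometry); Support vector machine; Molecular descriptor; Partial least squares regression; Artificial intelligence; Computer science; Test set; Hyperparameter; Pattern recognition (psychology); Random forest; Set (abstract data type); Linear regression; Data mining; Machine learning; Mathematics; Image (mathematics); Programming language
Authors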
Arkaprava Banerjee,Kunal Roy
Source
Journal: Chemical Research in Toxicology [American Chemical Society]
Volume (issue): 36 (3): 446-464 Citations: 48
Identifier
DOI:10.1021/acs.chemrestox.2c00374
Abstract

The novel quantitative read-across structure–activity relationship (q-RASAR) approach uses read-across-derived similarity functions in the quantitative structure–activity relationship (QSAR) modeling framework in a unique way for supervised model generation. The aim of this study is to explore how this workflow enhances the external (test set) prediction quality of conventional QSAR models by the incorporation of some novel similarity-based functions as additional descriptors using the same level of chemical information. To establish this, five different toxicity data sets, for which QSAR models were reported previously, have been considered in the q-RASAR modeling exercise, which uses chemical similarity-derived measures. The identical sets of chemical features along with the same compositions of training and test sets as reported previously were used in the present analysis for ease of comparison. The RASAR descriptors were calculated based on a chosen similarity measure with the default setting of relevant hyperparameter(s) and were then clubbed with the original structural and physicochemical descriptors, and the number of selected features was further optimized by employing a grid search technique applied on the respective training sets. These features were then used to develop multiple linear regression (MLR) q-RASAR models that show enhanced predictivity as compared to the QSAR models developed previously. Moreover, various other ML algorithms like support vector machine (SVM), linear SVM, random forest, partial least squares, and ridge regression were also employed using the same feature combinations as used in the MLR models to compare the prediction qualities. 
The q-RASAR models for five different data sets possess at least one of the RASAR descriptors, RA function, gm, and average similarity, suggesting that these are important determinants of similarities that contribute to the development of predictive q-RASAR models, as also evident from the SHAP analysis of the models.
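The descriptor-augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian-kernel similarity, the neighbor count `k`, the kernel width `sigma`, and the two derived columns (average similarity and a similarity-weighted mean response of the nearest training compounds) are assumptions chosen for illustration, and the paper's own definitions of the RA function and gm are not reproduced here. The toy data are invented.

```python
import math

def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian-kernel similarity in (0, 1] based on squared Euclidean distance."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def similarity_descriptors(query, ref_X, ref_y, k=3, sigma=1.0):
    """Return (average similarity, similarity-weighted response) computed over
    the k reference compounds most similar to the query.
    Illustrative stand-ins only, not the paper's descriptor definitions."""
    sims = [(gaussian_similarity(query, x, sigma), y) for x, y in zip(ref_X, ref_y)]
    top = sorted(sims, key=lambda t: t[0], reverse=True)[:k]
    total = sum(s for s, _ in top)
    avg_sim = total / len(top)
    weighted = sum(s * y for s, y in top) / total if total > 0 else float("nan")
    return avg_sim, weighted

def augment(rows, train_X, train_y, k=3, sigma=1.0, is_training=False):
    """Append the two similarity-based columns to each descriptor row.
    For training rows, the compound itself is left out of the reference set so
    that its own response does not leak into its descriptors."""
    out = []
    for i, row in enumerate(rows):
        if is_training:
            ref_X = train_X[:i] + train_X[i + 1:]
            ref_y = train_y[:i] + train_y[i + 1:]
        else:
            ref_X, ref_y = train_X, train_y
        out.append(list(row) + list(similarity_descriptors(row, ref_X, ref_y, k, sigma)))
    return out

# Hypothetical toy data: two descriptors per compound and one response value.
train_X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
train_y = [1.0, 2.0, 3.0, 10.0]
test_X = [[0.5, 0.5]]

train_aug = augment(train_X, train_X, train_y, k=2, is_training=True)
test_aug = augment(test_X, train_X, train_y, k=2)
```

The augmented rows could then be passed to any regressor (for example MLR, as in the abstract), with `k` and `sigma` tuned on the training set only.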