Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis

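Applicability domain, Quantitative structure-activity relationship, Computer science, Artificial intelligence, Molecular descriptor, Robustness (evolution), Feature selection, Machine learning, Mathematics, Chemistry, Biochemistry, Gene
Authors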
Li Fu,Lu Liu,Zhi-Jiang Yang,Pan Li,Junjie Ding,Yong‐Huan Yun,Aiping Lü,Tingjun Hou,Dongsheng Cao
Source
Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Volume (Issue): 60 (1): 63-76 Citations: 40
Identifier
DOI:10.1021/acs.jcim.9b00718
Abstract

Lipophilicity, as evaluated by the n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4), is a major determinant of various absorption, distribution, metabolism, elimination, and toxicology (ADMET) parameters of drug candidates. In this study, we developed several quantitative structure–property relationship (QSPR) models to predict log D7.4 based on a large and structurally diverse data set. Eight popular machine learning algorithms were employed to build the prediction models with 43 molecular descriptors selected by a wrapper feature selection method. The results demonstrated that XGBoost yielded better prediction performance than any other single model (R_T^2 = 0.906 and RMSE_T = 0.395). Moreover, the consensus model from the top three models could continue to improve the prediction performance (R_T^2 = 0.922 and RMSE_T = 0.359). The robustness, reliability, and generalization ability of the models were strictly evaluated by the Y-randomization test and applicability domain analysis. Moreover, the group contribution model based on 110 atom types and the local models for different ionization states were also established and compared to the global models. The results demonstrated that the descriptor-based consensus model is superior to the group contribution method, and the local models have no advantage over the global models. Finally, matched molecular pair (MMP) analysis and descriptor importance analysis were performed to extract transformation rules and give some explanations related to log D7.4. In conclusion, we believe that the consensus model developed in this study can be used as a reliable and promising tool to evaluate log D7.4 in drug discovery.
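As a rough illustration of the consensus idea mentioned in the abstract, the sketch below averages the predictions of three models and scores the result with R^2 and RMSE. It makes several assumptions not confirmed by the abstract: the consensus is taken as an unweighted mean, R^2 is the usual coefficient of determination, and all numbers and names (`model_a`, `model_b`, `model_c`, `consensus`) are hypothetical toy values, not data or code from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def consensus(predictions):
    """Unweighted mean of per-compound predictions across models (an assumption)."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]

# Hypothetical toy data: observed log D7.4 for four test compounds and
# the predictions of three hypothetical single models.
y_test = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.9, 3.3, 3.8]
model_b = [0.9, 2.2, 2.8, 4.3]
model_c = [1.1, 1.8, 3.1, 4.1]

y_cons = consensus([model_a, model_b, model_c])

for name, pred in [("A", model_a), ("B", model_b), ("C", model_c), ("consensus", y_cons)]:
    print(f"{name}: R2 = {r2(y_test, pred):.3f}, RMSE = {rmse(y_test, pred):.3f}")
```

In this toy case the errors of the three models partly cancel, so the averaged prediction has a lower RMSE than any single model. That is the general motivation for consensus modeling, though the gain on real data depends on how correlated the individual models' errors are.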