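Quantitative structure-activity relationship
Applicability domain
Partial least squares regression
Support vector machine
Boosting (machine learning)
Molecular descriptor
Drug discovery
Test set
Cross-validation
Artificial intelligence
Multivariate statistics
Chemistry
Computer science
Linear regression
Biological system
Machine learning
Biology
Biochemistry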
Authors
Sheng Wang,Jie Dong,Yin-Hua Deng,Minfeng Zhu,Ming Wen,Zhi‐Jiang Yao,Aiping Lü,Jianbing Wang,Dongsheng Cao
Identifier
DOI:10.1021/acs.jcim.5b00642
Abstract
The Caco-2 cell monolayer model is a popular surrogate in predicting the in vitro human intestinal permeability of a drug due to its morphological and functional similarity with human enterocytes. A quantitative structure-property relationship (QSPR) study was carried out to predict Caco-2 cell permeability of a large data set consisting of 1272 compounds. Four different methods including multivariate linear regression (MLR), partial least-squares (PLS), support vector machine (SVM) regression and Boosting were employed to build prediction models with 30 molecular descriptors selected by nondominated sorting genetic algorithm-II (NSGA-II). The best Boosting model was obtained finally with R² = 0.97, RMSEF = 0.12, Q² = 0.83, RMSECV = 0.31 for the training set and R_T² = 0.81, RMSET = 0.31 for the test set. A series of validation methods were used to assess the robustness and predictive ability of our model according to the OECD principles and then define its applicability domain. Compared with the reported QSAR/QSPR models about Caco-2 cell permeability, our model exhibits certain advantage in database size and prediction accuracy to some extent. Finally, we found that the polar volume, the hydrogen bond donor, the surface area and some other descriptors can influence the Caco-2 permeability to some extent. These results suggest that the proposed model is a good tool for predicting the permeability of drug candidates and to perform virtual screening in the early stage of drug development.
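The abstract reports fitted statistics (R², RMSEF) alongside cross-validated ones (Q², RMSECV). The sketch below illustrates how such pairs of statistics are commonly computed; it is not the authors' implementation. It assumes Q² is defined as 1 - PRESS/SS_tot using the overall mean, uses leave-one-out cross-validation, and swaps in a one-variable least-squares line as a stand-in for the Boosting model. The data and all function names are made up for illustration.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in, not a Boosting model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_predictions(xs, ys):
    """Leave-one-out: predict each point with a model fit without it."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


# Made-up toy data: one hypothetical descriptor and a hypothetical response.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

a, b = fit_line(xs, ys)
fitted = [a + b * x for x in xs]      # predictions on the data used for fitting
cv_pred = loo_predictions(xs, ys)     # predictions on held-out points

r2, rmse_fit = r_squared(ys, fitted), rmse(ys, fitted)
q2, rmse_cv = r_squared(ys, cv_pred), rmse(ys, cv_pred)
print(f"R2={r2:.3f} RMSE(fit)={rmse_fit:.3f} Q2={q2:.3f} RMSE(CV)={rmse_cv:.3f}")
```

Because each held-out point is predicted by a model that never saw it, the cross-validated error is at least as large as the fitted error for a least-squares model, which is the same direction as the gap between R² and Q² reported in the abstract.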