Quantitative structure-activity relationship
Bioinformatics
Applicability domain
Molecular descriptor
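Test set
Predictive modelling
Cross-validation
Training set
Drug discovery
Machine learning
Set (abstract data type)
Human plasma
Computer science
Chemistry
Artificial intelligence
Computational biology
Biological system
Chromatography
Biology
Biochemistry
Gene
Programming language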
Authors
Lixia Sun, Hong-Chang Yang, Jie Li, Tianduanyi Wang, Weihua Li, Guixia Liu, Yun Tang
Source
Journal: ChemMedChem
[Wiley]
Date: 2017-11-10
Volume (issue): pages: 13 (6): 572-581
Citations: 55
Identifier
DOI:10.1002/cmdc.201700582
Abstract
Plasma protein binding (PPB) is a significant pharmacokinetic property of compounds in drug discovery and design. Because experimental assays are costly and time-consuming, in silico approaches have been developed to assess the binding profiles of chemicals. However, because of ambiguity in the experimental data and the lack of uniform measurements, most available predictive models are far from satisfactory. In this study, an elaborately curated training set containing 967 diverse pharmaceuticals with plasma-protein-bound fractions (fb) was used to construct quantitative structure-activity relationship (QSAR) models with six machine learning algorithms and 26 molecular descriptors. Furthermore, all of the individual learners were combined to yield a consensus prediction, which marginally improved accuracy over the individual models. Model performance was estimated by tenfold cross-validation and on three external validation sets comprising 242 pharmaceutical, 397 industrial, and 231 newly designed chemicals, respectively. The models showed excellent performance for the entire test set, with mean absolute error (MAE) ranging from 0.126 to 0.178, demonstrating that the models could be used by a chemist when drawing a molecular structure from scratch. Meanwhile, the structural descriptors that contributed significantly to the predictive power of the models were related to the binding mechanisms, and the trends in their effects on PPB can serve as guidance for the structural modification of chemicals. The applicability domain was also defined to distinguish favorable predictions from unfavorable ones.
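As an illustration of the consensus step and the MAE metric mentioned in the abstract, the following is a minimal sketch only. The abstract does not say how the individual learners were combined, so the unweighted average used here is an assumption. The function names and all numeric values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observed and predicted bound fractions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def consensus_prediction(per_model_predictions):
    """Combine learners by an unweighted average per compound.

    Assumption: simple averaging; the paper's actual combination rule
    is not stated in the abstract.
    """
    return [mean(values) for values in zip(*per_model_predictions)]


# Hypothetical fb values (fractions between 0 and 1) for four compounds.
observed = [0.90, 0.50, 0.20, 0.75]
model_a = [0.80, 0.60, 0.30, 0.70]
model_b = [1.00, 0.50, 0.10, 0.90]
model_c = [0.90, 0.40, 0.20, 0.80]

consensus = consensus_prediction([model_a, model_b, model_c])
consensus_mae = mean_absolute_error(observed, consensus)
individual_maes = [
    mean_absolute_error(observed, preds)
    for preds in (model_a, model_b, model_c)
]

print("Consensus MAE:", round(consensus_mae, 4))
print("Individual MAEs:", [round(m, 4) for m in individual_maes])
```

In this toy data the errors of the individual models partly cancel when averaged, which is the usual motivation for a consensus model. Real data need not behave this way.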