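Chemistry
Molecular descriptor
Polar
Permeation
Bioinformatics
Polar surface area
Hydrogen bond
Permeability
Feature selection
Data set
Membrane
Quantitative structure-activity relationship
Biological system
Molecule
Stereochemistry
Artificial intelligence
Statistics
Biochemistry
Organic chemistry
Mathematics
Astronomy
Physics
Gene
Biology
Computer science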
Authors
Hanne H. F. Refsgaard,Berith F. Jensen,Per B. Brockhoff,Søren Berg Padkjær,Mette Guldbrandt,Michael Christensen
Abstract
A data set of 712 compounds was used for classification into two classes with respect to membrane permeation in a cell-based assay: (0) apparent permeability (P(app)) below 4 × 10^-6 cm/s and (1) P(app) of 4 × 10^-6 cm/s or higher. Nine molecular descriptors were calculated for each compound, and nearest-neighbor classification was applied using five neighbors, as optimized by full cross-validation. A model based on five descriptors (number of flexible bonds, number of hydrogen bond acceptors, number of hydrogen bond donors, molecular surface area, and polar surface area) was selected by variable selection. In an external test set of 112 compounds, 104 compounds were classified and 8 were judged as "unknown". Among the 104 classified compounds, 16 were misclassified, corresponding to a misclassification rate of 15%, and no compounds were falsely predicted to be in the nonpermeable class.
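The classification scheme described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance on unscaled descriptor vectors and a simple vote-count rule for returning "unknown"; the abstract specifies neither the distance metric, nor any descriptor scaling, nor how "unknown" was decided, so those choices, the function names, and the toy data are hypothetical and not the authors' method.

```python
import math
from collections import Counter

THRESHOLD_CM_PER_S = 4e-6  # class boundary stated in the abstract


def permeability_class(p_app):
    """Return 1 if P(app) >= 4e-6 cm/s, otherwise 0."""
    return 1 if p_app >= THRESHOLD_CM_PER_S else 0


def knn_predict(train_x, train_y, query, k=5, min_votes=None):
    """Majority vote among the k nearest training points.

    Assumption: Euclidean distance. If min_votes is set and the winning
    class gets fewer votes, return "unknown" (a hypothetical abstention rule).
    """
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    label, count = votes.most_common(1)[0]
    if min_votes is not None and count < min_votes:
        return "unknown"
    return label


def loo_error(xs, ys, k):
    """Leave-one-out misclassification rate for a given k."""
    errors = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)


def misclassification_rate(n_errors, n_classified):
    """Fraction of classified compounds that were assigned the wrong class."""
    return n_errors / n_classified


# Toy two-descriptor data, invented for illustration only.
toy_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.1, 0.1),
         (2.0, 2.0), (2.1, 2.2), (2.2, 2.1), (2.3, 2.3), (2.2, 2.4), (2.1, 2.1)]
toy_y = [0] * 6 + [1] * 6

if __name__ == "__main__":
    print(knn_predict(toy_x, toy_y, (0.15, 0.15)))               # clear class-0 region
    print(knn_predict(toy_x, toy_y, (1.1, 1.1), min_votes=5))    # split vote
    print(loo_error(toy_x, toy_y, k=5))
    print(round(misclassification_rate(16, 104), 2))             # figures from the abstract
```

Selecting k would amount to evaluating loo_error for several candidate values and keeping the one with the lowest error; the abstract reports that five neighbors was the optimum under full cross-validation. The last line reproduces the reported rate: 16 errors among 104 classified compounds is about 15%.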