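Virtual screening
Computer science
Artificial intelligence
Artificial neural network
Machine learning
Cross-validation
Deep learning
Benchmark (surveying)
Training set
Receiver operating characteristic
Predictive power
Area under the curve
Bioinformatics
Drug discovery
Biology
Philosophy
Geodesy
Epistemology
Pharmacokinetics
Geography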
Authors
Dixin Zhou, Fei Liu, Yiwen Zheng, Liangjian Hu, Tao Huang, Yu S. Huang
Identifier
DOI: 10.1016/j.compbiomed.2022.106323
Abstract
Deep learning-based virtual screening methods have been shown to significantly improve on the accuracy of traditional docking-based virtual screening methods. In this paper, we developed Deffini, a structure-based virtual screening neural network model. During training, Deffini learns from protein-ligand docking poses to distinguish actives from decoys, and it then predicts whether a new ligand will bind to the protein target. Deffini outperformed Smina, with an average AUC ROC of 0.92 and AUC PRC of 0.44 in 3-fold cross-validation on the benchmark dataset DUD-E. However, when tested on the maximum unbiased validation (MUV) dataset, Deffini performed poorly, with an average AUC ROC of 0.517. To improve performance, we trained the model with a family-specific approach and found that family-specific models performed better than pan-family models. To explore the limits of the predictive power of family-specific models, we constructed Kernie, a new protein kinase dataset consisting of 358 kinases. Deffini trained on the Kernie dataset outperformed all recent benchmarks on the MUV kinases, with an average AUC ROC of 0.745, which highlights the importance of both high-quality datasets and family-specific models in improving the performance of deep neural network models.
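
The two metrics quoted in the abstract, AUC ROC and AUC PRC, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `roc_auc` and `average_precision` are hypothetical, the labels and scores are invented toy values, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve.

```python
def roc_auc(labels, scores):
    """AUC ROC as the probability that a random active outscores a random decoy.

    Ties between an active and a decoy count as half a win.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    decoys = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def average_precision(labels, scores):
    """Average precision: mean of precision values at the rank of each active.

    Tied scores keep their input order (Python's sort is stable).
    """
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("need at least one active")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_actives


# Toy data (assumed, not from the paper): 1 = active, 0 = decoy.
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]

print(f"AUC ROC: {roc_auc(labels, scores):.3f}")            # 0.917
print(f"AUC PRC: {average_precision(labels, scores):.3f}")  # 0.833
```

Because precision depends on the ratio of actives to decoys while the ROC curve does not, a heavily imbalanced screening set can yield a high AUC ROC together with a much lower AUC PRC, which is consistent with the 0.92 and 0.44 reported on DUD-E.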