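Interpretability
Computer science
Machine learning
Redundancy (engineering)
Ranking (information retrieval)
Artificial intelligence
Generalization
Data mining
Graph
Virtual screening
Theoretical computer science
Drug discovery
Mathematics
Bioinformatics
Mathematical analysis
Biology
Operating system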
Authors
Duanhua Cao, Geng Chen, Jiaxin Jiang, Jie Yu, Runze Zhang, Mingan Chen, Wensheng Zhang, Lifan Chen, Feisheng Zhong, Yingying Zhang, Chenghao Lu, Xutong Li, Xiaomin Luo, Sulin Zhang, Mingyue Zheng
Identifier
DOI:10.1101/2023.06.18.545464
Abstract
Developing robust methods for evaluating protein-ligand interactions has been a long-standing problem. Here, we propose a novel approach called EquiScore, which utilizes an equivariant heterogeneous graph neural network to integrate physical prior knowledge and characterize protein-ligand interactions in equivariant geometric space. To improve generalization performance, we constructed a dataset called PDBscreen and designed multiple data augmentation strategies suitable for training scoring methods. We also analyzed potential risks of data leakage in commonly used data-driven modeling processes and proposed a more stringent redundancy removal scheme to alleviate this problem. On two large external test sets, EquiScore outperformed 21 methods across a range of screening performance metrics, and this performance was insensitive to binding pose generation methods. EquiScore also showed good performance on the activity ranking task of a series of structural analogs, indicating its potential to guide lead compound optimization. Finally, we investigated different levels of interpretability of EquiScore, which may provide more insights into structure-based drug design.