Structure-Aware Multimodal Deep Learning for Drug–Protein Interaction Prediction

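Computer science; Benchmark (surveying); Artificial intelligence; Machine learning; Representation (politics); Artificial neural network; Data mining; Mean squared error; Data set; Feature learning; Protein structure prediction; Drug discovery; Test set; Deep learning; Graph; Set (abstract data type); Training set; Pattern recognition (psychology); Protein structure; Bioinformatics; Theoretical computer science; Mathematics; Law; Programming language; Geography; Statistics; Physics; Geodesy; Politics; Biology; Nuclear magnetic resonance; Political science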

Authors

Penglei Wang, Shuangjia Zheng, Yize Jiang, Chengtao Li, Junhong Liu, Chang Wen, Atanas Patronov, Dahong Qian, Hongming Chen, Yuedong Yang

Source

Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Date: 2022-02-24; Volume (issue): 62 (5): 1308-1317; Citations: 46

Links

nih.gov, doi.org

Identifiers

DOI: 10.1021/acs.jcim.2c00060

Abstract

Identifying drug–protein interactions (DPIs) is crucial in drug discovery, and a number of machine learning methods have been developed to predict them. Existing methods usually rely on unrealistic data sets with hidden bias, which limits the accuracy of virtual screening. Meanwhile, most DPI prediction methods focus on molecular representation and give little attention to protein representation or to high-level associations between different instances. To this end, we present STAMP-DPI, a structure-aware multimodal deep DPI prediction model trained on a curated industry-scale benchmark data set. We built a high-quality benchmark data set named GalaxyDB for DPI prediction. This industry-scale data set, along with an unbiased training procedure, resulted in a more robust benchmark study. For informative protein representation, we constructed a structure-aware graph neural network method that starts from the protein sequence and combines predicted contact maps with graph neural networks. By further integrating structure-based representations with high-level pretrained embeddings for molecules and proteins, our model effectively captures the feature representation of the interactions between them. As a result, STAMP-DPI outperformed state-of-the-art DPI prediction methods, reducing the mean squared error (MSE) by 7.00% on the Davis data set and improving the area under the curve (AUC) by 8.89% on the GalaxyDB data set. Moreover, the model is interpretable: its transformer-based interaction mechanism can accurately reveal the binding sites between molecules and proteins.
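To make the protein-representation idea in the abstract more concrete, the following is a minimal sketch of how a predicted residue-residue contact map could be turned into a graph and passed through one neighbor-averaging step. It is not the STAMP-DPI implementation: the function names, the 0.5 contact threshold, and the weight-free mean aggregation are all assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def contact_map_to_adjacency(contact_probs: Matrix, threshold: float = 0.5) -> Matrix:
    """Binarize a predicted contact map into a symmetric adjacency matrix.

    The threshold is an illustrative assumption, not a value from the paper.
    Self-loops are added so every residue keeps its own features.
    """
    n = len(contact_probs)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Treat i and j as connected if either direction passes the threshold.
            in_contact = max(contact_probs[i][j], contact_probs[j][i]) >= threshold
            if i == j or in_contact:
                adj[i][j] = 1.0
    return adj


def mean_aggregate(adj: Matrix, features: Matrix) -> Matrix:
    """One weight-free message-passing step: average features over neighbors."""
    dim = len(features[0])
    out = []
    for row in adj:
        degree = sum(row)  # at least 1 because of the self-loop
        agg = [0.0] * dim
        for j, connected in enumerate(row):
            if connected:
                for k in range(dim):
                    agg[k] += features[j][k]
        out.append([value / degree for value in agg])
    return out


# Toy example: three residues, where only residues 0 and 1 are predicted to be in contact.
toy_contacts = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

toy_adj = contact_map_to_adjacency(toy_contacts)
toy_hidden = mean_aggregate(toy_adj, toy_features)
```

In a trained graph neural network the averaging step would be followed by learned weights and a nonlinearity, and the residue features would come from sequence-derived or pretrained embeddings rather than hand-written vectors.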
