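Computer science
Interpretability
Molecular graph
Encoding
Solubility
Artificial intelligence
Benchmark (surveying)
Context (archaeology)
Representation (politics)
Encoding (set theory)
Data mining
Biological system
Machine learning
Graph
Theoretical computer science
Chemistry
Paleontology
Biochemistry
Geodesy
Biology
Politics
Political science
Law
Gene
Geography
Organic chemistry
Set (abstract data type)
Programming language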
Authors
Ziyu Fan, Linying Chen, Xinyi Wu, Zhijian Huang, Lei Deng
Identifier
DOI: 10.1109/jbhi.2024.3397493
Abstract
Solubility is not only a significant physical property of molecules but also a vital factor in small-molecule drug development. Determining drug solubility demands stringent equipment, controlled environments, and substantial human and material resources. The accurate prediction of drug solubility using computational methods has long been a goal for researchers. In this study, we introduce MSCSol, a solubility prediction model that integrates multidimensional molecular structure information. We incorporate a graph neural network with geometric vector perceptrons (GVP-GNN) to encode 3D molecular structures, representing the spatial arrangement and orientation of atoms, as well as atomic sequences and interactions. We also employ Selective Kernel Convolution combined with Global and Local attention mechanisms to capture the context of molecular features at different scales. Additionally, various descriptors are calculated to enrich the molecular representation. For the 2D and 3D structural data of molecules, we design different data augmentation strategies to enhance generalization ability and prevent the model from learning irrelevant information. Extensive experiments on benchmark and independent datasets demonstrate MSCSol's superior performance. Ablation studies further confirm the effectiveness of different modules. Interpretability analysis highlights the importance of various atomic groups and substructures for solubility and verifies that our model effectively captures functional molecular structures and higher-order knowledge. The source code and datasets are freely available at https://github.com/ZiyuFanCSU/MSCSol.