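Molecular graph
Interpretability
Computer science
Encoding (memory)
Encoding
Graph
Representation (politics)
Property (philosophy)
Molecular model
Artificial intelligence
Data mining
Theoretical computer science
Chemistry
Stereochemistry
Gene
Political science
Biochemistry
Law
Philosophy
Epistemology
Politics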
Authors
Yusheng Hao,Xing Chen,Ailu Fei,Qifeng Jia,Yu Chen,Jinsong Shao,S. Sundara Pandiyan,Li Wang
Source
Journal: Molecules
[MDPI AG]
Date: 2024-01-19
Volume (Issue): 29 (2): 492
Identifier
DOI:10.3390/molecules29020492
Abstract
Existing formats based on the simplified molecular input line entry system (SMILES) encoding and molecular graph structure are designed to encode the complete semantic and structural information of molecules. However, the physicochemical properties of molecules are complex, and a single encoding of molecular features, whether from SMILES sequences or from molecular graph structures, cannot adequately represent molecular information. To address this problem, this study proposes a sequence graph cross-attention (SG-ATT) representation architecture for molecular property prediction, which uses domain knowledge to enhance molecular graph feature encoding and combines it with the features of molecular SMILES sequences. SG-ATT fuses the two-dimensional molecular features so that the model input contains both molecular structure information and semantic information. SG-ATT was tested on nine molecular property prediction tasks. The largest performance improvement was 4.5%, on the BACE dataset, and the average improvement across the full set of datasets was 1.83%. Additionally, model interpretability studies were conducted to showcase the performance of the SG-ATT model on different datasets, and in-depth analysis was provided through case studies of in vitro validation. Finally, network tools for molecular property prediction were developed for use by researchers.
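The abstract does not give the internals of SG-ATT, so the following is only a generic illustration of what cross-attention between a sequence branch and a graph branch can look like, not the authors' implementation. It is a minimal sketch of scaled dot-product cross-attention in plain Python, assuming that SMILES token features act as queries and atom (graph node) features act as keys and values. The function names, the toy feature vectors, and the omission of learned projection matrices are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: one feature vector per SMILES token (hypothetical sequence branch)
    keys, values: one feature vector per atom (hypothetical graph branch)
    Returns (fused, weights): for each token, a weighted mixture of the atom
    values, plus the token-to-atom attention weights.
    """
    d = len(keys[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this token to every atom, scaled by sqrt(feature size).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of atom value vectors, one output channel at a time.
        fused.append([sum(wj * v[c] for wj, v in zip(w, values))
                      for c in range(len(values[0]))])
    return fused, weights


# Toy inputs (made up): 3 SMILES tokens, 2 atoms, feature size 2.
token_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
atom_feats = [[2.0, 0.0], [0.0, 2.0]]

fused, weights = cross_attention(token_feats, atom_feats, atom_feats)
for tok, w in zip(token_feats, weights):
    print(tok, [round(x, 3) for x in w])
```

Each row of `weights` sums to 1 and indicates how strongly a token attends to each atom. Attention weights of this kind are one common starting point for interpretability analyses, although the abstract does not state which method the study used.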