Motif-based Graph Self-Supervised Learning for Molecular Property Prediction

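Computer science, Motif (music), Generative grammar, Artificial intelligence, Machine learning, Molecular graph, Graph, Scalability, Theoretical computer science, Acoustics, Database, Physics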
Authors
Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, Chee‐Kong Lee
Source
Journal: Cornell University - arXiv  Citations: 66
Identifier
DOI:10.48550/arxiv.2110.00987
Abstract

Predicting molecular properties with data-driven methods has drawn much attention in recent years. In particular, Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks. When labeled data is scarce, GNNs can be pre-trained on unlabeled molecular data to learn general semantic and structural information before being fine-tuned for specific tasks. However, most existing self-supervised pre-training frameworks for GNNs focus only on node-level or graph-level tasks. These approaches cannot capture the rich information in subgraphs or graph motifs. For example, functional groups (frequently occurring subgraphs in molecular graphs) often carry indicative information about molecular properties. To bridge this gap, we propose Motif-based Graph Self-supervised Learning (MGSSL), which introduces a novel self-supervised motif generation framework for GNNs. First, for motif extraction from molecular graphs, we design a molecule fragmentation method that leverages the retrosynthesis-based BRICS algorithm and additional rules for controlling the size of the motif vocabulary. Second, we design a general motif-based generative pre-training framework in which GNNs are asked to make topological and label predictions. This generative framework can be implemented in two different ways, i.e., breadth-first or depth-first. Finally, to take the multi-scale information in molecular graphs into consideration, we introduce multi-level self-supervised pre-training. Extensive experiments on various downstream benchmark tasks show that our methods outperform all state-of-the-art baselines.
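
The breadth-first and depth-first variants mentioned in the abstract differ only in the order in which motifs are visited and predicted. The sketch below is an illustration of that ordering on a toy motif tree, not the authors' implementation: the tree, the motif labels, and the function names `bfs_order` and `dfs_order` are all hypothetical, and no GNN or BRICS fragmentation is involved.

```python
from collections import deque

# Hypothetical motif tree for illustration only: keys are motif ids,
# values list the neighbouring motifs in the fragmented molecule.
motif_tree = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0],
    3: [1],
    4: [1],
}
# Hypothetical motif labels (assumed, not taken from the paper).
motif_labels = {0: "ring", 1: "amide", 2: "hydroxyl", 3: "methyl", 4: "carboxyl"}


def bfs_order(tree, root):
    """Return motif ids in breadth-first generation order."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in tree[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


def dfs_order(tree, root):
    """Return motif ids in depth-first generation order."""
    seen = set()
    order = []

    def visit(node):
        seen.add(node)
        order.append(node)
        for neighbour in tree[node]:
            if neighbour not in seen:
                visit(neighbour)

    visit(root)
    return order


if __name__ == "__main__":
    for name, fn in (("breadth-first", bfs_order), ("depth-first", dfs_order)):
        steps = [motif_labels[i] for i in fn(motif_tree, 0)]
        print(f"{name}: {' -> '.join(steps)}")
```

In a generative pre-training setup of the kind the abstract describes, each step in such an order would be a point where the model predicts the topology (whether the current motif has a further child) and the label of the next motif. The breadth-first order completes all neighbours of a motif before moving on, while the depth-first order follows one branch to its end first.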
