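Computer science
Regularization (linguistics)
Property (philosophy)
Data mining
Machine learning
Source code
Dropout (neural networks)
Artificial intelligence
Consistency (knowledge base)
Theoretical computer science
Programming language
Philosophy
Epistemology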
Authors
Dan Zhang,Wenzheng Feng,Yuandong Wang,Zhongang Qi,Ying Shan,Jie Tang
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume/Issue: not listed; pages 1-13
Identifier
DOI: 10.1109/tkde.2023.3290032
Abstract
Recently, molecular data mining has attracted considerable attention owing to its application potential in material and drug discovery. However, this task is hampered by the scarcity of labeled molecular graphs. To overcome this challenge, we introduce a novel data augmentation strategy and a semi-supervised, confidence-aware consistency regularization training framework for molecular property prediction. The core of our framework is a data augmentation strategy on molecular graphs, named DropConn (Dropout Connection). DropConn generates pseudo molecular graphs by softening the hard connections of chemical bonds (as edges), where the soft weights are calculated from edge features so that the adaptive interactions between different atoms can be incorporated. In addition, to enhance the model's generalization ability, a consistency regularization training strategy is proposed to take full advantage of massive unlabeled data. Furthermore, DropConn can serve as a plugin that can be seamlessly added to many existing models. Extensive experiments under both the non-pre-training setting and the fine-tuning setting demonstrate that DropConn can obtain superior performance (up to 8.22%) over state-of-the-art methods on molecular property prediction tasks. The code is available at https://github.com/THUDM/DropConn.
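The two ideas named in the abstract, soft edge weights derived from edge features and a confidence-aware consistency term on unlabeled graphs, can be illustrated with a toy NumPy sketch. This is not the authors' implementation (see the linked repository for that). Every function name, the sigmoid projection, the random edge masking, the sum aggregation, and the confidence threshold are assumptions made purely for illustration.

```python
import numpy as np

def soft_edge_weights(edge_feats, proj, rng, drop_rate=0.2):
    """Assumed form: sigmoid of a linear projection of edge features,
    followed by random masking so each call yields a different pseudo graph."""
    logits = edge_feats @ proj                    # shape: (num_edges,)
    weights = 1.0 / (1.0 + np.exp(-logits))       # soft weights in (0, 1)
    keep = rng.random(weights.shape) >= drop_rate
    return weights * keep                         # masked edges get weight 0

def propagate(node_feats, edges, weights):
    """One round of weighted neighbor-sum aggregation over undirected edges."""
    out = node_feats.copy()
    for (u, v), w in zip(edges, weights):
        out[u] += w * node_feats[v]
        out[v] += w * node_feats[u]
    return out

def predict(node_feats, edges, weights, readout):
    """Mean-pool node states into a graph vector, then apply softmax."""
    graph_repr = propagate(node_feats, edges, weights).mean(axis=0)
    logits = graph_repr @ readout
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def consistency_loss(p1, p2, threshold=0.6):
    """Squared distance of two predictions to their mean, counted only
    when the averaged prediction is confident (hypothetical threshold)."""
    mean = (p1 + p2) / 2.0
    if mean.max() < threshold:
        return 0.0
    return float(((p1 - mean) ** 2).sum() + ((p2 - mean) ** 2).sum())

# Toy unlabeled "molecule": 4 atoms in a chain, 3 bonds with 2 features each.
rng = np.random.default_rng(0)
node_feats = rng.normal(size=(4, 3))
edges = [(0, 1), (1, 2), (2, 3)]
edge_feats = rng.normal(size=(3, 2))
proj = rng.normal(size=2)
readout = rng.normal(size=(3, 2))

# Two augmented views of the same graph, then the consistency term.
w1 = soft_edge_weights(edge_feats, proj, rng)
w2 = soft_edge_weights(edge_feats, proj, rng)
p1 = predict(node_feats, edges, w1, readout)
p2 = predict(node_feats, edges, w2, readout)
loss = consistency_loss(p1, p2)
print("view 1:", p1, "view 2:", p2, "consistency loss:", loss)
```

In a real training loop this term would be added to the supervised loss on labeled graphs, and the projection would be learned rather than drawn at random.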