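Computer science
Leverage (statistics)
Artificial intelligence
Feature learning
Molecular graph
Representation (politics)
Property (philosophy)
Image (mathematics)
Graph
Similarity (geometry)
Pattern recognition (psychology)
Machine learning
Natural language processing
Theoretical computer science
Epistemology
Philosophy
Politics
Law
Political science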
Authors
Hongxin Xiang, Shuting Jin, Xiangrong Liu, Xiangxiang Zeng, Li Zeng
Abstract
Current methods of molecular image-based drug discovery face two major challenges: (1) working effectively in the absence of labels, and (2) capturing chemical structure from images, which encode it only implicitly. Given that molecular graphs explicitly encode chemical structures (such as nitrogen atoms, benzene rings and double bonds), we leverage self-supervised contrastive learning to transfer chemical knowledge from graphs to images. Specifically, we propose a novel Contrastive Graph-Image Pre-training (CGIP) framework for molecular representation learning, which learns explicit information in graphs and implicit information in images from large-scale unlabeled molecules via carefully designed intra- and inter-modal contrastive learning. We evaluate CGIP in multiple experimental settings (molecular property prediction, cross-modal retrieval and distribution similarity). The results show that CGIP achieves state-of-the-art performance on all 12 benchmark datasets and demonstrate that CGIP transfers chemical knowledge from graphs to molecular images, enabling the image encoder to perceive chemical structures in images. We hope this simple and effective framework will inspire people to think about the value of images for molecular representation learning.
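The abstract does not spell out the loss used for inter-modal contrastive learning, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE objective, under the assumption that the graph embedding and image embedding of the same molecule form the positive pair and all other molecules in the batch act as negatives. The function names (`info_nce`, `inter_modal_loss`), the temperature value and the toy embeddings are hypothetical and are not taken from CGIP.

```python
import math


def l2_normalize(vec):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(anchors, candidates, temperature=0.1):
    # Mean cross-entropy in which candidates[i] is the positive for anchors[i]
    # and every other candidate in the batch is treated as a negative.
    anchors = [l2_normalize(a) for a in anchors]
    candidates = [l2_normalize(c) for c in candidates]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [
            sum(x * y for x, y in zip(anchor, cand)) / temperature
            for cand in candidates
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def inter_modal_loss(graph_emb, image_emb, temperature=0.1):
    # Symmetric objective: graph-to-image and image-to-graph directions averaged.
    # This is an assumed formulation, not necessarily the one used by CGIP.
    return 0.5 * (
        info_nce(graph_emb, image_emb, temperature)
        + info_nce(image_emb, graph_emb, temperature)
    )


# Toy 2-D embeddings for two molecules (illustrative values only).
graph_emb = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[0.9, 0.1], [0.1, 0.9]]   # image i resembles graph i
swapped_images = [[0.1, 0.9], [0.9, 0.1]]   # image i resembles the other graph

aligned = inter_modal_loss(graph_emb, aligned_images)
swapped = inter_modal_loss(graph_emb, swapped_images)
print(aligned < swapped)
```

In this toy batch the loss is lower when each image embedding lies close to the graph embedding of the same molecule, which is the behaviour a graph-to-image knowledge transfer objective would reward. In practice the embeddings would come from trained graph and image encoders rather than hand-written lists.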