下部结构
计算机科学
鉴定(生物学)
财产(哲学)
人工智能
聚类分析
一致性(知识库)
可视化
代表(政治)
机器学习
数据挖掘
哲学
植物
结构工程
认识论
工程类
生物
政治
法学
政治学
作者
Gaoqi He,Liu Shun,Zhuoran Liu,Changbo Wang,Kai Zhang,Honglin Li
摘要
Abstract Substructure-based representation learning has emerged as a powerful approach to featurize complex attributed graphs, with promising results in molecular property prediction (MPP). However, existing MPP methods mainly rely on manually defined rules to extract substructures. It remains an open challenge to adaptively identify meaningful substructures from numerous molecular graphs to accommodate MPP tasks. To this end, this paper proposes Prototype-based cOntrastive Substructure IdentificaTion (POSIT), a self-supervised framework to autonomously discover substructural prototypes across graphs so as to guide end-to-end molecular fragmentation. During pre-training, POSIT emphasizes two key aspects of substructure identification: firstly, it imposes a soft connectivity constraint to encourage the generation of topologically meaningful substructures; secondly, it aligns resultant substructures with derived prototypes through a prototype-substructure contrastive clustering objective, ensuring attribute-based similarity within clusters. In the fine-tuning stage, a cross-scale attention mechanism is designed to integrate substructure-level information to enhance molecular representations. The effectiveness of the POSIT framework is demonstrated by experimental results from diverse real-world datasets, covering both classification and regression tasks. Moreover, visualization analysis validates the consistency of chemical priors with identified substructures. The source code is publicly available at https://github.com/VRPharmer/POSIT.
科研通智能强力驱动
Strongly Powered by AbleSci AI