Generalization
Spurious relationship
Pooling
Classifier (UML)
Causal model
Computer science
Salience
Graph
Feature (linguistics)
Artificial intelligence
Causal structure
Machine learning
Mathematics
Data mining
Theoretical computer science
Statistics
Mathematical analysis
Linguistics
Philosophy
Physics
Quantum mechanics
Authors
Yongduo Sui, Wenyu Mao, Shuyao Wang, Xiang Wang, Jiancan Wu, Xiangnan He, Tat-Seng Chua
Source
Journal: ACM Transactions on Knowledge Discovery from Data
[Association for Computing Machinery]
Date: 2024-02-06
Volume/Issue: 18(5): 1-24
Abstract
In graph classification, attention- and pooling-based graph neural networks (GNNs) predominate to extract salient features from the input graph and support the prediction. They mostly follow the paradigm of "learning to attend," which maximizes the mutual information between the attended graph and the ground-truth label. However, this paradigm causes GNN classifiers to indiscriminately absorb all statistical correlations between input features and labels in the training data without distinguishing the causal and noncausal effects of features. Rather than emphasizing causal features, the attended graphs tend to rely on noncausal features as shortcuts to prediction. These shortcut features may easily change outside the training distribution, thereby leading to poor generalization for GNN classifiers. In this article, we take a causal view on GNN modeling. Under our causal assumption, the shortcut feature serves as a confounder between the causal feature and prediction. It misleads the classifier into learning spurious correlations that facilitate prediction in in-distribution (ID) test evaluation while causing a significant performance drop on out-of-distribution (OOD) test data. To address this issue, we employ the backdoor adjustment from causal theory: combining each causal feature with various shortcut features to identify causal patterns and mitigate the confounding effect. Specifically, we employ attention modules to estimate the causal and shortcut features of the input graph. Then, a memory bank collects the estimated shortcut features, enhancing the diversity of shortcut features for combination. Simultaneously, we apply the prototype strategy to improve the consistency of intra-class causal features. We term our method CAL+, which can promote stable relationships between causal estimation and prediction, regardless of distribution changes. Extensive experiments on synthetic and real-world OOD benchmarks demonstrate our method's effectiveness in improving OOD generalization. Our code is released at https://github.com/shuyao-wang/CAL-plus.
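To make the training recipe described in the abstract concrete, below is a minimal PyTorch sketch of one CAL+-style training step. It is an illustration under several assumptions, not the released implementation (see https://github.com/shuyao-wang/CAL-plus): graphs are assumed to be already encoded into pooled embeddings h_graph, the attention split is a simple sigmoid gate, and the names CausalShortcutSplit and training_step, the uniform-prediction loss on shortcut features, and the loss weights are all hypothetical choices.

```python
# Illustrative sketch only; module names, losses, and weights are assumptions,
# not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalShortcutSplit(nn.Module):
    """Attention module that softly splits a pooled graph embedding into an
    estimated causal part and an estimated shortcut part (illustrative)."""

    def __init__(self, dim: int, num_classes: int, bank_size: int = 256):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim), nn.Sigmoid()
        )
        self.causal_head = nn.Linear(dim, num_classes)    # predicts from the causal part
        self.shortcut_head = nn.Linear(dim, num_classes)  # probes the shortcut part
        self.mix_head = nn.Linear(dim, num_classes)       # predicts from the intervened feature
        # Memory bank collecting recently estimated shortcut features, so each
        # causal feature can later be paired with diverse shortcuts.
        self.register_buffer("bank", torch.zeros(bank_size, dim))
        self.register_buffer("bank_ptr", torch.zeros(1, dtype=torch.long))

    def forward(self, h: torch.Tensor):
        a = self.gate(h)                        # elementwise attention in [0, 1]
        return a * h, (1.0 - a) * h             # causal estimate, shortcut estimate

    @torch.no_grad()
    def update_bank(self, h_shortcut: torch.Tensor) -> None:
        n, ptr = h_shortcut.size(0), int(self.bank_ptr)
        idx = (torch.arange(n, device=h_shortcut.device) + ptr) % self.bank.size(0)
        self.bank[idx] = h_shortcut.detach()
        self.bank_ptr[0] = (ptr + n) % self.bank.size(0)


def training_step(model, h_graph, labels, prototypes, lam=0.5):
    """One illustrative step: causal supervision, shortcut debiasing, a
    backdoor-adjustment-style intervention, and prototype consistency."""
    h_c, h_s = model(h_graph)

    # (1) The causal estimate alone should predict the label.
    loss_causal = F.cross_entropy(model.causal_head(h_c), labels)

    # (2) The shortcut estimate should be uninformative: push its class
    #     distribution toward uniform.
    logits_s = model.shortcut_head(h_s)
    uniform = torch.full_like(logits_s, 1.0 / logits_s.size(1))
    loss_shortcut = F.kl_div(F.log_softmax(logits_s, dim=1), uniform, reduction="batchmean")

    # (3) Intervention: pair each causal estimate with a random shortcut drawn
    #     from the memory bank and still demand the correct label.
    sampled = model.bank[torch.randint(0, model.bank.size(0), (h_c.size(0),), device=h_c.device)]
    loss_interv = F.cross_entropy(model.mix_head(h_c + sampled), labels)

    # (4) Prototype consistency: pull causal estimates toward their class prototype.
    loss_proto = F.mse_loss(h_c, prototypes[labels])

    model.update_bank(h_s)
    return loss_causal + loss_shortcut + lam * (loss_interv + loss_proto)


# Toy usage with random pooled embeddings standing in for a GNN encoder's output.
if __name__ == "__main__":
    dim, num_classes, batch = 64, 3, 8
    model = CausalShortcutSplit(dim, num_classes)
    prototypes = torch.randn(num_classes, dim)
    h_graph = torch.randn(batch, dim)
    labels = torch.randint(0, num_classes, (batch,))
    loss = training_step(model, h_graph, labels, prototypes)
    loss.backward()
    print(float(loss))
```

In this reading, the memory bank is what enables the backdoor-adjustment-style step: each causal estimate is combined with shortcut features collected from other graphs, so the classifier cannot rely on any single shortcut to reach the label.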