Generalizing Graph Neural Networks on Out-of-Distribution Graphs

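Keywords: Spurious relationship; Computer science; Artificial intelligence; Pooling; Machine learning; Exploit; Causal inference; Inference; Graph; Causal model; Hyperparameter; Theoretical computer science; Mathematics; Econometrics; Computer security; Statistics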

Authors

Shaohua Fan, Xiao Wang, Chuan Shi, Peng Cui, Bai Wang

Source

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Date: 2024-01-01; Volume (issue): 46 (1): 322-337; Citations: 11

Links

nih.gov, doi.org

Identifiers

DOI: 10.1109/tpami.2023.3321097

Abstract

Graph Neural Networks (GNNs) are usually proposed without considering the agnostic distribution shifts between training graphs and testing graphs, which degrades their generalization ability in Out-Of-Distribution (OOD) settings. The fundamental reason for this degradation is that most GNNs are developed under the I.I.D. hypothesis. In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for prediction, even when those correlations are spurious. This learning mechanism is inherited from the common characteristics of machine learning approaches. However, such spurious correlations may change in real-world testing environments, causing GNNs to fail. Therefore, eliminating the impact of spurious correlations is crucial for stable GNN models. To this end, we argue that spurious correlations exist among subgraph-level units and analyze the degeneration of GNNs from a causal view. Based on this analysis, we propose a general causal representation framework for stable GNNs, called StableGNN. The main idea of the framework is to first extract high-level representations from raw graph data and then rely on the distinguishing ability of causal inference to help the model get rid of spurious correlations. In particular, to extract meaningful high-level representations, we use a differentiable graph pooling layer to extract subgraph-based representations in an end-to-end manner. Furthermore, inspired by confounder balancing techniques from causal inference, we propose a causal variable distinguishing regularizer that, based on the learned high-level representations, corrects the biased training distribution by learning a set of sample weights. As a result, GNNs concentrate more on the true connection between discriminative substructures and labels. Extensive experiments are conducted on synthetic datasets with various degrees of distribution shift and on eight real-world OOD graph datasets. The results verify that StableGNN not only outperforms state-of-the-art methods but also provides a flexible framework for enhancing existing GNNs. In addition, the interpretability experiments validate that StableGNN can leverage causal structures for prediction.
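
The idea of correcting a biased training distribution by learning sample weights can be illustrated with a small NumPy sketch. This is not the StableGNN regularizer itself: as a simplified stand-in, it assumes the high-level representations are already available as a matrix `H` (one row per graph) and learns weights that shrink the off-diagonal entries of the weighted covariance between representation dimensions. The function names, the softmax parameterization of the weights, and the step size are all assumptions made for this illustration.

```python
import numpy as np


def weighted_cov(H, p):
    """Covariance of the columns of H under sample probabilities p (sum to 1)."""
    mu = p @ H
    Hc = H - mu
    return (Hc * p[:, None]).T @ Hc


def decorrelation_loss(H, p):
    """Sum of squared off-diagonal entries of the weighted covariance."""
    C = weighted_cov(H, p)
    off = C - np.diag(np.diag(C))
    return float(np.sum(off ** 2))


def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()


def learn_sample_weights(H, steps=300, lr=0.05):
    """Gradient descent on softmax logits; returns weights with mean 1.

    Hypothetical helper for illustration only. A practical method would
    also regularize the weights so they do not collapse onto a few samples.
    """
    n = H.shape[0]
    theta = np.zeros(n)  # uniform weights at the start
    for _ in range(steps):
        p = softmax(theta)
        mu = p @ H
        C = weighted_cov(H, p)
        G = 2.0 * (C - np.diag(np.diag(C)))  # dLoss/dC on off-diagonal entries
        # dLoss/dp_i = h_i^T G h_i - 2 h_i^T G mu
        g = np.einsum("ia,ab,ib->i", H, G, H) - 2.0 * (H @ G @ mu)
        # Chain rule through the softmax, rescaled by n so the step size
        # does not shrink as the number of samples grows.
        theta -= lr * n * p * (g - p @ g)
    return n * softmax(theta)


# Synthetic "representations": two dimensions that are correlated
# in the training sample.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 0.6 * rng.normal(size=200)
H = np.column_stack([x, y])

uniform = np.full(len(H), 1.0 / len(H))
loss_before = decorrelation_loss(H, uniform)
w = learn_sample_weights(H)
loss_after = decorrelation_loss(H, w / w.sum())
print(loss_before, loss_after)
```

In a full training pipeline, weights of this kind would multiply the per-sample prediction loss so that the model is fitted on a reweighted sample in which the representation dimensions are less correlated with each other.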
