计算机科学
编码
图形
理论计算机科学
聚类分析
特征学习
节点(物理)
聚类系数
数据挖掘
人工智能
生物化学
结构工程
基因
工程类
化学
作者
Costas Mavromatis,George Karypis
出处
期刊:Cornell University - arXiv
日期:2020-01-01
被引量:14
标识
DOI:10.48550/arxiv.2009.06946
摘要
Unsupervised (or self-supervised) graph representation learning is essential to facilitate various graph data mining tasks when external supervision is unavailable. The challenge is to encode the information about the graph structure and the attributes associated with the nodes and edges into a low dimensional space. Most existing unsupervised methods promote similar representations across nodes that are topologically close. Recently, it was shown that leveraging additional graph-level information, e.g., information that is shared among all nodes, encourages the representations to be mindful of the global properties of the graph, which greatly improves their quality. However, in most graphs, there is significantly more structure that can be captured, e.g., nodes tend to belong to (multiple) clusters that represent structurally similar nodes. Motivated by this observation, we propose a graph representation learning method called Graph InfoClust (GIC), that seeks to additionally capture cluster-level information content. These clusters are computed by a differentiable K-means method and are jointly optimized by maximizing the mutual information between nodes of the same clusters. This optimization leads the node representations to capture richer information and nodal interactions, which improves their quality. Experiments show that GIC outperforms state-of-art methods in various downstream tasks (node classification, link prediction, and node clustering) with a 0.9% to 6.1% gain over the best competing approach, on average.
科研通智能强力驱动
Strongly Powered by AbleSci AI