计算机科学
异常检测
图形
知识图
估计
人工智能
情报检索
数据挖掘
机器学习
理论计算机科学
管理
经济
作者
Yunfeng Zhou,Cui Zhu,Wenjun Zhu
标识
DOI:10.1016/j.ipm.2024.103705
摘要
Knowledge graphs (KGs) have found extensive applications within intelligent systems, such as information retrieval. Much of the research has predominantly focused on completing missing knowledge, with little consideration given to examining errors. Unfortunately, during customizing KGs, diverse unpredictable errors are virtually unavoidable to be introduced, and these anomalies significantly impact the performance of applications. Detecting erroneous knowledge presents a formidable challenge due to the costly acquisition of ground-truth labels. In this work, we develop an unsupervised anomaly detection framework named ProMvSD, aiming to adapt KGs of varying scales via serialization components. To overcome the insufficient contextual information provided by the topological structure, we introduce the large language model as a reasoner to extract prior knowledge from extensive pre-trained textual data, thereby enhancing the understanding of KGs. Anomalous triple may result in a larger semantic gap between the head and tail neighborhoods. To uncover latent anomalies effectively, we propose a multi-view semantic-driven model (MvSD) based on the assumptions of self-consistency and information stability. MvSD jointly estimates the suspiciousness of triples from three hyperviews: node-view semantic contradiction, triple-view semantic gap, and pathway-view semantic gap. Extensive experiments on three English benchmark KGs and a Chinese medical KG demonstrate that, for the top 1% of the most suspicious triples, we can detect real anomalies with at most 99.9% accuracy. Furthermore, ProMvSD significantly outperforms state-of-the-art representation learning baselines, achieving a 29.2% improvement in detecting all anomalies.
科研通智能强力驱动
Strongly Powered by AbleSci AI