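Embedding
Computer science
Robustness (evolution)
Leverage (statistics)
Artificial intelligence
Theoretical computer science
Knowledge graph
Data mining
Machine learning
Biochemistry
Gene
Chemistry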
Authors
Cheikh Brahim El Vaigh, François Torregrossa, Robin Allesiardo, Guillaume Gravier, Pascale Sébillot
Identifier
DOI:10.1109/ictai50040.2020.00148
Abstract
Entity alignment is a crucial tool in knowledge discovery to reconcile knowledge from different sources. Recent state-of-the-art approaches leverage joint embedding of knowledge graphs (KGs) so that similar entities from different KGs are close in the embedded space. Whatever the joint embedding technique used, a seed set of aligned entities, often provided by (time-consuming) human expertise, is required to learn the joint KG embedding and/or a mapping between KG embeddings. In this context, a key issue is to limit the size and quality requirement for the seed. State-of-the-art methods usually learn the embedding by explicitly minimizing the distance between aligned entities from the seed and uniformly maximizing the distance for entities not in the seed. In contrast, we design a less restrictive optimization criterion that indirectly minimizes the distance between aligned entities in the seed by globally maximizing the dimension-wise correlation among all the embeddings of seed entities. Within an iterative entity alignment system, the correlation-based entity embedding function achieves state-of-the-art results and is shown to significantly increase robustness to the seed's size and accuracy. It ultimately enables fully unsupervised entity alignment using a seed automatically generated with a symbolic alignment method based on entities' names.
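
The contrast the abstract draws between distance-based and correlation-based criteria can be illustrated with a small sketch. This is not the authors' loss function: it assumes a plain Pearson correlation computed independently for each embedding dimension over the seed pairs and then averaged, and the names `dimensionwise_correlation`, `correlation_loss`, `kg1`, `kg2_aligned`, and `kg2_random` are hypothetical.

```python
import numpy as np

def dimensionwise_correlation(x, y, eps=1e-12):
    """Mean per-dimension Pearson correlation between two embedding matrices.

    x, y: arrays of shape (n_seed_pairs, dim); row i of x is assumed to be
    the seed entity aligned with row i of y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each dimension over the seed entities.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Per-dimension covariance and normalization (eps avoids division by zero).
    cov = (xc * yc).sum(axis=0)
    norm = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum(axis=0)) + eps
    return float((cov / norm).mean())

def correlation_loss(x, y):
    """Loss to minimize: the negated mean correlation (assumed form)."""
    return -dimensionwise_correlation(x, y)

# Toy data: random vectors standing in for seed-entity embeddings.
rng = np.random.default_rng(0)
kg1 = rng.normal(size=(50, 8))
kg2_aligned = 2.0 * kg1 + 1.0           # far from kg1 in distance, yet perfectly correlated
kg2_random = rng.normal(size=(50, 8))   # unrelated embeddings

print(correlation_loss(kg1, kg2_aligned))
print(correlation_loss(kg1, kg2_random))
```

In this toy setup the rescaled and shifted copy attains the lowest possible loss even though its Euclidean distance to `kg1` is large, which shows one sense in which a correlation criterion constrains the embeddings less than direct distance minimization.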