Computer science
Transformer
Knowledge graph
Artificial intelligence
Graph
Fusion
Natural language processing
Information retrieval
Theoretical computer science
Machine learning
Linguistics
Philosophy
Physics
Quantum mechanics
Voltage
Authors
Jingchao Wang, X Liu, Weimin Li, Fangfang Liu, Xing Wu, Qun Jin
Identifier
DOI: 10.1109/mis.2024.3378921
Abstract
Multimodal knowledge graphs (MKGs) organize multimodal facts in the form of entities and relations, and have been successfully applied to several downstream tasks. Since most MKGs are incomplete, the MKG completion (MKGC) task has been proposed to address this problem; it aims to complete missing entities in MKGs. Most previous works obtain reasoning ability by capturing the correlation between target triplets and related images, but they ignore contextual semantic information, and their reasoning process is not easily explainable. To address these issues, we propose a novel text-enhanced transformer fusion network called TE-TFN, which converts the context path between head and tail entities into natural language text and fuses multimodal features at both coarse and fine granularities through a multi-granularity fuser. This not only effectively enhances the textual semantic information, but also improves the interpretability of the model by making the reasoning paths explicit. Experimental results on benchmark datasets demonstrate the effectiveness of our model.
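To make the two ideas in the abstract more concrete (verbalizing a context path into text, then fusing text and image features at coarse and fine granularities), here is a minimal PyTorch sketch. All names, dimensions, and the exact fusion scheme (mean pooling for the coarse level, cross-attention for the fine level) are illustrative assumptions, not the paper's actual TE-TFN implementation.

```python
# Minimal sketch of path verbalization and multi-granularity fusion.
# The module names, dimensions, and fusion scheme are assumptions for
# illustration; the paper's TE-TFN architecture may differ.
import torch
import torch.nn as nn


def path_to_text(path):
    """Verbalize a context path (a list of (entity, relation) hops plus the
    tail entity) into a natural-language-like string, e.g.
    [("Paris", "capital_of"), ("France",)] -> "Paris capital of France"."""
    tokens = []
    for hop in path:
        tokens.extend(str(x).replace("_", " ") for x in hop)
    return " ".join(tokens)


class MultiGranularityFuser(nn.Module):
    """Fuses text and image features at two granularities:
    - coarse: pooled (global) text and image vectors are concatenated and projected;
    - fine: token-level text features attend to image patch features."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.coarse_proj = nn.Linear(2 * dim, dim)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out_proj = nn.Linear(2 * dim, dim)

    def forward(self, text_tokens, image_patches):
        # text_tokens:   (B, Lt, D) token-level features of the verbalized path
        # image_patches: (B, Li, D) patch-level image features
        coarse = self.coarse_proj(
            torch.cat([text_tokens.mean(dim=1), image_patches.mean(dim=1)], dim=-1)
        )                                        # (B, D) global fusion
        fine, _ = self.cross_attn(text_tokens, image_patches, image_patches)
        fine = fine.mean(dim=1)                  # (B, D) token-to-patch fusion
        return self.out_proj(torch.cat([coarse, fine], dim=-1))  # (B, D) fused entity representation


if __name__ == "__main__":
    fuser = MultiGranularityFuser(dim=256)
    text_feats = torch.randn(2, 12, 256)   # e.g. encoded verbalized path
    image_feats = torch.randn(2, 49, 256)  # e.g. 7x7 image patch features
    print(path_to_text([("Paris", "capital_of"), ("France",)]))
    print(fuser(text_feats, image_feats).shape)  # torch.Size([2, 256])
```

In this sketch the fused vector would be scored against candidate entity embeddings to complete the missing entity; that scoring step is omitted, since the abstract does not specify it.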