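Computer science
Transformer
Fusion
Modality (therapy)
Artificial intelligence
Psychology
Engineering
Linguistics
Psychotherapist
Electrical engineering
Philosophy
Voltage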
Authors
Xiaoming Zhang, Meng Kaikai, Huiyong Wang
Identifier
DOI:10.1142/s021819402450013x
Abstract
Multimodal entity linking aims to link mentions to their target entities in a multimodal knowledge graph. Current multimodal entity linking methods mainly focus on the global fusion of text and image, and seldom fully explore the correlation between modalities. To improve the fusion of multimodal features, we propose a multimodal entity linking model based on a Multimodal Co-Attention Fusion strategy. This strategy enables text and image to guide each other during feature extraction, so that the correlation between modalities is fully explored and fine-grained feature fusion improves. Furthermore, we design a Transformer-based candidate entity generation strategy, which combines multiple candidate entity sets and adjusts the candidate entity ranking to obtain high-quality candidate entity sets. We perform experiments on domain datasets and public datasets, and the results demonstrate that our model performs well in candidate entity generation and multimodal feature fusion, outperforming the state-of-the-art baseline models.
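The idea of text and image guiding each other can be illustrated with a minimal co-attention sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function names (cross_attend, co_attention_fuse), the feature shapes, the absence of learned projections, and the mean-pool-and-concatenate fusion step are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    # Scaled dot-product attention without learned projections (assumption):
    # each query row forms a weighted sum over the rows of keys_values.
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # shape (n_q, n_kv)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ keys_values, weights           # shape (n_q, d)

def co_attention_fuse(text_feats, image_feats):
    # Text tokens select image regions (text-guided image features) ...
    image_by_text, _ = cross_attend(text_feats, image_feats)
    # ... and image regions select text tokens (image-guided text features).
    text_by_image, _ = cross_attend(image_feats, text_feats)
    # Hypothetical fusion step: mean-pool each side, then concatenate.
    return np.concatenate([image_by_text.mean(axis=0),
                           text_by_image.mean(axis=0)])

# Toy inputs: 5 text tokens and 3 image regions, each an 8-dim feature vector.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(5, 8))
image_feats = rng.normal(size=(3, 8))

fused = co_attention_fuse(text_feats, image_feats)
_, attn = cross_attend(text_feats, image_feats)
print(fused.shape)  # (16,)
```

In a trained model the queries, keys, and values would normally pass through learned linear projections, and the fused vector would be scored against candidate entity representations; both are omitted here to keep the sketch small.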