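Modal verb
Computer science
Graph
Benchmark (surveying)
Artificial intelligence
Process (computing)
Algorithm
Theoretical computer science
Chemistry
Operating system
Geography
Geodesy
Polymer chemistry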
Authors
Zihao Zheng,Tao He,Ming Liu,Zhongyuan Wang,Ruiji Fu,Bing Qin
Identifier
DOI:10.1109/icassp48485.2024.10448507
Abstract
Multi-modal relation extraction (MRE) requires the integration of multi-modal information to identify relationships between entities. Although fine-grained correlations between visual objects and textual words have the potential to improve cross-modal interaction, they are typically modeled implicitly and hindered by the modality gap. This paper introduces a novel method called relational Graph-Bridged cross-modal InTeraction (GBIT). GBIT aims to incorporate fine-grained cross-modal correlations into the interaction process explicitly. This is achieved by constructing a fine-grained cross-modal relational graph, which acts as a bridge for effective cross-modal interaction across multiple layers. Within GBIT, a gated interaction strategy and an adaptive integration module are proposed for irrelevance-filtered information exchange and final information collation. Extensive experiments on the MRE benchmark demonstrate the superiority of the proposed method.
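The idea of gated, graph-bridged information exchange can be illustrated with a minimal sketch. It is not the GBIT formulation, which the abstract does not specify. It assumes a hypothetical setup in which each word is linked to some visual objects by graph edges, the message to a word is the mean of its linked object vectors, and a scalar sigmoid gate on the word-message dot product decides how much of that message is added. The names `gated_cross_modal_step`, `words`, `objects`, and `edges` are illustrative only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_cross_modal_step(words, objects, edges):
    """One illustrative round of object-to-word information exchange.

    words:   list of word feature vectors (lists of floats)
    objects: list of visual-object feature vectors, same dimension
    edges:   dict mapping a word index to the object indices it is linked to
             (a stand-in for a cross-modal relational graph)
    """
    dim = len(words[0])
    updated = []
    for i, word in enumerate(words):
        linked = edges.get(i, [])
        if not linked:
            # No cross-modal edge: the word representation is left unchanged.
            updated.append(list(word))
            continue
        # Message: mean of the linked object vectors.
        msg = [sum(objects[j][d] for j in linked) / len(linked) for d in range(dim)]
        # Scalar gate: close to 0 when the message disagrees with the word,
        # so weakly related visual information is largely filtered out.
        gate = sigmoid(sum(a * b for a, b in zip(word, msg)))
        updated.append([a + gate * m for a, m in zip(word, msg)])
    return updated
```

In this sketch, a word whose linked objects point in a dissimilar direction receives a gate near zero and stays almost unchanged, which is one simple way to read "irrelevance-filtered information exchange". A learned gate with trainable parameters would replace the fixed dot product in a real model.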