Topics
Schema therapy
Computer science
Metaphor
Information fusion
Artificial intelligence
Fusion
Multimodal interaction
Natural language processing
Information retrieval
Pattern recognition (psychology)
Human–computer interaction
Psychology
Linguistics
Psychotherapist
Philosophy
Authors
Xiaoyu He,Long Yu,Shengwei Tian,Qimeng Yang,Jun Long,Bo Wang
Identifier
DOI:10.1016/j.ipm.2024.103652
Abstract
In this paper, we study multimodal metaphor detection to obtain real semantic meaning from multiple heterogeneous information sources. The existing approaches mainly suffer from two drawbacks. (1) They focus on textual aspects, overlooking the characteristics of visual metaphor information. (2) Efficient methods for fusing multimodal metaphor features are lacking. To address the first issue, we propose a visual information enhancement method based on dual-granularity visual feature fusion, obtaining complete metaphorical visual features. To achieve bidirectional interaction among multimodal metaphor features, we further develop a multi-interactive crossmodal residual network (MCRN) that fuses the consistent and complementary information between different modalities, and we design a progressive fusion strategy to enhance the iterative fusion ability of the model. We extensively evaluate the proposed method on the popular Met-meme metaphor detection benchmark, outperforming the existing state-of-the-art methods by a large margin; i.e., we achieve F1 score improvements ranging from 1.47% to 2.55% across different languages. In addition, we extend the evaluation to the Sarcasm dataset to validate the ability of the model to perceive semantic contrasts and meaning transformations, and the experimental results are superior to those of a strong baseline model.
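The abstract describes bidirectional interaction between text and image features with residual connections, iterated under a progressive fusion strategy. The actual MCRN architecture is not given here; as a rough, hypothetical illustration of that idea (all function names, dimensions, and the pooling choice are assumptions, not the authors' design), one might sketch:

```python
import numpy as np

def cross_attend(q, kv):
    """Scaled dot-product attention: each row of q attends over the rows of kv."""
    scores = q @ kv.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ kv

def progressive_crossmodal_fusion(text_feats, image_feats, steps=3):
    """Iteratively fuse two modalities with residual cross-attention.

    Each step enriches the text features with visual context and the
    image features with textual context, keeping residual shortcuts so
    modality-specific information is preserved across iterations.
    """
    t, v = text_feats, image_feats
    for _ in range(steps):
        t = t + cross_attend(t, v)  # text attends to image (residual)
        v = v + cross_attend(v, t)  # image attends to updated text (residual)
    # Pool each modality and concatenate into one joint representation.
    return np.concatenate([t.mean(axis=0), v.mean(axis=0)])

# Toy usage: 5 text-token features and 9 image-patch features, dim 64.
rng = np.random.default_rng(0)
joint = progressive_crossmodal_fusion(rng.random((5, 64)), rng.random((9, 64)))
print(joint.shape)  # (128,)
```

The residual form (`x + cross_attend(...)`) and the repeated loop are only meant to convey the "multi-interactive" and "progressive" flavor of the method; the paper's network would use learned projections and task heads rather than raw dot-product attention.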