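Sarcasm
Computer science
Artificial intelligence
Graph isomorphism
Isomorphism (crystallography)
Natural language processing
Graph
Theoretical computer science
Linguistics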
Philosophy
Line graph
Crystallography
Crystal structure
Chemistry
Source
Journal: The Electronic Library
[Emerald (MCB UP)]
Date: 2025-02-13
Identifier
DOI:10.1108/el-07-2024-0198
Abstract
Purpose: Previous research mainly applies graph neural networks to syntactic dependency graphs, often neglecting emotional cues in sarcasm detection and failing to integrate image features effectively as multimodal information. To address these limitations, this study proposes a novel multimodal sarcasm detection model based on the directed graph isomorphism network with sentiment enhancement and multimodal fusion (DGIN-SE-MF).

Design/methodology/approach: The approach extracts image and text features with a vision transformer and BERT, respectively. To deeply integrate the extracted features, a text-guided multi-head attention fusion module is developed. Subsequently, a directed graph is constructed through sentiment enhancement (SE), and the multimodal factorized bilinear pooling method is used to integrate image features into the graph. The DGIN then fuses the image and text features, using a weighted attention mechanism to generate the final representation.

Findings: The model is validated on three datasets: an English dataset, a Chinese dataset and an Indonesian–English dataset. The results demonstrate that the proposed model consistently outperforms the baseline models, particularly on the Chinese and English sarcasm datasets, where it achieves F1 scores of 88.75% and 83.10%, respectively.

Originality/value: The proposed model addresses the inadequacies of previous methods by effectively integrating emotional cues and image features into sarcasm detection. To the best of the authors’ knowledge, this is the first work to leverage a DGIN-SE-MF for this task, leading to significant improvements in detection performance across different languages.
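The abstract does not give the DGIN update rule, so the following is only a minimal sketch of the standard graph isomorphism network (GIN) sum aggregation, restricted to incoming edges to reflect a directed graph. The function name `gin_directed_step`, the `eps` parameter, the toy features and the use of a plain ReLU in place of the learned MLP are all assumptions made for illustration, not the paper's implementation.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def gin_directed_step(
    features: Dict[int, Vector],
    edges: List[Tuple[int, int]],
    eps: float = 0.0,
) -> Dict[int, Vector]:
    """One GIN-style sum aggregation over a directed graph (hypothetical helper).

    Each node v receives (1 + eps) * h_v plus the sum of h_u over all
    directed edges u -> v. The learned MLP that normally follows is
    replaced by an element-wise ReLU to keep the sketch dependency-free.
    """
    # Start from the scaled self term for every node.
    out = {v: [(1.0 + eps) * x for x in h] for v, h in features.items()}
    # Add each source node's original features to its target node.
    for src, dst in edges:
        out[dst] = [a + b for a, b in zip(out[dst], features[src])]
    # Stand-in for the MLP: element-wise ReLU.
    return {v: [max(0.0, x) for x in h] for v, h in out.items()}


# Toy graph (assumed data): three nodes with 2-d features, edges 0 -> 1 and 2 -> 1.
features = {0: [1.0, -2.0], 1: [0.5, 0.5], 2: [-1.0, 3.0]}
edges = [(0, 1), (2, 1)]
updated = gin_directed_step(features, edges)
```

In this toy graph only node 1 has incoming edges, so it is the only node whose representation mixes in its neighbours' features; nodes 0 and 2 keep their own features, passed through the ReLU.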