Exploring Semantic Relations for Social Media Sentiment Analysis
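Computer science
Sentiment analysis
Image (mathematics)
Noun
Adjective
Artificial intelligence
Information retrieval
Natural language processing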
Authors
Jiandian Zeng,Jiantao Zhou,Caishi Huang
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing [Institute of Electrical and Electronics Engineers] Date: 2023-01-01 Volume: 31, pp. 2382-2394 Citations: 8
Identifier
DOI:10.1109/taslp.2023.3285238
Abstract
With the massive social media data available online, the conventional single modality emotion classification has developed into more complex models of multimodal sentiment analysis. Most existing works simply extracted image features at a coarse level, resulting in the absence of partially detailed visual features. Besides, social media data usually contain multiple images, while existing works considered a single image case and used only one image for representing visual features. In fact, it is nontrivial to extend the single image case to the multiple images case, due to the complex relations among multiple images. To solve the above issues, in this paper, we propose a Gated Fusion Semantic Relation (GFSR) network to explore semantic relations for social media sentiment analysis. In addition to inter-relations between visual and textual modalities, we also exploit intra-relations among multiple images, potentially improving the sentiment analysis performance. Specifically, we design a gated fusion network to fuse global image embeddings and the corresponding local Adjective Noun Pair (ANP) embeddings. Then, apart from textual relations and cross-modal relations, we employ the multi-head cross attention mechanism between images and ANPs to capture similar semantic contents. Eventually, the updated textual and visual representations are concatenated for the final sentiment prediction. Extensive experiments are conducted on real-world Yelp and Flickr30k datasets, showing that our GFSR can improve about 0.10% to 3.66% in terms of accuracy on the Yelp dataset with multiple images, and achieve the best accuracy for two classes and the best macro F1 for three classes on the Flickr30k dataset with a single image.
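The abstract does not give the equations of the gated fusion network, so the following is only a minimal sketch of one common form of gated fusion, not the authors' GFSR implementation. It assumes a sigmoid gate computed from the concatenated global image embedding and ANP embedding, followed by an element-wise convex blend of the two. The function names, the weight layout, and the toy values are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used to squash gate pre-activations into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(global_emb, anp_emb, weight, bias):
    """Blend two embeddings of equal length d with an element-wise gate.

    Assumed (hypothetical) parameter layout:
      global_emb, anp_emb: lists of d floats
      weight: d rows, each of length 2*d (applied to the concatenation)
      bias: list of d floats
    """
    concat = global_emb + anp_emb  # list concatenation, length 2*d
    fused = []
    for j in range(len(global_emb)):
        pre_activation = sum(w * x for w, x in zip(weight[j], concat)) + bias[j]
        gate = sigmoid(pre_activation)
        # A gate near 1 keeps the global feature; a gate near 0 keeps the ANP feature.
        fused.append(gate * global_emb[j] + (1.0 - gate) * anp_emb[j])
    return fused


# Toy inputs: with zero weights and zero bias every gate equals 0.5,
# so the fused vector is the plain average of the two embeddings.
d = 2
global_emb = [1.0, 0.0]
anp_emb = [0.0, 1.0]
zero_weight = [[0.0] * (2 * d) for _ in range(d)]
zero_bias = [0.0] * d
fused = gated_fusion(global_emb, anp_emb, zero_weight, zero_bias)
```

In a trained model the weights and bias would be learned, so the gate would decide per dimension how much coarse global content versus local ANP content passes through. The multi-head cross attention step mentioned in the abstract is not covered by this sketch.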