Multimodal Graph Contrastive Learning for Multimedia-Based Recommendation

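Computer science, Graph, Information retrieval, Preference learning, Recommender system, Preference, Artificial intelligence, Natural language processing, Multimedia, Human-computer interaction, Machine learning, Theoretical computer science, Economics, Microeconomics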

Authors

Kang Liu, Feng Xue, Dan Guo, Peijie Sun, Shengsheng Qian, Richang Hong

Source

Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Date: 2023-01-01 Volume/Issue: 25: 9343-9355 Citations: 62

Identifier

DOI: 10.1109/tmm.2023.3251108

Abstract

Multimedia-based recommendation is a challenging task that requires not only learning collaborative signals from user-item interactions, but also capturing modality-specific user interest clues from complex multimedia content. Although significant progress has been made on this challenge, we argue that current solutions remain limited by multimodal noise contamination. Specifically, a considerable proportion of multimedia content is irrelevant to user preference, such as the background, overall layout, and brightness of images, or the word order and semantic-free words in titles. We regard this irrelevant information as noise that contaminates the discovery of user preferences. Moreover, most recent research relies on graph learning, which means that noise is diffused into the user and item representations through message propagation, further amplifying the contamination. To tackle this problem, we develop a novel framework named Multimodal Graph Contrastive Learning (MGCL), which captures collaborative signals from interactions and uses the visual and textual modalities, respectively, to extract modality-specific user preference clues. The key idea of MGCL involves two aspects. First, to alleviate noise contamination during graph learning, we construct three parallel graph convolution networks that independently generate three types of user and item representations, containing collaborative signals, visual preference clues, and textual preference clues. Second, to eliminate as much preference-independent noisy information as possible from the generated representations, we incorporate sufficient self-supervised signals into the model optimization with the help of contrastive learning, thus enhancing the expressiveness of the user and item representations. Note that MGCL is not limited to the graph learning schema; it can also be applied to most matrix factorization methods. We conduct extensive experiments on three public datasets to validate the effectiveness and scalability of MGCL. The code of MGCL is released at https://github.com/hfutmars/MGCL.
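
The two ideas in the abstract (parallel graph convolutions per signal type, plus a contrastive objective across the resulting representations) can be illustrated with a minimal sketch. This is not the authors' released implementation: the abstract does not specify the propagation rule or the loss, so the sketch assumes a parameter-free, LightGCN-like propagation with symmetric degree normalization and an InfoNCE-style loss with cosine similarity. The function names `propagate` and `info_nce`, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def propagate(user_emb, item_emb, interactions, num_layers=2):
    """Assumed LightGCN-like propagation over the user-item bipartite graph.

    Each layer replaces a node's embedding with the degree-normalized sum of
    its neighbors' embeddings; the output averages layers 0..num_layers.
    """
    deg_u = [0] * len(user_emb)
    deg_i = [0] * len(item_emb)
    for u, i in interactions:
        deg_u[u] += 1
        deg_i[i] += 1
    dim = len(user_emb[0])
    users = [list(v) for v in user_emb]
    items = [list(v) for v in item_emb]
    sum_u = [list(v) for v in users]  # running sum over layers
    sum_i = [list(v) for v in items]
    for _ in range(num_layers):
        new_u = [[0.0] * dim for _ in users]
        new_i = [[0.0] * dim for _ in items]
        for u, i in interactions:
            w = 1.0 / math.sqrt(deg_u[u] * deg_i[i])
            for d in range(dim):
                new_u[u][d] += w * items[i][d]
                new_i[i][d] += w * users[u][d]
        users, items = new_u, new_i
        for acc, layer in ((sum_u, users), (sum_i, items)):
            for row_acc, row in zip(acc, layer):
                for d in range(dim):
                    row_acc[d] += row[d]
    scale = 1.0 / (num_layers + 1)
    return ([[x * scale for x in r] for r in sum_u],
            [[x * scale for x in r] for r in sum_i])


def cosine(a, b):
    """Cosine similarity; a small epsilon guards against zero vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb + 1e-12)


def info_nce(view_a, view_b, temperature=0.2):
    """Assumed InfoNCE loss: row k of view_a should match row k of view_b
    and differ from every other row of view_b."""
    total = 0.0
    for k, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[k]
    return total / len(view_a)


# Toy data: 2 users, 2 items, and one initial embedding table per signal type.
interactions = [(0, 0), (0, 1), (1, 1)]
collab_u, collab_i = propagate([[1.0, 0.0], [0.0, 1.0]],
                               [[0.5, 0.5], [0.2, 0.8]], interactions)
visual_u, visual_i = propagate([[0.9, 0.1], [0.1, 0.9]],
                               [[0.4, 0.6], [0.3, 0.7]], interactions)
text_u, text_i = propagate([[0.8, 0.2], [0.3, 0.7]],
                           [[0.6, 0.4], [0.1, 0.9]], interactions)

# Contrast the collaborative view of each user with its modality-specific views.
ssl_loss = info_nce(collab_u, visual_u) + info_nce(collab_u, text_u)
print(ssl_loss)
```

Running the three propagations separately keeps each signal type in its own embedding table, so noise in one modality is not mixed into the others during message passing. How the contrastive term is weighted against the recommendation loss, and which pairs of views are contrasted, are design choices the abstract does not state.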
