Authors
Kang Liu,Feng Xue,Shuaiyang Li,Sheng Sang,Richang Hong
Abstract
Multimedia-based recommendation (MMRec) is a challenging task that goes beyond the collaborative filtering (CF) schema, which captures only collaborative signals from interactions, and explores multimodal user-preference cues hidden in complex multimedia content. Despite the significant progress of current MMRec solutions, we argue that they are limited by multimodal noise contamination. Specifically, a considerable amount of preference-irrelevant multimodal noise (e.g., the background, layout, and brightness of a product image) is incorporated into item representation learning, which contaminates the modeling of multimodal user preferences. Moreover, most recent studies are based on graph convolutional networks (GCNs), so multimodal noise contamination is further amplified: noisy information is continuously propagated over the user–item interaction graph as recursive neighbor aggregations are performed. To address this problem, instead of the common MMRec paradigm that learns user preferences in an integrated manner, we propose a hierarchical framework that learns collaborative signals and multimodal preference cues separately, thus preventing multimodal noise from flowing into collaborative signals. Then, to alleviate noise contamination in multimodal user-preference modeling, we propose to extract from multimodal content the semantic entities that are most relevant to user interests, which model semantic-level multimodal preferences and thus remove a large fraction of the noise. Furthermore, we use the full multimodal features to model content-level multimodal preferences, as existing MMRec solutions do, which ensures sufficient utilization of multimodal information. Overall, we develop a novel model, multimodal hierarchical graph CF (MHGCF), which consists of three types of GCN modules tailored to capture collaborative signals, semantic-level preferences, and content-level preferences, respectively.
We conduct extensive experiments to demonstrate the effectiveness of MHGCF and its components. The complete data and code of MHGCF are available at.
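The hierarchical idea described in the abstract — propagating collaborative (ID) embeddings, semantic-entity embeddings, and full content features through separate GCN channels over the same user–item graph, then combining them — can be illustrated with a minimal sketch. This is not the authors' implementation: the LightGCN-style layer averaging, the additive fusion, and all variable names (`id_emb`, `sem_emb`, `cnt_emb`) are assumptions made for illustration.

```python
import numpy as np

def normalized_adj(interactions):
    """Build the symmetric-normalized adjacency D^{-1/2} A D^{-1/2}
    of the bipartite user-item graph (users first, then items)."""
    n_users, n_items = interactions.shape
    n = n_users + n_items
    A = np.zeros((n, n))
    A[:n_users, n_users:] = interactions
    A[n_users:, :n_users] = interactions.T
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate(A_hat, x, n_layers=2):
    """Recursive neighbor aggregation; the final representation is the
    mean of all layer outputs (a common LightGCN-style choice)."""
    outs = [x]
    for _ in range(n_layers):
        x = A_hat @ x
        outs.append(x)
    return np.mean(outs, axis=0)

rng = np.random.default_rng(0)
R = (rng.random((4, 6)) < 0.4).astype(float)  # toy 4-user x 6-item interactions
A_hat = normalized_adj(R)

d = 8
id_emb = rng.standard_normal((10, d))   # ID embeddings -> collaborative signals
sem_emb = rng.standard_normal((10, d))  # semantic-entity features (hypothetical)
cnt_emb = rng.standard_normal((10, d))  # full multimodal content features

# Each channel is propagated separately, so noise in the content channel
# never mixes into the collaborative channel during aggregation.
final = (propagate(A_hat, id_emb)
         + propagate(A_hat, sem_emb)
         + propagate(A_hat, cnt_emb))
```

Keeping the three propagations separate until the final fusion is the key structural difference from integrated MMRec paradigms, where a single set of noisy multimodal item representations is propagated.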