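Computer science
Exploit
Transformer
Machine learning
Homogeneous
Multimodal
Graph
Information retrieval
Artificial intelligence
Data mining
Natural language processing
Theoretical computer science
World Wide Web
Mathematics
Quantum mechanics
Combinatorics
Computer security
Physics
Voltage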
Authors
Yong Liu, Susen Yang, Chenyi Lei, Guoxin Wang, Haitao Tang, Jie Zhang, Aixin Sun, Miao Chen
Source
Venue: ACM Multimedia
Date: 2021-10-17
Citations: 28
Identifier
DOI:10.1145/3474085.3475709
Abstract
Side information of items, e.g., images and text descriptions, has been shown to be effective in contributing to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and item relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that the proposed PMGT model effectively exploits the multimodal side information to achieve better accuracy in downstream tasks, including item recommendation and click-through ratio prediction. In addition, we report a case study of testing PMGT in an online setting with 600 thousand users.
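The abstract's idea of relating items through common user activity can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_item_graph`, the `min_cooccurrence` threshold, and the toy purchase data are all assumptions made for illustration, and the paper's MCNSampling algorithm and pre-training objectives are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_item_graph(user_items, min_cooccurrence=1):
    """Build an undirected, weighted item graph from per-user item lists.

    Hypothetical helper: two items are linked when at least
    `min_cooccurrence` users interacted with both. Returns a dict
    mapping item -> {neighbor: co-occurrence count}.
    """
    counts = defaultdict(int)
    for items in user_items.values():
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1

    graph = defaultdict(dict)
    for (a, b), weight in counts.items():
        if weight >= min_cooccurrence:
            graph[a][b] = weight
            graph[b][a] = weight
    return dict(graph)


# Toy co-purchase data (illustrative only).
purchases = {
    "u1": ["camera", "tripod", "sd_card"],
    "u2": ["camera", "sd_card"],
    "u3": ["tripod", "lens"],
}
graph = build_item_graph(purchases, min_cooccurrence=1)
```

Because every node is an item and every edge means "shared user activity", the graph is homogeneous; in a setup like the one described, each node would additionally carry its image and text features.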