Hierarchical Interest Modeling of Long-tailed Users for Click-Through Rate Prediction
Computer Science
Authors
Xu Xie, Jin Niu, Lifang Deng, Yan Wang, Jiandong Zhang, Zhihua Wu, Kaigui Bian, Gang Cao, Bin Cui
Identifier
DOI:10.1109/icde55515.2023.00234
Abstract
Click-through rate (CTR) prediction, which estimates the probability that a user clicks on an item, plays a pivotal role in recommender systems. Capturing users' preferences accurately from their historical interactions (e.g., clicks) is an essential step in this task and has attracted wide attention in both academia and industry. However, most previous methods focus on users with abundant clicks and poorly serve users who rarely click or purchase items. Although these long-tailed users may be a small fraction on popular platforms such as Amazon and Taobao, they are the majority on newer e-commerce platforms such as Lazada. To extract the interests of long-tailed users, several works attempt to integrate side information, such as demographic features. Nevertheless, these features are often unavailable and may even raise privacy concerns. Therefore, how to utilize the noisy and limited clicks becomes the key challenge.
In this paper, we propose a novel model called Hierarchical Interest Modeling (HIM). It hierarchically utilizes long-tailed users' limited behaviors and captures their preferences from both personalized and group-wise perspectives. HIM consists of two main components: the User Behavior Pyramid (UBP) and User Behavior Clustering (UBC). The UBP module uses additional negative feedback to reduce the noise in positive feedback, thus obtaining reliable personalized user representations. The UBC module then automatically discovers latent user groups with a self-supervised reconstruction loss and learns a second, group-wise interest representation for each user. Extensive experiments on both public and industrial datasets verify the superiority of HIM over state-of-the-art baselines. Moreover, HIM has been deployed in a Lazada recommendation scenario, where it achieves an average gain of 3.38% on CTR prediction in the online A/B test. Our code is available at https://github.com/xiaojin-nj/HIM.
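To make the group-wise idea concrete, the following is a minimal sketch of how a user embedding could be softly assigned to latent groups and scored with a reconstruction loss. It is an illustration under stated assumptions, not the UBC module from the paper: the function names (`soft_assign`, `group_representation`, `reconstruction_loss`), the softmax-over-distance assignment, the `temperature` parameter, and the toy centroids are all hypothetical choices made here for clarity.

```python
import math

def soft_assign(user_vec, centroids, temperature=1.0):
    """Softmax over negative squared distances to each group centroid.

    Assumption: groups are represented by centroid vectors; the paper's
    actual grouping mechanism may differ.
    """
    scores = [
        -sum((u - c) ** 2 for u, c in zip(user_vec, cen)) / temperature
        for cen in centroids
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def group_representation(weights, centroids):
    """Weighted sum of centroids: a group-wise view of the user."""
    dim = len(centroids[0])
    return [
        sum(w * cen[d] for w, cen in zip(weights, centroids))
        for d in range(dim)
    ]

def reconstruction_loss(user_vec, group_vec):
    """Mean squared error between the user vector and its group-wise view."""
    return sum((u - g) ** 2 for u, g in zip(user_vec, group_vec)) / len(user_vec)

# Toy example: two hypothetical latent groups in a 2-d embedding space.
centroids = [[1.0, 0.0], [0.0, 1.0]]
user = [0.9, 0.1]

weights = soft_assign(user, centroids)
group_vec = group_representation(weights, centroids)
loss = reconstruction_loss(user, group_vec)
```

In a trained model the centroids would be learned parameters and the loss would be minimized jointly with the CTR objective; here they are fixed so that the arithmetic can be followed by hand.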