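Ranking (information retrieval)
Reinforcement learning
Computer science
Session (web analytics)
Markov decision process
Task (project management)
Machine learning
Baseline (sea)
Code (set theory)
Recommender system
Artificial intelligence
Learning to rank
Decision tree
Markov chain
Markov process
Information retrieval
Data mining
World Wide Web
Engineering
Set (abstract data type)
Geology
Programming language
Systems engineering
Oceanography
Statistics
Mathematics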
Authors
Menghui Zhu,Wei Xia,Weiwen Liu,Yifan Liu,Ruiming Tang,Weinan Zhang
Identifier
DOI:10.1145/3543873.3584651
Abstract
With the development of recommender systems, mixing multiple item sequences from different sources has become an increasingly common need. The integrated ranking stage has therefore been proposed to handle this task with re-ranking models. However, existing methods ignore the relations between the sequences and thus reach only a local optimum over the interaction session. To address this challenge, we propose a new model named NFIRank (News Feed Integrated Ranking with reinforcement learning) and formulate the whole interaction session as an MDP (Markov Decision Process). Extensive offline experiments verify the effectiveness of our model. In addition, we deployed our model on Huawei Browser and achieved a 1.58% improvement in CTR over the baseline in an online A/B test. Code will be available at https://gitee.com/mindspore/models/tree/master/research/recommend/NFIRank.
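
To make the MDP framing concrete, the following is a minimal illustrative sketch, not the NFIRank model itself. It assumes a toy setting in which the state records how many items have been consumed from each source sequence and an action selects the source that supplies the next item. The names `State`, `step`, `valid_actions`, `rollout`, and `round_robin_policy` are hypothetical, and the reward signal (for example, clicks) and any learned policy are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class State:
    # positions[i] = number of items already taken from source i (toy assumption)
    positions: Tuple[int, ...]


def valid_actions(state: State, sources: Sequence[Sequence[str]]) -> List[int]:
    """Indices of the sources that still have items left."""
    return [i for i, src in enumerate(sources) if state.positions[i] < len(src)]


def step(state: State, action: int, sources: Sequence[Sequence[str]]) -> Tuple[State, str]:
    """Take the next item from source `action`; return (next_state, item)."""
    item = sources[action][state.positions[action]]
    new_positions = list(state.positions)
    new_positions[action] += 1
    return State(tuple(new_positions)), item


def rollout(sources: Sequence[Sequence[str]],
            policy: Callable[[State, List[int]], int]) -> List[str]:
    """Merge all sources into one list by repeatedly querying the policy."""
    state = State(tuple(0 for _ in sources))
    merged: List[str] = []
    while True:
        actions = valid_actions(state, sources)
        if not actions:
            break
        state, item = step(state, policy(state, actions), sources)
        merged.append(item)
    return merged


def round_robin_policy(state: State, actions: List[int]) -> int:
    # Placeholder policy (not learned): pick the source used least so far.
    return min(actions, key=lambda a: state.positions[a])


news = ["n1", "n2", "n3"]      # hypothetical item ids
videos = ["v1", "v2"]          # hypothetical item ids
merged = rollout([news, videos], round_robin_policy)
print(merged)
```

In a reinforcement learning setup, the placeholder policy would be replaced by one trained to maximize cumulative reward over the whole session rather than over each position independently, which is the motivation the abstract gives for the MDP formulation.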