Reinforcement learning
Recommender system
Computer science
Markov decision process
Task (project management)
Process (computing)
Markov process
Machine learning
Artificial intelligence
Cold start (automotive)
Set (abstract data type)
Task analysis
Human-computer interaction
Engineering
Statistics
Operating system
Aerospace engineering
Programming language
Systems engineering
Mathematics
Authors
Mingsheng Fu, Liwei Huang, Ananya Rao, Athirai A. Irissappane, Jie Zhang, Hong Qu
Identifier
DOI: 10.1109/tii.2022.3209290
Abstract
Deep reinforcement learning (DRL) based recommender systems are suitable for user cold-start problems as they can capture user preferences progressively. However, most existing DRL-based recommender systems are suboptimal, since they use the same policy to suit the dynamics of different users. We reformulate recommendation as a multitask Markov Decision Process, where each task represents a set of similar users. Since similar users have closer dynamics, a task-specific policy is more effective than a single universal policy for all users. To make recommendations for cold-start users, we use a default policy to collect some initial interactions and identify the user's task, after which a task-specific policy is employed. We optimize our framework with Q-learning and account for task uncertainty via a mutual-information term over tasks. Experiments on three real-world datasets verify the effectiveness of the proposed framework.
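The abstract describes a two-phase cold-start flow: a default policy gathers a few initial interactions, the user is assigned to a task (a cluster of similar users), and a task-specific policy takes over. The sketch below illustrates that flow only; the network architecture, the `infer_task` routine, the `env` interface, and all hyperparameters are hypothetical placeholders, not the authors' implementation.

```python
# Minimal sketch of the default-policy -> task-specific-policy hand-off
# described in the abstract. All component names and interfaces are assumed.
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Placeholder Q-network mapping a user state to per-item action values."""

    def __init__(self, state_dim: int, num_items: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_items),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def recommend_cold_start(state, default_q, task_qs, infer_task, warmup_steps, env):
    """Greedy recommendation for a cold-start user.

    Phase 1: the default policy collects `warmup_steps` interactions.
    Phase 2: `infer_task` (assumed helper) maps that history to a task index,
    and the corresponding task-specific Q-network recommends the next item.
    `env.step(item)` is an assumed interface returning (next_state, reward).
    """
    history = []
    # Phase 1: default policy gathers initial interactions to identify the task.
    for _ in range(warmup_steps):
        with torch.no_grad():
            item = int(default_q(state).argmax())
        state, reward = env.step(item)
        history.append((item, reward))

    # Phase 2: switch to the policy of the inferred user task.
    task_id = infer_task(history)
    with torch.no_grad():
        return int(task_qs[task_id](state).argmax())
```

In this reading, the set of `task_qs` would be trained with Q-learning on users grouped by similar dynamics, while the mutual-information term mentioned in the abstract would regularize how confidently `infer_task` assigns a user to a task; those training details are not shown here.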