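Computer science
Reinforcement learning
Subway train timetable
Scheduling (production processes)
Exploit
Repetition (rhetorical device)
Artificial intelligence
Machine learning
Mathematical optimization
Mathematics
Computer security
Linguistics
Operating system
Philosophy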
Authors
Zhengyu Yang, Jian Shen, Yunfei Liu, Yang Yang, Weinan Zhang, Yong Yu
Identifier
DOI: 10.1145/3397271.3401316
Abstract
The spaced repetition technique aims to improve long-term memory retention for human students by exploiting repeated, spaced reviews of learning content. The study of spaced repetition focuses on designing an optimal policy for scheduling that content. To the best of our knowledge, none of the existing methods based on reinforcement learning takes into account the varying time intervals between two adjacent learning events of the student, even though these intervals are essential for determining real-world schedules. In this paper, we aim to learn a scheduling policy that fully exploits the varying time interval information with high sample efficiency. We propose the Time-Aware scheduler with Dyna-Style planning (TADS) approach: a sample-efficient reinforcement learning framework for realistic spaced repetition. TADS learns a Time-LSTM policy that selects an optimal content item according to the student's whole learning history and the time interval since the last learning event. In addition, Dyna-style planning is integrated into TADS to further improve sample efficiency. We evaluate our approach on three environments built from synthetic and real-world data, based on well-recognized cognitive models. Empirical results demonstrate that TADS outperforms state-of-the-art algorithms.
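To illustrate why the time interval since the last learning event matters for scheduling, the following is a minimal sketch and not the TADS method itself. It assumes an exponential forgetting curve, p = exp(-elapsed / strength), as the memory model; the abstract does not name the cognitive models it uses, so this choice is an assumption. The function names `recall_probability` and `pick_item`, the item names, and all numeric values are hypothetical. The greedy rule shown here stands in for the learned Time-LSTM policy only for illustration.

```python
import math


def recall_probability(elapsed, strength):
    """Assumed memory model: exponential forgetting curve exp(-elapsed / strength)."""
    return math.exp(-elapsed / strength)


def pick_item(last_review, strength, now):
    """Greedy heuristic (not TADS): pick the item with the lowest recall probability.

    last_review maps item -> time of its last learning event,
    strength maps item -> hypothetical memory strength (larger decays more slowly).
    """
    return min(
        last_review,
        key=lambda item: recall_probability(now - last_review[item], strength[item]),
    )


# Hypothetical student state: three items reviewed at different times.
last_review = {"a": 0.0, "b": 5.0, "c": 8.0}
strength = {"a": 20.0, "b": 2.0, "c": 4.0}

choice = pick_item(last_review, strength, now=10.0)
print(choice)
```

In this toy state the item reviewed longest ago is not the one selected, because the selection depends on both the elapsed interval and the per-item strength. A scheduler that ignores the intervals cannot make that distinction.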