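Computer science
Reinforcement learning
Quality of experience
Mobile device
Artificial intelligence
Real-time computing
Multimedia
Machine learning
Computer network
Quality of service
Operating system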
Authors
Huanhuan Zhang,Anfu Zhou,Huadong Ma
Identifier
DOI:10.1109/tmc.2022.3179782
Abstract
Machine learning models, particularly reinforcement learning (RL), have demonstrated great potential in optimizing video streaming applications. However, state-of-the-art solutions are limited to an “offline learning” paradigm, i.e., the RL models are trained in simulators and then deployed in real networks. As a result, they inevitably suffer from the simulation-to-reality gap, performing far worse under real conditions than in simulated environments. In this work, we close the gap by proposing Legato, an online RL framework for real-time mobile interactive video systems. Legato places many individual RL agents directly into the video system, where they make video bitrate decisions in real time and evolve their models over time. Legato then employs a two-level cooperative learning mechanism to enhance video QoE. First, Legato uses a score-based robust learning algorithm to eliminate the risk of quality degradation caused by the RL model's exploration attempts. Second, Legato adaptively aggregates agents in a network-condition-aware manner to form a corresponding high-level RL model that helps each individual agent react to unseen network conditions. We implement Legato on an interactive real-time video system. In exhaustive evaluations, we find that Legato significantly outperforms state-of-the-art algorithms across a wide range of QoE metrics.
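The abstract does not say how the network-condition-aware aggregation is computed. As a rough illustration of the general idea only, the sketch below averages the parameter vectors of several agents, weighting each by how closely its observed network condition matches a target condition. The Gaussian similarity kernel, the function names `condition_similarity` and `aggregate_for`, the `scale` parameter, and the representation of a condition as a plain numeric tuple are all assumptions made for this example and are not taken from Legato.

```python
import math


def condition_similarity(a, b, scale=1.0):
    """Gaussian kernel on the Euclidean distance between two condition vectors.

    Assumption: a network condition is a tuple of numeric features
    (hypothetical, e.g. throughput or round-trip time).
    """
    dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-dist_sq / (2 * scale ** 2))


def aggregate_for(target_condition, agents, scale=1.0):
    """Similarity-weighted average of agent parameter vectors.

    `agents` is a list of (condition, params) pairs. Agents whose observed
    condition is close to `target_condition` contribute more to the result.
    This is an illustrative stand-in, not the paper's aggregation rule.
    """
    weights = [
        condition_similarity(target_condition, cond, scale)
        for cond, _ in agents
    ]
    total = sum(weights)
    if total == 0.0:
        # Every agent is too far from the target for the kernel to resolve.
        raise ValueError("no agent is close enough to the target condition")
    dim = len(agents[0][1])
    return [
        sum(w * params[i] for w, (_, params) in zip(weights, agents)) / total
        for i in range(dim)
    ]
```

For instance, with one agent observed at condition `(0.0,)` and another at `(10.0,)`, a target of `(5.0,)` is equidistant from both and yields the plain mean of their parameters, while a target of `(0.0,)` yields parameters almost identical to the first agent's.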