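Reinforcement learning
Markov decision process
Computer science
Task (project management)
Markov process
Convergence (economics)
Minimization
Process (computing)
Duration (music)
Mathematical optimization
Real-time computing
Artificial intelligence
Engineering
Mathematics
Programming language
Statistics
Systems engineering
Economics
Economic growth
Operating system
Art
Literature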
Authors
Yuanjian Li, A.H. Aghvami
Identifier
DOI:10.1109/icc45855.2022.9838566
Abstract
In a cellular-connected unmanned aerial vehicle (UAV) network, a minimization problem on the weighted sum of time cost and expected outage duration is considered. Taking advantage of the UAV’s adjustable mobility, an intelligent UAV navigation approach is formulated to achieve this optimization goal. Specifically, after mapping the navigation task into a Markov decision process (MDP), a deep reinforcement learning (DRL) solution with a novel quantum-inspired experience replay (QiER) framework is proposed to help the UAV find the optimal flying direction within each time slot. By relating each experienced transition’s importance to its associated quantum bit (qubit) and applying a Grover-iteration-based amplitude amplification technique, the proposed DRL-QiER solution achieves a better trade-off between sampling priority and diversity. Numerical results demonstrate the effectiveness and superiority of the proposed DRL-QiER solution over several representative baselines.
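The idea of tying a transition's replay priority to a qubit can be illustrated with a toy sketch. This is not the authors' QiER algorithm: the class name `QubitReplayBuffer`, the parameters `step`, `max_rotations`, and `decay`, and both update rules are hypothetical choices made only for illustration. Each stored transition carries an angle theta for the state cos(theta)|0> + sin(theta)|1>, and sin(theta)^2 is read as its sampling weight. A larger TD error triggers more fixed-size rotations toward |1> (loosely mimicking repeated Grover iterations), while each replay rotates the state back slightly so that frequently sampled transitions do not crowd out the rest.

```python
import math
import random


class QubitReplayBuffer:
    """Toy replay buffer with one single-qubit angle per transition.

    Illustrative only: the rotation rules below are assumptions,
    not the update rules of the DRL-QiER paper.
    """

    def __init__(self, step=math.pi / 16, max_rotations=3,
                 decay=math.pi / 64, seed=0):
        self.items = []                     # stored transitions
        self.thetas = []                    # one angle per transition
        self.step = step                    # hypothetical rotation per iteration
        self.max_rotations = max_rotations  # hypothetical cap on iterations
        self.decay = decay                  # hypothetical rotation back per replay
        self.rng = random.Random(seed)

    def add(self, transition, td_error, max_td=1.0):
        # Start from the uniform superposition (theta = pi/4), then apply
        # more rotations toward |1> the larger the normalised |TD error| is.
        ratio = min(abs(td_error) / max_td, 1.0)
        rotations = round(ratio * self.max_rotations)
        theta = min(math.pi / 4 + rotations * self.step, math.pi / 2)
        self.items.append(transition)
        self.thetas.append(theta)

    def weights(self):
        # Probability of measuring |1> serves as the sampling weight.
        return [math.sin(t) ** 2 for t in self.thetas]

    def sample(self, k):
        idx = self.rng.choices(range(len(self.items)),
                               weights=self.weights(), k=k)
        for i in set(idx):
            # Rotate replayed transitions back toward |0>, but keep a
            # small positive angle so every transition stays reachable.
            self.thetas[i] = max(self.thetas[i] - self.decay, self.decay)
        return [self.items[i] for i in idx]


buf = QubitReplayBuffer()
buf.add("low-error transition", td_error=0.0)
buf.add("high-error transition", td_error=1.0)
before = buf.weights()
batch = buf.sample(4)
after = buf.weights()
```

In this sketch the amplification step favours high-error transitions (priority), while the depreciation step and the positive floor on the angle keep low-weight transitions in play (diversity), which is the trade-off the abstract refers to.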