Trajectory
Computer science
Control theory (sociology)
Physics
Control (management)
Artificial intelligence
Astronomy
Authors
Yuanjian Li,A.H. Aghvami
Identifier
DOI:10.1109/icc45855.2022.9839093
Abstract
This paper considers maximizing the transmission throughput from an unmanned aerial vehicle (UAV) to legitimate nodes in the presence of Warden's detection, and solves the problem via UAV trajectory design subject to covertness, velocity and mobility constraints. Under the building-distribution-based pathloss model and the Warden's uncertain location model, the formulated optimization problem is difficult to tackle with standard offline optimization methods. Instead, a twin delayed deep deterministic policy gradient (TD3) approach enhanced by multi-step learning and prioritized experience replay (PER), termed multi-step TD3-PER, is proposed to help the UAV adaptively select its velocity from a continuous action space. Numerical results demonstrate the effectiveness of the proposed multi-step TD3-PER solution and its advantages over the provided baselines.
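
As a rough illustration of the two enhancements named in the abstract, the sketch below computes a multi-step (n-step) bootstrapped return and proportional PER sampling probabilities. It is a generic textbook-style sketch, not the authors' implementation: the function names, the discount factor `gamma`, and the PER hyperparameters `alpha` and `eps` are assumptions chosen for illustration. In TD3, the bootstrap value would typically be the smaller of the two target-critic estimates at the state reached after n steps.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of n rewards plus the discounted bootstrap value.

    rewards: the n rewards collected along the stored trajectory segment.
    bootstrap_value: critic estimate at the state reached after n steps
    (assumed to be supplied by the caller).
    """
    g = bootstrap_value
    # Fold backwards so each step applies one factor of gamma.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha.

    Priorities are p_i = |td_error_i| + eps, so every transition keeps
    a nonzero chance of being sampled.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Hypothetical usage with made-up numbers.
target = n_step_return([1.0, 1.0, 1.0], bootstrap_value=4.0, gamma=0.5)
probs = per_probabilities([1.0, -1.0, 3.0])
```

Transitions with larger absolute TD error receive a higher sampling probability, while setting `alpha=0` recovers uniform replay.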