Keywords
Feedforward control; control theory; backstepping; tracking; reinforcement learning; control engineering; optimal control; tracking error; adaptive control; mathematical optimization; artificial intelligence
Authors
Boyang Zhang, Maolong Lv, Shaohua Cui, Xiangwei Bu, Ju H. Park
Identifier
DOI: 10.1109/TASE.2023.3322028
Abstract
Notwithstanding the successful design of state-of-the-art cooperative control protocols for formation tracking of multiple unmanned aerial vehicles (UAVs), performance optimality cannot be guaranteed in the face of complex disturbances affecting these multi-UAV systems. To surmount this challenge, this work establishes a feedforward-feedback learning-based optimal control methodology for cooperative UAV formation tracking in the presence of intricate disturbances. More precisely, by leveraging backstepping-based feedback control, the UAV formation tracking problem is transformed into an equivalent optimal regulation problem. A learning-based feedforward control scheme is then devised, in which a cooperative policy iteration algorithm is formulated as a two-player zero-sum game. A critic-only echo state network (ESN) approximates the optimal feedforward control policies, with an online adaptive tuning law and compensation terms that relax the persistence-of-excitation condition and eliminate the need for an initial admissible control. As a result, closed-loop stability is guaranteed in the sense of uniform ultimate boundedness of the tracking errors and ESN weights.

Note to Practitioners — In real-world scenarios, the flight of multiple UAVs is invariably affected by intricate disturbances, resulting in compromised tracking precision. There is an urgent need to enhance disturbance rejection and ensure optimal performance for cooperative formation tracking of multiple UAVs. Beyond the capabilities of model-based controllers, the integration of reinforcement learning has shown promise in achieving robust control actions. By introducing a cooperative policy iteration algorithm based on a two-player zero-sum game, the tracking performance of the UAV formation can be further optimized.
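To illustrate the two-player zero-sum game idea behind the policy iteration, the following sketch runs policy iteration on a deliberately simple scalar continuous-time game, where the control minimizes and the disturbance maximizes a quadratic cost. All dynamics and gains here (`a`, `b`, `d`, `q`, `r`, `gamma`) are illustrative assumptions; the paper's cooperative policy iteration operates on coupled multi-UAV tracking-error dynamics and is not reproduced.

```python
def zero_sum_pi(a=1.0, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0,
                k_u=-2.0, k_w=0.0, iters=20):
    """Policy iteration for a toy scalar zero-sum game:
        dx/dt = a*x + b*u + d*w,
        J = integral of (q*x^2 + r*u^2 - gamma^2*w^2) dt,
    with u = k_u*x the minimizing control and w = k_w*x the
    maximizing disturbance. Requires the initial k_u, k_w to make
    the closed loop stable (an admissible starting policy)."""
    for _ in range(iters):
        a_cl = a + b * k_u + d * k_w  # closed-loop dynamics (must be < 0)
        # Policy evaluation: scalar Lyapunov equation for the cost p*x^2
        p = -(q + r * k_u**2 - gamma**2 * k_w**2) / (2 * a_cl)
        # Policy improvement from the Hamiltonian stationarity conditions
        k_u = -b * p / r            # minimizing player's gain
        k_w = d * p / gamma**2      # maximizing player's gain
    return p, k_u, k_w
```

With beta = b²/r − d²/γ² > 0, the iterates converge to the positive root p* = (a + √(a² + q·beta))/beta of the scalar game algebraic Riccati equation, mirroring how policy iteration solves the zero-sum game without solving the Riccati equation directly.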
To facilitate the practical application of reinforcement learning in UAV systems, the proposed algorithm relaxes the persistence-of-excitation condition by incorporating innovative compensation terms into the ESN tuning law. Furthermore, the requirement for an initial admissible control is removed by introducing a novel piecewise compensation term into the ESN tuning law, derived from a newly proposed Lyapunov function.