Frank L. Lewis, Draguna Vrabie, Kyriakos G. Vamvoudakis
Source
Journal: IEEE Control Systems Magazine [Institute of Electrical and Electronics Engineers]  Date: 2012-11-16  Volume/Issue: 32 (6): 76-105  Cited by: 949
Identifiers
DOI: 10.1109/mcs.2012.2214134
Abstract
This article describes the use of principles of reinforcement learning to design feedback controllers for discrete- and continuous-time dynamical systems that combine features of adaptive control and optimal control. Adaptive control [1], [2] and optimal control [3] represent different philosophies for designing feedback controllers. Optimal controllers are normally designed offline by solving Hamilton-Jacobi-Bellman (HJB) equations, for example, the Riccati equation, using complete knowledge of the system dynamics. Determining optimal control policies for nonlinear systems requires the offline solution of nonlinear HJB equations, which are often difficult or impossible to solve. By contrast, adaptive controllers learn online to control unknown systems using data measured in real time along the system trajectories. Adaptive controllers are not usually designed to be optimal in the sense of minimizing user-prescribed performance functions. Indirect adaptive controllers use system identification techniques to first identify the system parameters and then use the obtained model to solve optimal design equations [1]. Adaptive controllers may satisfy certain inverse optimality conditions [4].
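To make the contrast concrete, the sketch below (not taken from the article) works through a linear-quadratic example; the system matrices A, B and cost weights Q, R are hypothetical choices for illustration. The offline optimal design solves the discrete-time algebraic Riccati equation using complete knowledge of (A, B). The policy-iteration loop that follows (Hewer's algorithm, still model-based here) shows the policy-evaluation/policy-improvement structure that the article's reinforcement-learning schemes carry out online from measured trajectory data instead of from the model.

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Hypothetical discrete-time double integrator: x_{k+1} = A x_k + B u_k
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)   # state-cost weight
R = np.eye(1)   # control-cost weight

# Offline optimal design: solve the discrete-time algebraic Riccati equation,
# which requires complete knowledge of (A, B).
P_star = solve_discrete_are(A, B, Q, R)
K_star = np.linalg.solve(R + B.T @ P_star @ B, B.T @ P_star @ A)

# Policy iteration (Hewer's algorithm): alternate policy evaluation and
# policy improvement; converges to the same optimal gain. The article's
# reinforcement-learning schemes replace these model-based steps with
# estimates computed from data measured along the system trajectories.
K = np.array([[1.0, 2.0]])           # a stabilizing initial gain
for _ in range(20):
    Ac = A - B @ K                   # closed-loop dynamics under current policy
    Qc = Q + K.T @ R @ K             # per-step cost under current policy
    # Policy evaluation: solve P = Ac' P Ac + Qc (discrete Lyapunov equation)
    P = solve_discrete_lyapunov(Ac.T, Qc)
    # Policy improvement: greedy gain with respect to the evaluated P
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

print("Riccati gain        :", K_star)
print("Policy-iteration gain:", K)   # matches K_star to numerical precision
```

Both routes reach the same optimal feedback gain; the point of the article's reinforcement-learning formulations is to obtain it online without requiring the system model in the evaluation and improvement steps.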