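1. School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048;
2. The International Joint Research Center of Autonomous Systems and Intelligent Control, Xi'an Technological University, Xi'an 710021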

A Controller Design Algorithm with Learning Property
SHANG Ting1, QIAN Fu-Cai1,2, ZHANG Xiao-Yan1, XIE Guo1
1. School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048;
2. The International Joint Research Center of Autonomous Systems and Intelligent Control, Xi'an Technological University, Xi'an 710021
Manuscript received: June 27, 2016; accepted: November 3, 2016.
Foundation Item: Supported by the National Natural Science Foundation of China (61273127, U1534208, 61533014) and the Science and Technology Project of Shaanxi Province (2016GY-108)
Corresponding author: QIAN Fu-Cai, Professor at the School of Automation and Information Engineering, Xi'an University of Technology. His research interests cover stochastic control, system identification, nonlinear control, optimal control, fault diagnosis, and GPS systems. E-mail: qianfc@xaut.edu.cn
Recommended by Associate Editor FANG Hai-Tao
Abstract: A new controller design algorithm with a learning property is proposed for stochastic optimal control problems with unknown parameters, which arise widely in practice. The algorithm estimates the unknown system parameters with a Kalman filter and obtains the control gains through dynamic programming combined with a continuous rolling optimization mechanism. To endow the controller with the learning property, a learning control component that minimizes the estimation variance at the next time step is added to the LQG control law. Simulation results show the effectiveness of the algorithm.
Key words: Adaptive control; uncertain systems; LQG problem; Kalman filter

1 Problem Description

 $x(k + 1) = a(k)x(k) + b(k)u(k) + w(k),\quad k = 0,1, \cdots ,N - 1$ (1)

 $u_c^*(k) = - L(k)x(k)$ (17)
 $L(k) = {D^{ - 1}}(k){{\hat b}^{\rm{T}}}S(k + 1)\hat a$ (18)
 $D(k) = {{\hat b}^{\rm{T}}}S(k + 1)\hat b + R$ (19)
 $S(k) = {{\hat a}^{\rm{T}}}S(k + 1)\hat a + Q - {L^{\rm{T}}}(k)D(k)L(k)$ (20)
 $S(N) = {Q_N}$ (21)
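
The backward recursion (18)-(21) can be illustrated with a minimal sketch. It assumes a scalar system, so transposes and inverses reduce to ordinary products and divisions. The function name `lqg_gains` and all numerical values are illustrative assumptions and are not taken from the paper.

```python
def lqg_gains(a_hat, b_hat, Q, R, Q_N, N):
    """Backward recursion (18)-(21) for a scalar system.

    Returns the gains L(0..N-1) and the cost-to-go weights S(0..N),
    computed from the current parameter estimates a_hat and b_hat.
    """
    S = [0.0] * (N + 1)
    L = [0.0] * N
    S[N] = Q_N                                   # terminal condition (21)
    for k in range(N - 1, -1, -1):
        D = b_hat * S[k + 1] * b_hat + R         # (19)
        L[k] = b_hat * S[k + 1] * a_hat / D      # (18)
        S[k] = a_hat * S[k + 1] * a_hat + Q - L[k] * D * L[k]  # (20)
    return L, S


# Hypothetical values, chosen only to make the recursion easy to follow.
L, S = lqg_gains(a_hat=1.0, b_hat=1.0, Q=1.0, R=1.0, Q_N=1.0, N=3)
x = 2.0
u_c = -L[0] * x                                  # control law (17) at k = 0
```

Because the recursion runs backward from $S(N)$, every gain depends on the estimates $\hat a$ and $\hat b$ available when it is computed, so the gains have to be recomputed whenever the estimates change.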

 ${u^*}(k) = u_c^*(k) + \alpha u_l^*(k)$ (22)

 $\begin{array}{rl} |P(k)| & = \left| [I - K(k)\Phi(k)]P(k|k - 1) \right| \\ & = \left| I - \frac{P(k|k - 1)\Phi^{\rm{T}}(k)\Phi(k)}{\Phi(k)P(k|k - 1)\Phi^{\rm{T}}(k) + \Sigma_w} \right| \times \left| P(k|k - 1) \right| \\ & = \left| 1 - \frac{\Phi(k)P(k|k - 1)\Phi^{\rm{T}}(k)}{\Phi(k)P(k|k - 1)\Phi^{\rm{T}}(k) + \Sigma_w} \right| \times \left| P(k|k - 1) \right| \\ & = \frac{\left| P(k|k - 1) \right|\Sigma_w}{\Phi(k)P(k|k - 1)\Phi^{\rm{T}}(k) + \Sigma_w} \\ & = \frac{\left| P(k|k - 1) \right|\Sigma_w}{P_{aa}x^2(k) + 2P_{ab}x(k)u(k) + P_{bb}u^2(k) + \Sigma_w} \end{array}$ (23)
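
Identity (23) can be checked numerically. The sketch below assumes that $\Phi(k) = [x(k), u(k)]$, as the last line of (23) implies, and that $P(k|k-1)$ is a symmetric $2 \times 2$ matrix with entries $P_{aa}$, $P_{ab}$, $P_{bb}$. The helper names and the numbers are hypothetical.

```python
def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def covariance_update(P, phi, sigma_w):
    """Compute P(k) = [I - K(k) Phi(k)] P(k|k-1) for a symmetric 2x2 P.

    phi is the row vector Phi(k) = [x(k), u(k)]; sigma_w is the noise variance.
    """
    # v = P Phi^T (column vector); since P is symmetric, Phi P = v^T.
    v = [P[i][0] * phi[0] + P[i][1] * phi[1] for i in range(2)]
    s = phi[0] * v[0] + phi[1] * v[1] + sigma_w   # Phi P Phi^T + Sigma_w
    K = [v[0] / s, v[1] / s]                      # gain K(k) = P Phi^T / s
    return [[P[i][j] - K[i] * v[j] for j in range(2)] for i in range(2)]


# Hypothetical prior covariance, state, control, and noise variance.
P_prior = [[2.0, 0.5], [0.5, 1.0]]
x, u, sigma_w = 1.5, -0.8, 0.3

lhs = det2(covariance_update(P_prior, (x, u), sigma_w))
P_aa, P_ab, P_bb = P_prior[0][0], P_prior[0][1], P_prior[1][1]
rhs = det2(P_prior) * sigma_w / (
    P_aa * x**2 + 2 * P_ab * x * u + P_bb * u**2 + sigma_w
)
print(abs(lhs - rhs))  # difference between the two sides of (23)
```

The closed form makes the role of the control explicit: $u(k)$ appears only in the denominator, so the control applied at time $k$ directly affects how much the determinant of the estimation covariance shrinks.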

3 Rolling Learning Control Algorithm

 Figure 1 Principle of the rolling learning control algorithm
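
The rolling mechanism described in the abstract (estimate the parameters with a Kalman filter, then recompute the gains over the remaining horizon) might be sketched as follows. This is only an illustration under stated assumptions: the system is scalar, the parameters are constant so that $P(k|k-1) = P(k-1)$, the regressor is $\Phi(k) = [x(k), u(k)]$, and only the certainty-equivalent part $u_c^*(k)$ of (22) is applied, because the expression for the learning component $u_l^*(k)$ is not reproduced in this excerpt. All function names and numerical values are hypothetical.

```python
import random


def gain_now(a_hat, b_hat, Q, R, Q_N, steps_left):
    """Run (18)-(21) backward over the remaining horizon; return the current gain."""
    S, L = Q_N, 0.0
    for _ in range(steps_left):
        D = b_hat * S * b_hat + R
        L = b_hat * S * a_hat / D
        S = a_hat * S * a_hat + Q - L * D * L
    return L


def kalman_step(theta, P, phi, y, sigma_w):
    """Update the estimate of theta = [a, b] from y = phi . theta + w."""
    v = [P[i][0] * phi[0] + P[i][1] * phi[1] for i in range(2)]  # P Phi^T
    s = phi[0] * v[0] + phi[1] * v[1] + sigma_w
    K = [v[0] / s, v[1] / s]
    err = y - (phi[0] * theta[0] + phi[1] * theta[1])            # innovation
    theta = [theta[i] + K[i] * err for i in range(2)]
    P = [[P[i][j] - K[i] * v[j] for j in range(2)] for i in range(2)]
    return theta, P


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def rolling_control(a, b, theta0, P0, Q, R, Q_N, N, x0, sigma_w, seed=0):
    """Simulate the estimate-then-reoptimize loop; return final estimate and |P| history."""
    rng = random.Random(seed)
    theta, P, x = list(theta0), [row[:] for row in P0], x0
    dets = [det2(P)]
    for k in range(N):
        L = gain_now(theta[0], theta[1], Q, R, Q_N, N - k)  # re-optimize with current estimates
        u = -L * x                                           # certainty-equivalent part only
        x_next = a * x + b * u + rng.gauss(0.0, sigma_w ** 0.5)
        theta, P = kalman_step(theta, P, (x, u), x_next, sigma_w)
        dets.append(det2(P))
        x = x_next
    return theta, dets


# Hypothetical true parameters, initial guesses, and weights.
theta_hat, dets = rolling_control(a=0.9, b=0.5, theta0=[0.5, 1.0],
                                  P0=[[1.0, 0.0], [0.0, 1.0]],
                                  Q=1.0, R=1.0, Q_N=1.0, N=20, x0=2.0, sigma_w=0.01)
```

The list `dets` records $|P(k)|$ at each step. By (23) it cannot increase, and how fast it falls depends on the controls that were applied.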

4 Simulation Analysis

 $x(k + 1) = a(\theta )x(k) + b(\theta )u(k) + w(k)$ (28)

 $J = {\rm{E}}\left\{ {{Q_N}{x^2}(N) + \sum\limits_{k = 0}^{N - 1} {\left[ {Q{x^2}(k) + R{u^2}(k)} \right]} } \right\}$ (29)

 $\bar u(k) = - \bar L(k)x(k)$ (30)
 $\bar L(k) = {{\bar D}^{ - 1}}(k){{\bar b}^{\rm{T}}}\bar S(k + 1)\bar a$ (31)
 $\bar D(k) = {{\bar b}^{\rm{T}}}\bar S(k + 1)\bar b + R$ (32)
 $\bar S(k) = {{\bar a}^{\rm{T}}}\bar S(k + 1)\bar a + Q - {{\bar L}^{\rm{T}}}(k)\bar D(k)\bar L(k)$ (33)
 $\bar S(N) = {Q_N}$ (34)
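
To see how a controller of the form (30)-(34) behaves when the values $\bar a$, $\bar b$ used in the design differ from the true plant parameters, the cost (29) can be evaluated in closed loop. The sketch below makes two simplifying assumptions: the system is scalar and the noise $w(k)$ is set to zero, so the expectation in (29) reduces to a single deterministic sum. The function name and the numbers are hypothetical.

```python
def closed_loop_cost(a, b, a_bar, b_bar, Q, R, Q_N, N, x0):
    """Noise-free cost (29) of controller (30)-(34) on the plant (a, b).

    The gains are designed for (a_bar, b_bar); the plant evolves with (a, b).
    Returns the realized cost J and the design-model prediction S(0) * x0^2.
    """
    # Backward pass: gains from (31)-(34).
    S = Q_N
    gains = [0.0] * N
    for k in range(N - 1, -1, -1):
        D = b_bar * S * b_bar + R
        gains[k] = b_bar * S * a_bar / D
        S = a_bar * S * a_bar + Q - gains[k] * D * gains[k]
    # Forward pass: apply (30) to the true plant and accumulate (29).
    x, J = x0, 0.0
    for k in range(N):
        u = -gains[k] * x
        J += Q * x * x + R * u * u
        x = a * x + b * u
    J += Q_N * x * x
    return J, S * x0 * x0


# Matched design: the realized cost equals the predicted S(0) * x0^2.
J_match, J_pred = closed_loop_cost(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3, 1.0)

# Mismatched design versus a design that knows the true (hypothetical) plant.
J_mis, _ = closed_loop_cost(1.2, 0.8, 1.0, 1.0, 1.0, 1.0, 1.0, 3, 1.0)
J_opt, _ = closed_loop_cost(1.2, 0.8, 1.2, 0.8, 1.0, 1.0, 1.0, 3, 1.0)
```

In the noise-free case the matched design attains the minimum cost, so `J_mis` can never fall below `J_opt`. The difference between the two measures the price of designing with wrong parameter values.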

 Figure 2 The estimation process of a
 Figure 3 The estimation process of b
5 Conclusion

