2. School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China
The optimal control of nonlinear systems is one of the most challenging and difficult subjects in control theory. A large number of theoretical results on nonlinear optimal control problems have been reported in the past few decades [1]-[9]. The dynamic programming algorithm is widely regarded as the most comprehensive method for finding optimal feedback controllers for generic nonlinear systems. However, the main drawback of dynamic programming methods today is the computational complexity required to describe the value function, which grows exponentially with the dimension of its domain. It is well known that the continuous-time nonlinear optimal control problem depends on the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a nonlinear partial differential equation (PDE). Even in simple cases, the HJB equation may not have global analytic solutions. Various methods have been proposed in the literature for computing numerical solutions to the HJB equation, see for example Aguilar et al. [10], Cacace et al. [11], Govindarajan et al. [12], Markman et al. [13], Sakamoto et al. [8], Smears et al. [14], and the references therein.
It is known that the generalized Hamilton-Jacobi-Bellman (GHJB) equation is linear and easier to solve than the HJB equation, but to the best of the authors' knowledge no general solution method for the GHJB equation has been demonstrated. Beard et al. [15] used a series of polynomial functions as basis functions to solve the approximate GHJB equation; however, this method requires the computation of a large number of integrals. Galerkin's spectral approximation approach is proposed in [16] to find approximate but close solutions to the GHJB equation at each iteration step. The reader is also referred to Markman et al. [13], Smears et al. [14], Saridis et al. [17], Aguilar et al. [10], and Gong et al. [18] for more details and different perspectives. Although many articles have discussed the solution of the HJB equation for continuous-time systems, there is currently very little work on iterative solution approaches for the GHJB equation.
In this paper, we propose a new iterative method to find the approximate solution to the GHJB equation, which is associated with optimal feedback control for nonlinear systems. The idea of this iterative algorithm is based on Beard's work [16]. Our approach is designed to obtain a general computational solution for the GHJB equation. We first convert the GHJB equation to a simple algebraic equation involving vector norms, which is essentially a set of nonlinear equations. We then give a procedure to compute the solution to the GHJB equation by linearizing these nonlinear equations under a good initial control guess. The stability and convergence of the proposed scheme are proved.
The paper is organized as follows. The problem description is presented in Section Ⅱ. The main result of this paper is derived in Section Ⅲ, i.e., the iterative algorithm for the GHJB equation together with the detailed mathematical proofs and justifications of the proposed approach. Numerical examples are provided in Section Ⅳ. Finally, a brief conclusion is given in Section Ⅴ.
Ⅱ. PROBLEM STATEMENT
Consider the following continuous-time affine nonlinear system:
$ \begin{equation}\label{eq21} \left\{\begin{array}{ll} \dot{x}=f(x)+g(x)u\\ x(0)=x_{0} \end{array}\right. \end{equation} $  (1) 
where $x\in\mathbb{R}^{n}$ is the state vector with initial condition $x_{0}$, $u\in\mathbb{R}^{m}$ is the control input, and $f(x)$ and $g(x)$ are smooth functions of appropriate dimensions.
The optimal control problem under consideration is to find a state feedback control law $u=u(x)$ that minimizes the cost functional
$ \begin{equation} \label{eq22} J(x_{0}, u)=\int^{\infty}_{0}\big[l(x(t, x_{0}))+\|u(x(t, x_{0}))\|^{2}\big]dt \end{equation} $  (2) 
where $l(x)$ is a positive definite function of the state.
It is well known that the optimal control can be directly found to be $u^{*}=-\frac{1}{2}g^{T}(x)\frac{\partial V^{*}}{\partial x}$, where the value function $V^{*}$ satisfies the HJB equation
$ \begin{equation} \label{eq23} \frac{\partial V^{T}}{\partial x}f(x)+l(x)-\frac{1}{4}\frac{\partial V^{T}}{\partial x}g(x)\Big(\frac{\partial V^{T}}{\partial x}g(x)\Big)^{T}=0. \end{equation} $  (3) 
Although the solution to the nonlinear optimal control problem has been well known since the early 1960s [3], relatively few control designs explicitly use a feedback function of the form given in (3). The primary difficulty lies in solving the HJB equation, for which general closed-form solutions do not exist; the HJB equation is a nonlinear PDE that generally cannot be solved analytically. To obtain an approximate solution, Saridis [17] successively approximated the HJB equation by a GHJB equation, written as
$ \begin{equation} \label{eq24} \frac{\partial V^{T}_{i}}{\partial x}(f+gu^{(i)})+l+\|u^{(i)}\|^{2}=0 \end{equation} $  (4) 
with the improved control at each iteration given by
$ \begin{equation} \label{eq25} u^{(i+1)}(x)=-\frac{1}{2}g^{T}\frac{\partial V_{i}}{\partial x}. \end{equation} $  (5) 
The cost of each improved control satisfies $V_{i+1}(x)\leq V_{i}(x)$, and the sequence $\{V_{i}\}$ converges to the optimal value function $V^{*}$ as $i\rightarrow\infty$ [17].
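For intuition, consider the scalar linear-quadratic case, where $V_{i}=P_{i}x^{2}$ and the GHJB equation (4) collapses to a scalar Lyapunov-type algebraic equation; the successive-approximation loop then coincides with Kleinman's iteration [4]. A minimal sketch, with illustrative values of $a$, $b$, $q$ (not taken from the paper):

```python
# Sketch: for x' = a x + b u with l(x) = q x^2, writing V_i = P_i x^2
# and u^(i) = -K_i x reduces the GHJB (4) to
#     2 P_i (a - b K_i) + q + K_i^2 = 0,
# and the update (5) gives the next gain K_{i+1} = b P_i.
a, b, q = 1.0, 1.0, 1.0

K = 2.0                                    # initial stabilizing gain (a - bK < 0)
for _ in range(20):
    P = -(q + K**2) / (2.0 * (a - b * K))  # solve the scalar GHJB for P_i
    K = b * P                              # control update (5)

# The fixed point satisfies the scalar Riccati equation 2aP - b^2 P^2 + q = 0.
P_riccati = (a + (a**2 + q * b**2) ** 0.5) / b**2
```

Each pass improves the gain, and the iterates converge quadratically to the Riccati solution, mirroring the convergence of $V_{i}$ to $V^{*}$.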
In many numerical methods for HJB equations, which are typically solved backward in time, the discretization is based on spatial causality and the computation is explicit in time. The value of the solution function at a grid node is then determined from values at neighboring nodes that have already been computed.
It is obvious that the GHJB equation (4) can be rewritten as
$ \begin{equation} \label{eq31} \frac{\partial V^{T}_{i}}{\partial x}\|f+gu^{(i)}\|^{2}=-(l+\|u^{(i)}\|^{2})(f+gu^{(i)})^{T}. \end{equation} $  (6) 
Let $p^{(i)}$ denote the resulting approximation of $\frac{\partial V^{T}_{i}}{\partial x}$, i.e.,
$ \begin{equation} \label{eq32} p^{(i)}=-\frac{l+\|u^{(i)}\|^{2}}{\|f+gu^{(i)}\|^{2}}(f+gu^{(i)})^{T} \end{equation} $  (7) 
with the corresponding improved control
$ \begin{equation} \label{eq33} u^{(i+1)}=-\frac{1}{2}g^{T}(p^{(i)})^{T} \end{equation} $  (8) 
where $g$ is the $n\times m$ input matrix
$ g=\left( \begin{matrix} {{g}_{11}} & {{g}_{12}} & \cdots & {{g}_{1m}} \\ {{g}_{21}} & {{g}_{22}} & \cdots & {{g}_{2m}} \\ \vdots & \vdots & \ddots & \vdots \\ {{g}_{n1}} & {{g}_{n2}} & \cdots & {{g}_{nm}} \\ \end{matrix} \right). $  (9) 
Let $\|\cdot\|$ denote the Euclidean norm, i.e., for $x\in\mathbb{R}^{n}$
$ \|x\|^{2}=\sum\limits_{i=1}^{n}x_{i}^{2} $ 
then (7) can be rewritten as
$ \begin{eqnarray} p^{(i)}&{}={}&-\frac{l+\sum\limits_{k=1}^{m}(u^{(i)}_{k})^{2}}{\|F^{(i)}\|^{2}}(F^{(i)})^{T} \nonumber\\ &{}={}&-\frac{l+\sum\limits_{k=1}^{m}(u^{(i)}_{k})^{2}}{\sum\limits^{n}_{j=1}(F^{(i)}_j)^{2}}(F^{(i)})^{T} \end{eqnarray} $  (10) 
where $F^{(i)}=f+gu^{(i)}$ denotes the closed-loop dynamics, whose components are
$ \left\{ \begin{array}{l} \dot{x}^{(i)}_{1}=F^{(i)}_1=f_1+g_{11}u^{(i)}_1+g_{12}u^{(i)}_2+\cdots+g_{1m}u^{(i)}_m\\ \dot{x}^{(i)}_{2}=F^{(i)}_2=f_2+g_{21}u^{(i)}_1+g_{22}u^{(i)}_2+\cdots+g_{2m}u^{(i)}_m\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots \\ \dot{x}^{(i)}_{n}=F^{(i)}_n=f_n+g_{n1}u^{(i)}_1+g_{n2}u^{(i)}_2+\cdots+g_{nm}u^{(i)}_m. \end{array} \right. $  (11) 
Then, we obtain the feedback control from (8)
$ \left\{ \begin{array}{l} u^{(i+1)}_1=-\frac{1}{2}(g_{11}p^{(i)}_1+g_{21}p^{(i)}_2+\cdots+g_{n1}p^{(i)}_n)\\ u^{(i+1)}_2=-\frac{1}{2}(g_{12}p^{(i)}_1+g_{22}p^{(i)}_2+\cdots+g_{n2}p^{(i)}_n)\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots \\ u^{(i+1)}_m=-\frac{1}{2}(g_{1m}p^{(i)}_1+g_{2m}p^{(i)}_2+\cdots+g_{nm}p^{(i)}_n). \end{array} \right. $  (12) 
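As a concrete reading of (10)-(12), the update at a single state takes only a few lines of numpy; the values of $f(x)$, $g(x)$, $l(x)$, and $u^{(i)}$ below are illustrative, not taken from the paper.

```python
import numpy as np

# One sweep of (7)-(8) at a single state, written componentwise as in
# (10)-(12). All numerical values are illustrative.
def improved_control(f, g, l, u):
    """f in R^n, g in R^{n x m}, l > 0, current control u in R^m."""
    F = f + g @ u                      # closed-loop dynamics (11)
    p = -(l + u @ u) / (F @ F) * F     # p^(i) from (7): a vector along F^(i)
    u_next = -0.5 * g.T @ p            # improved control (8), i.e. (12)
    return p, u_next

f = np.array([1.0, -2.0])               # example f(x)
g = np.array([[1.0, 0.0], [0.5, 1.0]])  # example g(x), n = m = 2
l = 1.0                                 # example l(x)
u = np.array([0.1, -0.3])               # current control u^(i)
p, u_next = improved_control(f, g, l, u)
# By construction p satisfies the GHJB identity p.(f + g u) + l + ||u||^2 = 0.
```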
The method proposed in this paper may be implemented by applying the following procedure to system (1).
In Algorithm 1, $\|p\|_{p}$ denotes the supremum norm of $p$ over $\Omega$:
$ \|p\|_{p}=\sup\limits_{x\in\Omega}\|p(x)\|. $ 
Algorithm 1: Iterative Algorithm Based on GHJB Equation
IF (…)
    Solve numerically: …
ELSE
    Restriction: …
    Computation: …
    Recursion: …
ENDIF
RETURN …
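Since the listing above survives only as a skeleton in this copy, the core loop — sweep (7) and (8) over a grid of $\Omega$ and stop when the sup-norm change of the control is small — can be sketched as follows. The scalar system $\dot{x}=-x+u$ with $l(x)=x^{2}$ is an illustrative choice, for which the limit is known in closed form from the associated Riccati equation.

```python
import numpy as np

# Sketch of the core loop of Algorithm 1 for a scalar example
# x' = -x + u with l(x) = x^2 (illustrative, not from the paper).
# The control is stored by its values on a grid over Omega = [0.1, 1],
# and iteration stops when the sup-norm change falls below a tolerance.
xs = np.linspace(0.1, 1.0, 91)
f = -xs                      # f(x) on the grid
g = np.ones_like(xs)         # g(x) = 1
l = xs**2                    # l(x)

u = -1.0 * xs                # initial stabilizing control u^(1)
for _ in range(50):
    F = f + g * u                  # closed-loop dynamics f + g u^(i)
    p = -(l + u**2) / F            # scalar form of (7)
    u_next = -0.5 * g * p          # improved control (8)
    if np.max(np.abs(u_next - u)) < 1e-12:   # sup-norm stopping rule
        u = u_next
        break
    u = u_next

# For this linear-quadratic case the limit is u*(x) = -(sqrt(2) - 1) x,
# the solution of the associated scalar Riccati equation.
```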
To apply the method introduced in this paper we must choose an initial stabilizing control $u^{(1)}$ for the system (1).
If $F^{(i)}$ is differentiable at a point $q\in\Omega$, then its first-order Taylor expansion is
$ \begin{equation}\label{eq39} F^{(i)}(x)=F^{(i)}(q)+J_{F^{(i)}}(q)(x-q)+o(\|x-q\|) \end{equation} $  (13) 
for $x$ in a neighborhood of $q$, where $J_{F^{(i)}}(q)$ is the Jacobian matrix of $F^{(i)}$ evaluated at $q$.
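The expansion (13) can be checked numerically with a finite-difference Jacobian; the map $F$ and the point $q$ below are illustrative examples, not quantities from the paper.

```python
import numpy as np

# Numerical check of (13): for a smooth example map F, the remainder
# F(q + d) - F(q) - J_F(q) d is O(||d||^2), hence o(||d||).
def F(x):
    return np.array([x[0]**3 + x[1], -x[1] + x[0] * x[1]])

def jacobian(fun, q, h=1e-6):
    n = q.size
    Fq = fun(q)
    J = np.zeros((Fq.size, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (fun(q + e) - fun(q - e)) / (2.0 * h)  # central difference
    return J

q = np.array([0.5, -0.2])
J = jacobian(F, q)
d = np.array([1e-3, -2e-3])
remainder = F(q + d) - F(q) - J @ d   # second-order small in ||d||
```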
In this subsection, we conduct the stability and convergence analysis of the proposed scheme. First, we define the pre-Hamiltonian function for a control $u$ and a smooth function $V$ as
$ \begin{equation} \label{eq311} H\Big(x, \frac{\partial V}{\partial x}, u\Big)=l(x)+\|u\|^{2}+\frac{\partial V^{T}}{\partial x}(f(x)+g(x)u). \end{equation} $  (14) 
Lemma 1: The optimal control law $u^{*}$ minimizes the performance measure (2); that is, for any admissible control $u$ with associated value function $V$,
$ 0<V^{*}(x)\leq V(x), ~~ u\neq u^{*}. $ 
This lemma is proved in [17]; it provides a sufficient condition for the optimal control solution of nonlinear systems.
There is a large body of literature devoted to the problem of designing stable regulators for nonlinear systems. The most important and popular tool is Lyapunov's method. To use Lyapunov's method, the designer first proposes a control and then tries to find a Lyapunov function for the closed-loop system. A Lyapunov function is a generalized energy function of the states, and is usually suggested by the physics of the problem. It is often possible to find a stabilizing control for a particular system. We now show that the value function obtained from the GHJB equation serves as a Lyapunov function for the closed-loop system under the improved control.
Theorem 1: Assume $u^{(1)}$ is a stabilizing control for system (1) on a compact set $\Omega$, and let $V_{1}$ be the positive definite solution of the GHJB equation
$ \begin{equation}\label{eq312} \frac{\partial V_{1}^{T}}{\partial x}(f+gu^{(1)})+l+\|u^{(1)}\|^{2}=0 \end{equation} $  (15) 
associated with $u^{(1)}$. If the improved control is chosen as
$ u^{(2)}=-\frac{1}{2}g^{T}\frac{\partial V_{1}}{\partial x} $ 
then for all $x\in\Omega$, $u^{(2)}$ is also a stabilizing control for system (1).
Proof: Since $V_{1}$ is positive definite, it suffices to show that $\dot{V}_{1}\leq0$ along trajectories of the system under $u^{(2)}$. Differentiating $V_{1}$ gives
$ \begin{eqnarray} \label{eq313} \dot{V}_{1}(x, t)&{}={}&\frac{\partial V_{1}}{\partial t}+\frac{\partial V^{T}_{1}}{\partial x}\dot{x} \nonumber\\ &{}={}&\frac{\partial V_{1}}{\partial t}+\frac{\partial V^{T}_{1}}{\partial x}(f+gu^{(2)}) \nonumber\\ &{}={}&\frac{\partial V_{1}}{\partial t}+\frac{\partial V^{T}_{1}}{\partial x}f+\frac{\partial V^{T}_{1}}{\partial x}gu^{(2)} \nonumber\\ &{}={}&\frac{\partial V_{1}}{\partial t}+\frac{\partial V^{T}_{1}}{\partial x}f-2(u^{(2)})^{T}u^{(2)} \nonumber \\ &{}={}&\frac{\partial V_{1}}{\partial t}+\frac{\partial V^{T}_{1}}{\partial x}f-2\|u^{(2)}\|^{2}. \end{eqnarray} $  (16) 
When the GHJB equation (15) is satisfied, we have
$ \begin{equation}\label{eq314} \frac{\partial V_{1}^{T}}{\partial x}f=-l-\|u^{(1)}\|^{2}-\frac{\partial V_{1}^{T}}{\partial x}gu^{(1)}. \end{equation} $  (17) 
Substituting (17) into (16), we can get
$ \begin{eqnarray} \dot{V}_{1}(x, t)&{}={}&-l-\|u^{(1)}\|^{2}-\frac{\partial V^{T}_{1}}{\partial x}gu^{(1)}-2\|u^{(2)}\|^{2} \nonumber \\ &{}={}&-l-\|u^{(1)}\|^{2}+2(u^{(2)})^{T}u^{(1)}-2\|u^{(2)}\|^{2} \nonumber\\ &{}={}&-l-(u^{(2)})^{T}u^{(2)}-(u^{(1)}-u^{(2)})^{T}(u^{(1)}-u^{(2)}) \nonumber\\ &{}={}&-l-\|u^{(2)}\|^{2}-\|u^{(1)}-u^{(2)}\|^{2}\nonumber\\ &{}\leq{}&0. \end{eqnarray} $  (18) 
This establishes the boundedness of the trajectories of system (1) over $\Omega$ and the stability of the closed-loop system under $u^{(2)}$, which completes the proof.
In this subsection, we show that Algorithm 1 converges to the optimal control when it exists. The following equation is easily obtained from (7) and (8):
$ \begin{equation} \label{eq315} u^{(i+1)}=\frac{1}{2}g^{T}\frac{l+\|u^{(i)}\|^{2}}{\|f+gu^{(i)}\|^{2}}gu^{(i)}+\frac{1}{2}g^{T}\frac{l+\|u^{(i)}\|^{2}}{\|f+gu^{(i)}\|^{2}}f. \end{equation} $  (19) 
Linearizing the equation (19) according to (13), we obtain
$ \begin{equation}\label{eq316} u^{(i+1)}=T(u^{(i)})+o(\|u^{(i)}\|) \end{equation} $  (20) 
where $T$ is an affine transformation of the form $T(u)=Au+B$ determined by the linearization.
Definition 1: Let $T$ be a transformation on a normed space $U$. $T$ is a contraction mapping if there exists a constant $\alpha$ with $0\leq\alpha<1$ such that $\|T(u)-T(v)\|\leq\alpha\|u-v\|$ for all $u, v\in U$.
Note for example that a transformation satisfying $\|T(u)-T(v)\|<\|u-v\|$ for all $u\neq v$ is not necessarily a contraction mapping, since the definition requires a single modulus $\alpha<1$ that holds uniformly.
Lemma 2: If $T$ is a contraction mapping on a complete normed space $U$, then there is a unique element $u^{*}\in U$ satisfying $u^{*}=T(u^{*})$; furthermore, the sequence $u^{(i+1)}=T(u^{(i)})$ converges to $u^{*}$ from any initial element $u^{(1)}\in U$.
Proof: Select an arbitrary element $u^{(1)}\in U$ and generate the sequence $u^{(i+1)}=T(u^{(i)})$. Repeated application of the contraction property gives
$ \|u^{(i)}-u^{(i-1)}\|\leq \alpha^{i-2}\|u^{(2)}-u^{(1)}\|. $ 
It follows that
$ \begin{eqnarray} &{}{}&\|u^{(i+m)}-u^{(i)}\| \nonumber \\ &{}{}&\leq\, \, \|u^{(i+m)}-u^{(i+m-1)}\|+\|u^{(i+m-1)}-u^{(i+m-2)}\| \nonumber \\ &{}{}&\quad+\cdots+\|u^{(i+1)}-u^{(i)}\| \nonumber \\ &{}{}&\leq (\alpha^{i+m-2}+\alpha^{i+m-3}+\cdots+\alpha^{i-1})\|u^{(2)}-u^{(1)}\| \nonumber\\ &{}{}&\leq \Big(\alpha^{i-1}\sum^{\infty}_{k=0}\alpha^{k}\Big)\|u^{(2)}-u^{(1)}\| \nonumber\\ &{}{}&=\frac{\alpha^{i-1}}{1-\alpha}\|u^{(2)}-u^{(1)}\| \end{eqnarray} $  (21) 
and hence we conclude that $\{u^{(i)}\}$ is a Cauchy sequence. Since $U$ is complete, the sequence converges to a limit $u^{*}\in U$.
We now show that $u^{*}$ is a fixed point of $T$. For any $i$,
$ \begin{align} \|u^{*}-T(u^{*})\| &=\|u^{*}-u^{(i)}+u^{(i)}-T(u^{*})\| \nonumber \\ &\leq \|u^{*}-u^{(i)}\|+\|u^{(i)}-T(u^{*})\| \nonumber\\ &\leq \|u^{*}-u^{(i)}\|+\alpha\|u^{(i-1)}-u^{*}\|. \nonumber \end{align} $  (22) 
By appropriate choice of $i$, each term on the right-hand side can be made arbitrarily small; hence $\|u^{*}-T(u^{*})\|=0$, i.e., $T(u^{*})=u^{*}$.
It remains only to show that the fixed point is unique. Suppose $u^{*}$ and $\tilde{u}^{*}$ are both fixed points; then
$ \|u^{*}-\tilde{u}^{*}\|=\|T(u^{*})-T(\tilde{u}^{*})\|\leq\alpha\|u^{*}-\tilde{u}^{*}\|. $ 
Thus $(1-\alpha)\|u^{*}-\tilde{u}^{*}\|\leq0$, which forces $u^{*}=\tilde{u}^{*}$. This completes the proof.
Defining $T(u)=Au+B$ as in (20), we have, for any $u$ and $v$,
$ \|T(u)-T(v)\|\, =\, \|(Au+B)-(Av+B)\|\leq\|A\|\cdot\|u-v\|. $ 
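For an affine map this bound makes $T$ a contraction whenever $\|A\|<1$, and successive approximation then converges to the unique fixed point $u^{*}=(I-A)^{-1}B$. A small numerical sketch with illustrative $A$ and $B$:

```python
import numpy as np

# Successive approximation for an affine map T(u) = A u + B with
# ||A|| < 1 (illustrative values). T is then a contraction, and the
# iterates converge to the unique fixed point u* = (I - A)^{-1} B.
A = np.array([[0.2, 0.1],
              [0.0, 0.3]])
B = np.array([1.0, -1.0])
alpha = np.linalg.norm(A, 2)     # contraction modulus: spectral norm < 1

u = np.zeros(2)
for _ in range(100):
    u = A @ u + B                # u^(i+1) = T(u^(i))

u_star = np.linalg.solve(np.eye(2) - A, B)   # exact fixed point
```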
The basic idea of successive approximation and contraction mappings can be modified in several ways to produce convergence theorems for a number of different situations. We consider one such modification below.
Theorem 2: Let $T$ be a continuous transformation of a complete normed space $U$ into itself, and suppose that $T^{i}$ is a contraction mapping for some positive integer $i$. Then $T$ has a unique fixed point $u^{*}\in U$, which can be obtained by successive approximation starting from any $u^{(1)}\in U$.
Proof: Let $u^{(1)}$ be an arbitrary element of $U$ and define the sequence
$ u^{(i+1)}=T(u^{(i)}). $ 
Now since $T^{i}$ is a contraction mapping, Lemma 2 guarantees that the subsequence $\{T^{ik}(u^{(1)})\}_{k=1}^{\infty}$ converges to a limit $u^{*}$.
By the continuity of $T$, we have
$ T(u^{*})=T[\lim\limits_{k\rightarrow\infty} T^{ik}(u^{(1)})]=\lim\limits_{k\rightarrow\infty} T^{ik}[T(u^{(1)})]. $ 
Hence, using the contraction property of $T^{i}$, we obtain
$ \begin{eqnarray} \|u^{*}-T(u^{*})\|&{}={}&\lim\limits_{k\rightarrow\infty} \|T^{ik}(u^{(1)})-T^{ik}[T(u^{(1)})]\| \nonumber \\ &{}={}&\lim\limits_{k\rightarrow\infty} \|T^{i}\{T^{i(k-1)}(u^{(1)})\}-T^{i}\{T^{i(k-1)}[T(u^{(1)})]\}\| \nonumber\\ &{}\leq{}&\alpha \lim\limits_{k\rightarrow\infty}\|T^{i(k-1)}(u^{(1)})-T^{i(k-1)}[T(u^{(1)})]\| \nonumber\\ &{}={}&\alpha \|u^{*}-T(u^{*})\| \end{eqnarray} $  (23) 
where $0\leq\alpha<1$ is the contraction modulus of $T^{i}$. Hence $\|u^{*}-T(u^{*})\|=0$, i.e., $u^{*}$ is a fixed point of $T$. To show uniqueness, suppose $v^{*}$ is another fixed point of $T$; then
$ \|u^{*}-v^{*}\|\, =\, \|T^{i}(u^{*})-T^{i}(v^{*})\|\leq\alpha\|u^{*}-v^{*}\| $ 
and hence $\|u^{*}-v^{*}\|=0$, i.e., $u^{*}=v^{*}$. This completes the proof.
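A simple numerical illustration of Theorem 2 (with illustrative matrices, not from the paper): for a nilpotent $A$ with $\|A\|=2$, the affine map $T(u)=Au+B$ is not a contraction, yet $A^{2}=0$ makes $T^{2}$ one, and the iteration still reaches the unique fixed point of $T$.

```python
import numpy as np

# T(u) = A u + B is not a contraction here (||A|| = 2), but T^2 is,
# since A^2 = 0; successive approximation still converges.
A = np.array([[0.0, 2.0],
              [0.0, 0.0]])
B = np.array([1.0, 1.0])
assert np.linalg.norm(A, 2) > 1 and np.all(A @ A == 0)

u = np.array([5.0, -3.0])        # arbitrary starting point
for _ in range(3):
    u = A @ u + B                # u^(i+1) = T(u^(i))

u_star = np.linalg.solve(np.eye(2) - A, B)   # unique fixed point of T
```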
Ⅳ. NUMERICAL EXAMPLES
In this section, we show how the proposed method is applied to obtain a control law that improves the closed-loop performance of the original control.
A. One Dimensional Nonlinear System
A first-order nonlinear system is described by the state equation
$ \begin{equation}\label{eq41} \dot{x}(t)=x^{3}(t)+u(t) \end{equation} $  (24) 
with initial condition $x(0)=x_{0}$. The performance measure to be minimized is
$ J=\frac{1}{2}\int^{\infty}_{0}(x^{2}+u^{2})dt. $ 
Assume a linear control to start with
$ u^{(1)}=-ax, \quad a>0 $ 
it is clear that the closed-loop system $\dot{x}=x^{3}-ax$ is locally asymptotically stable when $|x|<\sqrt{a}$. Taking $a=2$, i.e.,
$ u^{(1)}=-2x $ 
then the system (24) under controller $u^{(1)}$ is stable, as illustrated in Fig. 1. From (7), we obtain
$ p^{(1)}=\frac{5x^{2}}{4x-2x^{3}}. $ 
Fig. 1 A continuous stabilizing control 
The above yields
$ \begin{equation} u^{(2)}=-\frac{1}{2}g^{T}p^{(1)}=\frac{5x^{2}}{4x^{3}-8x} \end{equation} $  (25) 
$ \begin{eqnarray} p^{(2)} \!&\!=\!&\!-\frac{l+\|u^{(2)}\|^{2}}{\|f+gu^{(2)}\|^{2}}(f+gu^{(2)})^{T} \nonumber \\ \!&\!=\!&\! -\frac{x^2(4x^{3}-8x)^2+25x^4}{2x^3(4x^3-8x)^2+10x^2(4x^3-8x)}. \nonumber \end{eqnarray} $ 
Then the system (24) under controller $u^{(2)}$ remains stable, as illustrated in Fig. 2.
Fig. 2 A continuous stabilizing control 
To continue the iterative algorithm, we would repeat the preceding steps using this revised value. Carrying out six iterations, we obtain
$ p^{(1)}\geq p^{(2)}\geq p^{(3)}\geq p^{(4)}\geq p^{(5)}\geq p^{(6)}. $ 
Eventually the iterative procedure should converge to the optimal control $u^{*}$.
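The two iterates reported above can be spot-checked numerically. The check below assumes the $\frac{1}{2}$ factor in $J$ is carried into the GHJB data, i.e., $l(x)=\frac{1}{2}x^{2}$ and weight $\frac{1}{2}$ on $u^{2}$; under this reading it reproduces the expressions for $p^{(1)}$ and $u^{(2)}$ given in the text.

```python
# Numeric spot-check of the first two iterates for example (24),
# assuming l(x) = x^2/2 and weight 1/2 on u^2 (from the 1/2 in J).
def p1(x):
    u = -2.0 * x                                    # initial control u^(1) = -2x
    return -(0.5 * x**2 + 0.5 * u**2) / (x**3 + u)  # scalar form of (7)

def u2(x):
    return -0.5 * p1(x)                             # improved control (8)

for x in (0.3, 0.7, 1.1):
    assert abs(p1(x) - 5 * x**2 / (4 * x - 2 * x**3)) < 1e-12
    assert abs(u2(x) - 5 * x**2 / (4 * x**3 - 8 * x)) < 1e-12
```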
B. Continuous Stirred-Tank Chemical Reactor
The state equations for a continuous stirred-tank chemical reactor are taken from [1]. The flow of a coolant through a coil inserted in the reactor is used to control the first-order, irreversible exothermic reaction taking place in the reactor.
$ \left\{ \begin{align} & {{{\dot{x}}}_{1}}(t)=-2[{{x}_{1}}(t)+0.25]+[{{x}_{2}}(t)+0.5]{{e}^{\frac{25{{x}_{1}}(t)}{{{x}_{1}}(t)+2}}} \\ & \ \ \ \ \ \ \ \ \ \ \ \ -[{{x}_{1}}(t)+0.25]u(t) \\ & {{{\dot{x}}}_{2}}(t)=0.5-{{x}_{2}}(t)-[{{x}_{2}}(t)+0.5]{{e}^{\frac{25{{x}_{1}}(t)}{{{x}_{1}}(t)+2}}} \\ \end{align} \right. $  (26) 
with given initial conditions $x_{1}(0)$ and $x_{2}(0)$. The performance measure to be minimized is
$ J=\int^{0.78}_{0}[x^{2}_{1}(t)+x^{2}_{2}(t)+Ru^{2}(t)]dt $ 
indicating that the desired objective is to maintain the temperature and concentration close to their steadystate value without expending large amounts of control effort.
With a suitable weight $R$ and a stabilizing initial control, the results obtained by Algorithm 1 are shown in Fig. 3, together with a comparison to Beard's method in [16].
Fig. 3 (a) Optimal control and trajectory; (b) Performance measure reduction by Algorithm 1; (c) Performance measure reduction by Beard in [16]. 
To illustrate the effects of different initial parameters for the control history, additional solutions were obtained. The results of these computer runs are summarized in Table Ⅰ.
The value of the performance measure decreases with each iteration and converges in every case.
Remark 1: To conclude our discussion of Algorithm 1, let us summarize its important features. First of all, starting from a nominal stabilizing control history, each iteration produces an improved control, and the sequence converges to the optimal control.
C. Robot Arm Problem
The robot arm problem is taken from [1]. In this formulation, the arm of the robot is a rigid bar of length $L$. The state variables $\rho(t)$, $\theta(t)$, and $\phi(t)$ are required to satisfy
$ 0\leq\rho(t)\leq L, \ \ |\theta(t)|\leq\pi, \ \ 0\leq\phi(t)\leq\pi $ 
and for the controls
$ |u_{\rho}|\leq 1, \ \ |u_{\theta}|\leq1, \ \ |u_{\phi}|\leq1. $ 
The equations of motion for the robot arm are
$ \begin{equation} \label{eq43} L \ddot{\rho}=u_{\rho}, \ \ I_{\theta}\ddot{\theta}=u_{\theta}, \ \ I_{\phi}\ddot{\phi}=u_{\phi} \end{equation} $  (27) 
where $I_{\theta}$ and $I_{\phi}$ are the moments of inertia given by
$ I_{\theta}=\frac{(L-\rho)^{3}+\rho^3}{3}\sin^{2}\phi, \ \ I_{\phi}=\frac{(L-\rho)^{3}+\rho^3}{3}. $ 
The boundary conditions are
$ L=5, \ \ \rho(0)=\rho(t_f)=\frac{9}{2}, \ \ \theta(0)=0 $ 
$ \theta(t_f)=\frac{2\pi}{3}, \ \ \phi(0)=\phi(t_f)=\frac{\pi}{4} $ 
$ \dot{\rho}(0)=\dot{\theta}(0)=\dot{\phi}(0)=\dot{\rho}(t_f)=\dot{\theta}(t_f)=\dot{\phi}(t_f)=0. $ 
This model ignores the fact that the spherical coordinate reference frame is a noninertial frame and should have terms for Coriolis and centrifugal forces. Let $x_{1}=\rho$, $x_{2}=\dot{\rho}$, $x_{3}=\theta$, $x_{4}=\dot{\theta}$, $x_{5}=\phi$, $x_{6}=\dot{\phi}$; the objective is to minimize the final time
$ J=t_f $ 
subject to the dynamic constraints
$ \begin{align*} &\dot{x_1}=x_2, \ \ \dot{x_2}=\frac{u_{\rho}}{L} \\ &\dot{x_3}=x_4, \ \ \dot{x_4}=\frac{u_{\theta}}{I_{\theta}} \\ &\dot{x_5}=x_6, \ \ \dot{x_6}=\frac{u_{\phi}}{I_{\phi}}. \end{align*} $ 
The control inequality constraints and the boundary conditions are shown in the above statement.
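For completeness, the state-space dynamics just described can be coded directly; the function below is a sketch of (27) in first-order form with the moment-of-inertia formulas above, assuming the controls are supplied externally.

```python
import math

# Sketch of the robot arm dynamics (27) in first-order form,
# x = (rho, rho', theta, theta', phi, phi'), with controls given.
L = 5.0

def dynamics(x, u_rho, u_theta, u_phi):
    rho, drho, theta, dtheta, phi, dphi = x
    inertia = ((L - rho)**3 + rho**3) / 3.0
    I_theta = inertia * math.sin(phi)**2   # moment of inertia for theta
    I_phi = inertia                        # moment of inertia for phi
    return [drho, u_rho / L,
            dtheta, u_theta / I_theta,
            dphi, u_phi / I_phi]
```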
Fig. 4 shows the state variables obtained by the proposed method, and Fig. 5 shows the corresponding control variables.
Fig. 4 Variables 
Fig. 5 Control variables 
Ⅴ. CONCLUSION
In this paper, we proposed a new iterative numerical technique to solve the GHJB equation effectively. There is a clear need for numerical methods that approximate solutions to the special types of equations arising in nonlinear optimal control. We also showed that the resulting controls are in feedback form and stabilize the closed-loop system. The procedure is proved to converge to the optimal solution of the GHJB equation with respect to the iteration variable.
[1]  D. E. Kirk, Optimal Control Theory: An Introduction. Mineola, USA: Courier Corporation, 2012. 
[2]  F. L. Lewis and V. L. Syrmos, Optimal Control. New York, USA: Wiley, 1995. 
[3]  R. Bellman, Dynamic Programming. New Jersey, USA: Princeton University Press, 1957. 
[4]  D. Kleinman, "On an iterative technique for Riccati equation computations," IEEE Trans. Autom. Control, vol. 13, no. 1, pp. 114-115, Feb. 1968. 
[5]  F. Y. Wang, H. G. Zhang, and D. R. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009. 
[6]  F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst., vol. 32, no. 6, pp. 76-105, Dec. 2012. 
[7]  W. Zhang, W. X. Liu, X. Wang, L. M. Liu, and F. Ferrese, "Online optimal generation control based on constrained distributed gradient algorithm," IEEE Trans. Power Syst., vol. 30, no. 1, pp. 35-45, Jan. 2015. 
[8]  N. Sakamoto and A. J. van der Schaft, "Analytical approximation methods for the stabilizing solution of the Hamilton-Jacobi equation," IEEE Trans. Autom. Control, vol. 53, no. 10, pp. 2335-2350, Nov. 2008. 
[9]  T. Bian, Y. Jiang, and Z. P. Jiang, "Adaptive dynamic programming and optimal control of nonlinear nonaffine systems," Automatica, vol. 50, no. 10, pp. 2624-2632, Oct. 2014. 
[10]  C. O. Aguilar and A. J. Krener, "Numerical solutions to the Bellman equation of optimal control," J. Optim. Theory Appl., vol. 160, no. 2, pp. 527-552, Feb. 2014. 
[11]  S. Cacace, E. Cristiani, M. Falcone, and A. Picarelli, "A patchy dynamic programming scheme for a class of Hamilton-Jacobi-Bellman equations," SIAM J. Sci. Comput., vol. 34, no. 5, pp. A2625-A2649, Oct. 2012. 
[12]  N. Govindarajan, C. C. de Visser, and K. Krishnakumar, "A sparse collocation method for solving time-dependent HJB equations using multivariate B-splines," Automatica, vol. 50, no. 9, pp. 2234-2244, Sep. 2014. 
[13]  J. Markman and I. N. Katz, "An iterative algorithm for solving Hamilton-Jacobi type equations," SIAM J. Sci. Comput., vol. 22, no. 1, pp. 312-329, Jul. 2000. 
[14]  I. Smears and E. Süli, "Discontinuous Galerkin finite element approximation of Hamilton-Jacobi-Bellman equations with Cordes coefficients," SIAM J. Numer. Anal., vol. 52, no. 2, pp. 993-1016, Apr. 2014. 
[15]  R. W. Beard, G. N. Saridis, and J. T. Wen, "Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation," J. Optim. Theory Appl., vol. 96, no. 3, pp. 589-626, Mar. 1998. 
[16]  R. W. Beard, G. N. Saridis, and J. T. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation," Automatica, vol. 33, no. 12, pp. 2159-2177, Dec. 1997. 
[17]  G. N. Saridis and C. S. G. Lee, "An approximation theory of optimal control for trainable manipulators," IEEE Trans. Syst. Man Cybern., vol. 9, no. 3, pp. 152-159, Mar. 1979. 
[18]  Q. Gong, W. Kang, and I. M. Ross, "A pseudospectral method for the optimal control of constrained feedback linearizable systems," IEEE Trans. Autom. Control, vol. 51, no. 7, pp. 1115-1129, Jul. 2006. 