Journal of Guangdong University of Technology  2016, Vol. 33, Issue (4): 37-43.  DOI: 10.3969/j.issn.1007-7162.2016.04.007.

Cite this article

Li Jie-ming, Zhu Huai-nian, Zhang Cheng-ke. Nash Differential Games of Singular Stochastic Affine Systems with Multiple Decision Makers[J]. Journal of Guangdong University of Technology, 2016, 33(4): 37-43. DOI: 10.3969/j.issn.1007-7162.2016.04.007.

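Funding:

Supported by the National Natural Science Foundation of China (71571053) and the Natural Science Foundation of Guangdong Province (2015A030310218, 2014A030310366)

About the author:

Li Jie-ming (1985-), female, master's student; main research interests: game theory and business management.

Article history

Received: 2015-05-27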
Nash Differential Games of Singular Stochastic Affine Systems with Multiple Decision Makers
Li Jie-ming1, Zhu Huai-nian2, Zhang Cheng-ke2    
1. School of International Education, Guangdong University of Technology, Guangzhou 511495, China;
2. School of Economics and Commerce, Guangdong University of Technology, Guangzhou 510520, China
Abstract: A class of Nash differential games of continuous-time singular stochastic systems with multiple decision makers is investigated. After some concepts of stability for singular stochastic systems are introduced, a stability condition is given in terms of a linear matrix inequality (LMI). Then, using the Riccati equation approach, existence conditions for the equilibrium strategies over finite and infinite horizons are obtained in terms of a set of cross-coupled differential Riccati equations and a set of cross-coupled algebraic Riccati equations, respectively, and explicit expressions for the equilibrium feedback strategies and the optimal cost values are given. Finally, the results are applied to the stochastic $H_2/H_\infty$ control problem of modern robust control theory, and existence conditions and explicit expressions for the robust control strategies are obtained.
Key words: singular stochastic systems    Nash differential games    Riccati equation    stochastic $H_2/H_\infty$ control    

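In recent years, linear-quadratic differential games, which are widely applied in economics, management, automatic control, and military affairs, have been extended to complex systems such as singular systems and stochastic systems. References [1-3] systematically studied zero-sum, Nash nonzero-sum, and Stackelberg leader-follower differential games for continuous-time deterministic singular linear systems and obtained conditions for the existence of equilibrium strategies. Reference [4] studied zero-sum differential games for discrete-time deterministic singular linear systems and obtained necessary and sufficient conditions for the existence of equilibrium strategies. References [5-6] studied Nash differential games for continuous-time deterministic singular linear systems and obtained necessary and sufficient conditions for the existence of open-loop and feedback Nash equilibria, respectively. Reference [7] studied the numerical computation of saddle-point strategies for linear-quadratic differential games in singular state-space systems and proposed a new numerical method based on the multiscale, multiresolution approximation properties of wavelets. All of these results concern deterministic singular systems. For linear-quadratic differential games of singular stochastic systems, references [8-10] systematically studied Nash equilibrium strategies, Stackelberg strategies, and Pareto strategies together with numerical algorithms for computing them. Reference [11] studied stochastic Nash differential games for singular linear systems and, using the Riccati equation approach, obtained existence conditions and explicit expressions for finite-horizon and infinite-horizon Nash equilibria. Reference [12] discussed nonzero-sum games for finite-horizon Itô-type stochastic singular systems and, using stochastic control methods, showed that a sufficient condition for the existence of equilibrium strategies is equivalent to the existence of a solution to the corresponding coupled Riccati differential equations.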

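A review of the literature above shows that the existing results only consider the case in which the diffusion term contains the state variable. In practice, however, the diffusion term may contain both the state and the control; an engineering example is given in [13], and an example from mathematical finance is Example 11.2.5 in [14]. Studying linear-quadratic differential games in the general setting, where the diffusion term contains both state and control, is therefore of broader significance. Building on existing work, this paper studies the multi-player Nash differential game of singular stochastic systems whose noise depends on both the state and the control, gives existence conditions for the equilibrium strategies by means of Riccati equations, and applies the results to the stochastic $H_2/H_\infty$ control problem of modern robust control, which extends the range of applications of differential games.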

1 Preliminaries

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, P)$ be a complete probability space on which a standard Brownian motion $\{W(t)\}_{t \ge 0}$ is defined, and let $\{\mathcal{F}_t\}_{t \ge 0}$ be the natural filtration generated by $\{W(t)\}_{t \ge 0}$. For a fixed $T > 0$, define the following spaces:

$\mathbf{R}^n$: the $n$-dimensional Euclidean space, with Euclidean norm $\|\cdot\|$;

$L_{\mathcal F}^2(0,T;\mathbf{R}^n)$: the set of $\mathcal{F}_t$-adapted, $\mathbf{R}^n$-valued measurable processes $\varphi(\cdot)$ such that $\mathbf{E}\int_0^T \|\varphi(t)\|^2\,\mathrm{d}t < \infty$.

For convenience, the following notation is used throughout the paper:

$M^{\mathrm T}$: the transpose of a matrix or vector $M$; $\mathrm{Tr}(M)$: the trace of a matrix $M$; $\det(M)$: the determinant of a matrix $M$; $\deg(f)$: the degree of a polynomial $f$; $\mathbf{R}^{n\times m}$: the set of all $n\times m$ matrices; $\mathcal{C}(0,T;X)$: the set of all continuous functions on $[0,T]$ with values in the Banach space $X$.

Consider the singular stochastic system described by

$ \left\{ \begin{array}{l} \mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = \mathit{\boldsymbol{Ax}}\left( t \right){\rm{d}}\mathit{t }+\mathit{\boldsymbol{Fx}}\left( t \right){\rm{d}}\mathit{W}\left( t \right), \\ \mathit{\boldsymbol{x}}\left( 0 \right) = {\mathit{\boldsymbol{x}}_{\rm{0}}}, \end{array} \right. $ (1)

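where $x(\cdot)\in\mathbf{R}^n$ is the system state and $x_0\in\mathbf{R}^n$ is the given initial state; $W(\cdot)\in\mathbf{R}$ is a one-dimensional standard Brownian motion; $E, A, F\in\mathbf{R}^{n\times n}$ are known constant matrices, where $E$ may be singular with $\mathrm{rank}(E)=r\le n$.

To guarantee existence and uniqueness of the solution of (1), the following lemma is introduced.

Lemma 1[15]  Equation (1) has a unique solution if there exists a pair of nonsingular matrices $M\in\mathbf{R}^{n\times n}$ and $N\in\mathbf{R}^{n\times n}$ such that the triple $(E, A, F)$ satisfies at least one of the following conditions: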

(1)$\mathit{\boldsymbol{MEN = }}\left[{\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{I}}_{{n_1}}}} & 0\\ 0 & \mathit{\boldsymbol{N}} \end{array}} \right], \mathit{\boldsymbol{MAN = }}\left[{\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{A}}_1}} & 0\\ 0 & {{\mathit{\boldsymbol{I}}_{{n_2}}}} \end{array}} \right], \mathit{\boldsymbol{MFN = }}\left[{\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{F}}_1}} & {{\mathit{\boldsymbol{F}}_2}}\\ 0 & 0 \end{array}} \right], $

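where the block $N\in\mathbf{R}^{n_2\times n_2}$ appearing in $MEN$ is a nilpotent matrix, $F_1\in\mathbf{R}^{n_1\times n_1}$, $F_2\in\mathbf{R}^{n_1\times n_2}$, and $n_1+n_2=n$.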

(2)$\mathit{\boldsymbol{MEN}} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{I}}_r}} & 0\\ 0 & 0 \end{array}} \right],\mathit{\boldsymbol{MAN}} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{A}}_1}} & 0\\ 0 & {{\mathit{\boldsymbol{I}}_{n - r}}} \end{array}} \right],\mathit{\boldsymbol{MFN}} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{F}}_1}} & {{\mathit{\boldsymbol{F}}_2}}\\ 0 & {{\mathit{\boldsymbol{F}}_3}} \end{array}} \right], $

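where $A_1, F_1\in\mathbf{R}^{r\times r}$, $F_2\in\mathbf{R}^{r\times(n-r)}$, and $F_3\in\mathbf{R}^{(n-r)\times(n-r)}$.

In control theory, stability is a fundamental concept: it is the most basic condition for a system to operate properly. Before studying differential games of singular stochastic systems, we therefore give some definitions and lemmas on stability.

Definition 1[15]  For system (1):

(1) if there exists a constant $s$ such that $\det(sE-A)\ne 0$, system (1) is said to be regular;

(2) if $\deg(\det(sE-A))=\mathrm{rank}(E)$, system (1) is said to be impulse-free;

(3) if for every admissible initial state $x_0\in\mathbf{R}^n$ the solution $x(t)$ of system (1) satisfies $\lim\limits_{t\to\infty}\mathbf{E}\|x(t)\|^2=0$, system (1) is said to be asymptotically mean-square stable;

(4) system (1) is said to be asymptotically mean-square admissible if it is regular, impulse-free, and asymptotically mean-square stable.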
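Conditions (1) and (2) of Definition 1 involve only the pair $(E, A)$ and can be checked numerically for small systems. The following is a minimal sketch, not part of the original paper: it recovers the coefficients of $\det(sE-A)$ by polynomial interpolation and compares the degree with $\mathrm{rank}(E)$. The function names, the tolerance, and the two example pairs are illustrative assumptions, and interpolation of this kind is only reasonable for small $n$.

```python
import numpy as np

def det_pencil_coeffs(E, A):
    """Coefficients (increasing powers of s) of det(sE - A), found by interpolation."""
    n = E.shape[0]
    pts = np.arange(n + 1, dtype=float)  # n+1 samples determine a polynomial of degree <= n
    vals = np.array([np.linalg.det(s * E - A) for s in pts])
    return np.linalg.solve(np.vander(pts, increasing=True), vals)

def is_regular_and_impulse_free(E, A, tol=1e-9):
    """Return (regular, impulse_free) in the sense of Definition 1, items (1)-(2)."""
    coeffs = det_pencil_coeffs(E, A)
    nonzero = np.flatnonzero(np.abs(coeffs) > tol)
    regular = nonzero.size > 0  # det(sE - A) is not identically zero
    degree = int(nonzero.max()) if regular else -1
    impulse_free = bool(regular and degree == np.linalg.matrix_rank(E))
    return bool(regular), impulse_free

# Hypothetical example 1: det(sE - A) = -(s + 1), degree 1 = rank(E).
E1, A1 = np.diag([1.0, 0.0]), np.diag([-1.0, 1.0])
# Hypothetical example 2: det(sE - A) = 1, degree 0 < rank(E) = 1.
E2, A2 = np.array([[0.0, 1.0], [0.0, 0.0]]), np.eye(2)

print(is_regular_and_impulse_free(E1, A1))
print(is_regular_and_impulse_free(E2, A2))
```

The second pair is regular but not impulse-free, which shows that the two conditions are genuinely different.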

Lemma 2[14]  Let an $n$-dimensional process $x(\cdot)$ satisfy the stochastic differential equation

$ {\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = \mathit{\boldsymbol{f}}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right)\mathit{dt + }\mathit{\boldsymbol{g}}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right)\mathit{dW}\left( t \right). $

Given $V(t, x(t))\in\mathcal{C}^2([0,T]\times\mathbf{R}^n)$, we have

$ \begin{array}{l} \;\;\;\mathit{dV}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right) = \mathit{\Gamma V}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right)\mathit{dt }+\\ \mathit{\boldsymbol{V}}_\mathit{x}^{\rm{T}}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right)\mathit{g}\left( {t, \mathit{\boldsymbol{x}}\left( t \right)} \right)\mathit{dW}\left( t \right), \end{array} $

where $\Gamma V(t, x(t)) = V_t(t, x(t)) + V_x^{\mathrm T}(t, x(t))f(t, x(t)) + \frac{1}{2}\mathrm{Tr}\left[g^{\mathrm T}(t, x(t))V_{xx}(t, x(t))g(t, x(t))\right].$
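As a quick illustration of Lemma 2, take the time-independent quadratic function $V(x)=x^{\mathrm T}Px$ with linear coefficients $f=Ax$ and $g=Fx$. The generator then reduces to $x^{\mathrm T}(A^{\mathrm T}P+PA+F^{\mathrm T}PF)x$, which is the non-singular ($E=I$) counterpart of the expression used later in the paper. The sketch below, with hypothetical matrices chosen only for illustration, evaluates both forms and compares them.

```python
import numpy as np

def generator_direct(P, A, F, x):
    """Gamma V from Lemma 2 for V(x) = x^T P x, f = A x, g = F x (so V_t = 0)."""
    f, g = A @ x, F @ x
    V_x, V_xx = 2.0 * P @ x, 2.0 * P
    # g is a single column here, so Tr[g^T V_xx g] is just the scalar g^T V_xx g.
    return V_x @ f + 0.5 * (g @ V_xx @ g)

def generator_quadratic(P, A, F, x):
    """Closed form x^T (A^T P + P A + F^T P F) x."""
    return x @ (A.T @ P + P @ A + F.T @ P @ F) @ x

# Hypothetical data (assumed for illustration only).
P = np.array([[2.0, 1.0], [1.0, 3.0]])
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
F = np.array([[0.5, 0.0], [0.1, 0.2]])
x = np.array([1.0, -2.0])

print(generator_direct(P, A, F, x), generator_quadratic(P, A, F, x))
```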

Lemma 3  If there exists a nonsingular symmetric $n\times n$ matrix $P$ such that the following LMI holds,

$ {\mathit{\boldsymbol{A}}^T}\mathit{\boldsymbol{PE + }}{\mathit{\boldsymbol{E}}^{\rm{T}}}\mathit{\boldsymbol{PA + }}{\mathit{\boldsymbol{F}}^{\rm{T}}}\mathit{\boldsymbol{PF < }}{\rm{0, }} $ (2)

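then system (1) is asymptotically mean-square admissible.

Proof  Following Remark 6 of [15], choose a Lyapunov function of the form $V(x(t))=x^{\mathrm T}(t)E^{\mathrm T}PEx(t)$ and apply the analysis of [16]; it is then not difficult to show that system (1) is asymptotically mean-square admissible. This completes the proof of Lemma 3.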
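For a given candidate $P$, the inequality (2) can be tested directly by forming the matrix $A^{\mathrm T}PE+E^{\mathrm T}PA+F^{\mathrm T}PF$ and checking that its largest eigenvalue is negative. The sketch below does only that; it does not search for $P$ (that would need an LMI solver, which is not assumed here). The function names and the example data are illustrative assumptions. Note that with a singular $E$ the candidate $P$ may need to be indefinite, as in the example.

```python
import numpy as np

def lmi_matrix(E, A, F, P):
    """Left-hand side of LMI (2): A^T P E + E^T P A + F^T P F."""
    return A.T @ P @ E + E.T @ P @ A + F.T @ P @ F

def satisfies_lmi(E, A, F, P, tol=1e-12):
    """True if the symmetrized left-hand side of (2) is negative definite."""
    M = lmi_matrix(E, A, F, P)
    M = 0.5 * (M + M.T)  # remove round-off asymmetry before eigvalsh
    return bool(np.linalg.eigvalsh(M).max() < -tol)

# Hypothetical example with singular E (assumed for illustration only).
E = np.diag([1.0, 0.0])
A = np.diag([-1.0, 1.0])
F = np.diag([0.0, 1.0])
P_good = np.diag([1.0, -1.0])  # nonsingular, symmetric, indefinite
P_bad = np.eye(2)

print(satisfies_lmi(E, A, F, P_good), satisfies_lmi(E, A, F, P_bad))
```

For `P_good` the left-hand side equals $\mathrm{diag}(-2,-1)$, while for `P_bad` it equals $\mathrm{diag}(-2,1)$, so only the first candidate satisfies (2).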

2 Finite-horizon Nash games

2.1 Problem formulation

We first consider the Nash differential game with $N$ ($N\ge 2$) players on the finite horizon $[0,T]$. Let $(s,y)\in[0,T)\times\mathbf{R}^n$ be the given initial time and initial state, and let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\ge 0}, P)$ be a complete probability space on which a standard Brownian motion $W(\cdot)$ is defined. Define

$\mathcal{U}_i[s,T]=\{u_i(\cdot):[s,T]\times\Omega\to\mathbf{R}^{m_i}\mid u_i(\cdot)$ is an adapted, square-integrable process with $\mathbf{E}\int_s^T\|u_i(t)\|^2\,\mathrm{d}t<\infty\}$.

Here $u_i(\cdot)$ is the control strategy of player $i$, $i=1,\dots,N$. For each $(u_1(\cdot),\dots,u_N(\cdot))\in\mathcal{U}[s,T]\equiv\mathcal{U}_1[s,T]\times\cdots\times\mathcal{U}_N[s,T]$, the cost functional of player $i$ is

$ \begin{array}{*{20}{l}} \begin{array}{l} \;\;\;\mathit{\boldsymbol{J}}_i^{\rm{T}}({\mathit{\boldsymbol{u}}_1}\left( \cdot \right), \cdots ,{\mathit{\boldsymbol{u}}_N}\left( \cdot \right)) = \\ \mathit{\boldsymbol{E}}\left\{ {\int_s^T {\left[ {{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( t \right){\mathit{\boldsymbol{Q}}_i}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) + \mathit{\boldsymbol{u}}_i^{\rm{T}}\left( t \right){\mathit{\boldsymbol{R}}_i}\left( t \right){\mathit{\boldsymbol{u}}_i}\left( t \right)} \right]} } \right.{\rm{d}}t + \end{array}\\ {\left. {{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( T \right){\mathit{\boldsymbol{H}}_i}\mathit{\boldsymbol{x}}\left( T \right)} \right\},} \end{array} $ (3)

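where $i=1,\dots,N$, $Q_i(\cdot)$ and $H_i$ are $n\times n$ bounded, symmetric, nonnegative definite matrices, $R_i(\cdot)$ is an $m_i\times m_i$ bounded symmetric matrix, and $x(\cdot)$ is the solution of the state equation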

$ \left\{ \begin{array}{l} \mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = \left[{\mathit{\boldsymbol{A}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right)\mathit{ }+\sum\limits_{i = 1}^N {{\mathit{\boldsymbol{B}}_i}\left( t \right)} {\mathit{\boldsymbol{u}}_\mathit{i}}\left( t \right)} \right]{\rm{d}}t + \\ \;\;\;\;\left[{\mathit{\boldsymbol{C}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right)\mathit{ }+\sum\limits_{i = 1}^N {{\mathit{\boldsymbol{D}}_i}\left( t \right)} {\mathit{\boldsymbol{u}}_\mathit{i}}\left( t \right)} \right]{\rm{d}}\mathit{W}\left( t \right), \\ \mathit{\boldsymbol{x}}\left( \mathit{s} \right) = \mathit{\boldsymbol{y}}, \end{array} \right. $ (4)

In (4), $E$ is a given $n\times n$ constant matrix with $\mathrm{rank}(E)=r\le n$; $A(\cdot)$ and $C(\cdot)$ are $n\times n$ bounded matrices, and $B_i(\cdot)$ and $D_i(\cdot)$ are $n\times m_i$ bounded matrices.

The problem is to find a Nash equilibrium $(u_1^*(\cdot),\cdots,u_N^*(\cdot))\in\mathcal{U}[s,T]$ such that

$ \begin{array}{l} \;\;\;\;\;\;J_i^T\left( {\mathit{\boldsymbol{u}}_1^*\left( \cdot \right), \cdots, \mathit{\boldsymbol{u}}_N^*\left( \cdot \right)} \right) \le \mathit{\boldsymbol{J}}_i^{\rm{T}}(\mathit{\boldsymbol{u}}_1^*\left( \cdot \right), \cdots, \\ \mathit{\boldsymbol{u}}_{i - 1}^*\left( \cdot \right), {\mathit{\boldsymbol{u}}_i}\left( \cdot \right), \mathit{\boldsymbol{u}}_{i + 1}^*\left( \cdot \right), \cdots, \mathit{\boldsymbol{u}}_N^*\left( \cdot \right)), \\ \forall {\mathit{\boldsymbol{u}}_i}\left( \cdot \right) \in {{\cal U}_i}\left[{s, T} \right]. \end{array} $ (5)

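The game thus involves $N$ players; problems of this kind arise naturally in engineering and in economics and management.

2.2 Main results

Before giving the main result for the finite-horizon $N$-player Nash differential game, we first discuss the special case $N=1$, known as the stochastic LQ problem. The conclusion obtained for it lays the foundation for the study of the Nash differential game.

Consider the controlled system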

$ \left\{ \begin{array}{l} \mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = \left[{\mathit{\boldsymbol{A}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right)+{\mathit{\boldsymbol{B}}_{\rm{1}}}\left( t \right){\mathit{\boldsymbol{u}}_{\rm{1}}}\left( \mathit{t} \right)} \right]{\rm{d}}t + \\ \;\;\;\;\left[{\mathit{\boldsymbol{C}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right)+{\mathit{\boldsymbol{D}}_{\rm{1}}}\left( t \right){\mathit{\boldsymbol{u}}_{\rm{1}}}\left( \mathit{t} \right)} \right]{\rm{d}}\mathit{W}\left( t \right), \\ \mathit{\boldsymbol{x}}\left( \mathit{s} \right) = \mathit{\boldsymbol{y}}{\rm{.}} \end{array} \right. $ (6)

with the corresponding cost functional

$ \begin{array}{l} \;\;{\mathit{J}^{\rm{T}}}({\mathit{\boldsymbol{u}}_{\rm{1}}}\left( \cdot \right)) = \mathit{\boldsymbol{E}}\left\{ {\int_s^T {{\rm{[}}{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( t \right){\mathit{\boldsymbol{Q}}_{\rm{1}}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right)} + } \right.\\ \mathit{\boldsymbol{u}}_1^{\rm{T}}\left( t \right){\mathit{\boldsymbol{R}}_{\rm{1}}}\left( t \right){\mathit{\boldsymbol{u}}_{\rm{1}}}\left( \mathit{t} \right)]{\rm{d}}\mathit{t }+\left. {{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( T \right){\mathit{\boldsymbol{H}}_{\rm{1}}}\mathit{\boldsymbol{x}}\left( \mathit{T} \right)} \right\}. \end{array} $ (7)

Lemma 4  If the following generalized differential Riccati equation

$ \left\{ \begin{array}{l} E^{\mathrm T}\dot P_1(t)E = -\left(A^{\mathrm T}(t)P_1(t)E + E^{\mathrm T}P_1(t)A(t) + C^{\mathrm T}(t)P_1(t)C(t) + Q_1(t)\right) + \\ \;\;\;\;\left(E^{\mathrm T}P_1(t)B_1(t) + C^{\mathrm T}(t)P_1(t)D_1(t)\right)\left(R_1(t) + D_1^{\mathrm T}(t)P_1(t)D_1(t)\right)^{-1} \times \\ \;\;\;\;\left(B_1^{\mathrm T}(t)P_1(t)E + D_1^{\mathrm T}(t)P_1(t)C(t)\right), \\ E^{\mathrm T}P_1(T)E = H_1, \\ K_1(t) = R_1(t) + D_1^{\mathrm T}(t)P_1(t)D_1(t) > 0, \;\;\;{\rm a.e.}\; t\in[0,T] \end{array} \right. $ (8)

admits a bounded symmetric $n\times n$ solution $P_1(\cdot)$, then the optimal control and the optimal cost of the finite-horizon stochastic LQ problem (6)-(7) are, respectively,

$ \begin{array}{l} \mathit{\boldsymbol{u}}_1^*\left( t \right) = - \mathit{\boldsymbol{K}}_1^{ - 1}\left( t \right){\mathit{\boldsymbol{L}}_{\rm{1}}}\left( t \right)\mathit{\boldsymbol{x}}\left( \mathit{t} \right), {\mathit{\boldsymbol{J}}^{\rm{T}}}\left( {\mathit{\boldsymbol{u}}_1^*\left( \cdot \right)} \right) = \\ {\mathit{\boldsymbol{y}}^{\rm{T}}}{\mathit{\boldsymbol{E}}^{\rm{T}}}{\mathit{\boldsymbol{P}}_1}\left( s \right)\mathit{\boldsymbol{Ey}}{\rm{, }} \end{array} $ (9)

where $L_1(t)=B_1^{\mathrm T}(t)P_1(t)E+D_1^{\mathrm T}(t)P_1(t)C(t)$.

Proof  The proof is by completing the square. Suppose (8) has a solution $P_1(\cdot)$ and set $V(t)=x^{\mathrm T}(t)E^{\mathrm T}P_1(t)Ex(t)$. Applying Itô's formula to $V(t)$ along (6) gives

$ \begin{array}{l} {\rm d}V(t) = {\rm d}\left(x^{\mathrm T}(t)E^{\mathrm T}P_1(t)Ex(t)\right) \\ = {\rm d}\left(x^{\mathrm T}(t)E^{\mathrm T}\right)P_1(t)Ex(t) + x^{\mathrm T}(t)\,{\rm d}\left(E^{\mathrm T}P_1(t)E\right)x(t) + x^{\mathrm T}(t)E^{\mathrm T}P_1(t)\,{\rm d}\left(Ex(t)\right) + {\rm d}\left(x^{\mathrm T}(t)E^{\mathrm T}\right)P_1(t)\,{\rm d}\left(Ex(t)\right) \\ = \{x^{\mathrm T}(t)[E^{\mathrm T}\dot P_1(t)E + A^{\mathrm T}(t)P_1(t)E + E^{\mathrm T}P_1(t)A(t) + C^{\mathrm T}(t)P_1(t)C(t)]x(t) + \\ \;\;\;\;2u_1^{\mathrm T}(t)[B_1^{\mathrm T}(t)P_1(t)E + D_1^{\mathrm T}(t)P_1(t)C(t)]x(t) + u_1^{\mathrm T}(t)D_1^{\mathrm T}(t)P_1(t)D_1(t)u_1(t)\}{\rm d}t + \{\cdots\}{\rm d}W(t). \end{array} $ (10)

Integrating (10) over $[s,T]$, taking expectations, and combining the result with (7) and (8) gives

$ \begin{array}{l} J^{\mathrm T}(u_1(\cdot)) = J^{\mathrm T}(u_1(\cdot)) + \mathbf{E}\left\{\int_s^T {\rm d}V(t) - V(t)\Big|_s^T\right\} \\ = \mathbf{E}\int_s^T\{u_1^{\mathrm T}(t)K_1(t)u_1(t) + 2u_1^{\mathrm T}(t)L_1(t)x(t) + x^{\mathrm T}(t)L_1^{\mathrm T}(t)K_1^{-1}(t)L_1(t)x(t)\}{\rm d}t + \\ \;\;\;\;\mathbf{E}\{x^{\mathrm T}(T)[H_1 - E^{\mathrm T}P_1(T)E]x(T)\} + y^{\mathrm T}E^{\mathrm T}P_1(s)Ey \\ = \mathbf{E}\int_s^T[u_1(t) + K_1^{-1}(t)L_1(t)x(t)]^{\mathrm T}K_1(t)[u_1(t) + K_1^{-1}(t)L_1(t)x(t)]{\rm d}t + y^{\mathrm T}E^{\mathrm T}P_1(s)Ey, \end{array} $ (11)

where $K_1(t)=R_1(t)+D_1^{\mathrm T}(t)P_1(t)D_1(t)$ as in (8), and the terminal term vanishes because $E^{\mathrm T}P_1(T)E=H_1$.

Since $K_1(t)>0$, the integrand in the last line of (11) is nonnegative and vanishes exactly when $u_1(t)=-K_1^{-1}(t)L_1(t)x(t)$. Hence the optimal feedback control and the optimal cost are, respectively,

$ u_1^*(t) = -K_1^{-1}(t)L_1(t)x(t), \;\; J^{\mathrm T}(u_1^*(\cdot)) = y^{\mathrm T}E^{\mathrm T}P_1(s)Ey. $

This completes the proof of Lemma 4.
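To make the role of the Riccati equation concrete, the following sketch integrates the scalar version of (8) backward from the terminal condition and returns the value $P_1(s)$ together with the feedback gain $K_1^{-1}L_1$ at the initial time. It assumes the degenerate case $n=m_1=1$ with $E=1$ and constant coefficients, so it illustrates the structure of the equation rather than the genuinely singular case; the function names, the RK4 scheme, and the step count are illustrative choices, not taken from the paper.

```python
import math

def riccati_rhs(p, a, b, c, d, q, r):
    """Right-hand side of the scalar form of (8) with E = 1."""
    k = r + d * d * p      # K_1 = R_1 + D_1^T P_1 D_1, must stay positive
    l = b * p + d * p * c  # L_1 = B_1^T P_1 E + D_1^T P_1 C
    return -(2.0 * a * p + c * c * p + q) + l * l / k

def solve_riccati_backward(a, b, c, d, q, r, h, T, steps=1000):
    """Integrate from p(T) = h back to t = 0 with classical RK4; return (p(0), gain at 0)."""
    dt = T / steps
    p = h
    for _ in range(steps):
        # One RK4 step in the negative time direction (step size -dt).
        k1 = riccati_rhs(p, a, b, c, d, q, r)
        k2 = riccati_rhs(p - 0.5 * dt * k1, a, b, c, d, q, r)
        k3 = riccati_rhs(p - 0.5 * dt * k2, a, b, c, d, q, r)
        k4 = riccati_rhs(p - dt * k3, a, b, c, d, q, r)
        p -= dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    gain = (b * p + d * p * c) / (r + d * d * p)  # u_1^* = -gain * x
    return p, gain

# Noise-free check case: p' = p^2 - 1 with p(T) = 0 has solution p(t) = tanh(T - t).
p0, gain0 = solve_riccati_backward(a=0.0, b=1.0, c=0.0, d=0.0, q=1.0, r=1.0, h=0.0, T=1.0)
print(p0, math.tanh(1.0))
```

In the scalar case the optimal cost in (9) is simply $p(s)\,y^2$, so the value returned for the initial time directly gives the cost for a unit initial state.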

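The following theorem gives the main result for the finite-horizon $N$-player Nash differential game.

Theorem 1  If the following generalized coupled differential Riccati equations (12) admit bounded symmetric $n\times n$ solutions $P_i(\cdot)$, $i=1,\dots,N$,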

$ \left\{ \begin{array}{l} {\mathit{\boldsymbol{E}}^{\rm{T}}}{{\mathit{\boldsymbol{\dot P}}}_\mathit{i}}\left( t \right)\mathit{\boldsymbol{E = }}{\rm{ - (}}\mathit{\boldsymbol{\bar A}}_{ - i}^{\rm{T}}\left( t \right){\mathit{\boldsymbol{P}}_i}\left( t \right)\mathit{\boldsymbol{E + }}{\mathit{\boldsymbol{E}}^{\rm{T}}}{\mathit{\boldsymbol{P}}_\mathit{i}}\left( t \right){{\mathit{\boldsymbol{\bar A}}}_{{\rm{ - }}\mathit{i}}}\left( t \right) + \\ \;\;\;\mathit{\boldsymbol{\bar C}}_{ - i}^{\rm{T}}\left( t \right){\mathit{\boldsymbol{P}}_i}\left( t \right){{\mathit{\boldsymbol{\bar C}}}_{{\rm{ - }}\mathit{i}}}\left( t \right) + {\mathit{\boldsymbol{Q}}_i}\left( t \right)) + \\ \;\;\;\mathit{\boldsymbol{L}}_i^{\rm{T}}\left( t \right)\mathit{\boldsymbol{K}}_i^{ - 1}\left( t \right){\mathit{\boldsymbol{L}}_i}\left( t \right), \\ {\mathit{\boldsymbol{E}}^{\rm{T}}}{\mathit{\boldsymbol{P}}_i}\left( T \right)\mathit{\boldsymbol{E = }}{\mathit{\boldsymbol{H}}_\mathit{i}}, \\ {\mathit{\boldsymbol{K}}_i}\left( t \right) = {\mathit{\boldsymbol{R}}_i}\left( t \right) + \mathit{\boldsymbol{D}}_i^{\rm{T}}\left( t \right){\mathit{\boldsymbol{P}}_\mathit{i}}\left( t \right){\mathit{\boldsymbol{D}}_i}\left( t \right) > 0, \\ \;\;\;{\rm{a}}.{\rm{e}}.\;\forall \mathit{t} \in {\rm{[0, }}\mathit{T}{\rm{], }} \end{array} \right. $ (12)

where $\bar A_{-i}(t)=A(t)-\sum\limits_{j=1,j\ne i}^N B_j(t)K_j^{-1}(t)L_j(t)$, $\bar C_{-i}(t)=C(t)-\sum\limits_{j=1,j\ne i}^N D_j(t)K_j^{-1}(t)L_j(t)$, and $L_i(t)=B_i^{\mathrm T}(t)P_i(t)E+D_i^{\mathrm T}(t)P_i(t)\bar C_{-i}(t)$, $i=1,\dots,N$, then the $N$-tuple $(u_1^*(t),\dots,u_N^*(t))$ with $u_i^*(t)=-K_i^{-1}(t)L_i(t)x(t)$, $t\in[0,T]$, is a Nash equilibrium of the Nash differential game (3)-(5), and the optimal cost is

$ \mathit{J}_i^{\rm{T}}{\rm{(}}\mathit{\boldsymbol{u}}_1^*\left( \cdot \right), \cdots, \mathit{\boldsymbol{u}}_N^*\left( \cdot \right)) = {\mathit{\boldsymbol{y}}^{\rm{T}}}{\mathit{\boldsymbol{E}}^{\rm{T}}}{\mathit{\boldsymbol{P}}_\mathit{i}}\left( s \right)\mathit{\boldsymbol{Ey}}{\rm{.}} $

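Proof  Fix a player $i$ and suppose every other player $j\ne i$ uses $u_j^*(t)=-K_j^{-1}(t)L_j(t)x(t)$. Substituting these strategies into (3) and (4) gives the following optimization problem (13)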

$ \begin{array}{*{20}{l}} {\mathop {\min }\limits_{{u_i}\left( \cdot \right) \in {\cal U_i}[s, T]} \varphi ({\mathit{u}_i}\left( \cdot \right)) = \mathit{\boldsymbol{E}}\{ \int_s^T {[{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( t \right)} {\mathit{\boldsymbol{Q}}_i}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) + }\\ \begin{array}{l} \mathit{\boldsymbol{u}}_i^{\rm{T}}\left( t \right){\mathit{\boldsymbol{R}}_i}\left( t \right){\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}t + {\mathit{\boldsymbol{x}}^{\rm{T}}}\left( T \right){\mathit{\boldsymbol{H}}_i}x\left( T \right)\}, \\ \begin{array}{*{20}{l}} {{\rm{s}}.{\rm{t}}.}\\ {\left\{ {\begin{array}{*{20}{l}} {\mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = [{{\mathit{\boldsymbol{\bar A}}}_{{\rm{-}}i}}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) + {\mathit{\boldsymbol{B}}_i}\left( t \right){\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}t{\rm{ + }}}\\ {\;\;\;[{{\mathit{\boldsymbol{\bar C}}}_{{\rm{-}}i}}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) + {\mathit{\boldsymbol{D}}_i}\left( t \right){\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}W\left( t \right), }\\ {\mathit{\boldsymbol{x}}\left( s \right) = \mathit{\boldsymbol{y}}.} \end{array}} \right.} \end{array} \end{array} \end{array} $ (13)

Note that $\varphi$ in the above optimization problem has exactly the same form as $J^{\mathrm T}(u_1(\cdot))$ in Lemma 4. Applying Lemma 4 with the substitutions $\bar A_{-i}\Rightarrow A$, $B_i\Rightarrow B_1$, $\bar C_{-i}\Rightarrow C$, $D_i\Rightarrow D_1$, $Q_i\Rightarrow Q_1$, $R_i\Rightarrow R_1$, $H_i\Rightarrow H_1$ gives

$ \begin{array}{l} \mathit{\boldsymbol{u}}_1^*\left( t \right) = - \mathit{\boldsymbol{K}}_1^{ - 1}\left( t \right){\mathit{\boldsymbol{L}}_1}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) \Rightarrow \mathit{\boldsymbol{u}}_i^*\left( t \right) = \\ - \mathit{\boldsymbol{K}}_i^{ - 1}\left( t \right){\mathit{\boldsymbol{L}}_i}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right), \end{array} $

and the optimal cost equals $y^{\mathrm T}E^{\mathrm T}P_i(s)Ey$.

This completes the proof of Theorem 1.

3 Infinite-horizon Nash games

3.1 Problem formulation

We now discuss the $N$-player Nash differential game on the infinite horizon $[0,\infty)$. Let

$ \begin{array}{l} \;\;\;\;\;\mathit{J}_i^\infty \left( {{\mathit{\boldsymbol{u}}_1}\left( \cdot \right), \cdots, {\mathit{\boldsymbol{u}}_\mathit{N}}\left( \cdot \right)} \right) = \\ \mathit{\boldsymbol{E}}\{ \int_0^\infty {[{\mathit{\boldsymbol{x}}^{\rm{T}}}\left( t \right)} {\mathit{\boldsymbol{Q}}_i}\left( t \right)\mathit{\boldsymbol{x}}\left( t \right) + \mathit{\boldsymbol{u}}_i^{\rm{T}}\left( t \right){\mathit{\boldsymbol{R}}_i}\left( t \right){\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}\mathit{t}\}, \end{array} $ (14)

where $i=1,\dots,N$ and $x(\cdot)$ is the solution of the state equation

$ \left\{ \begin{array}{l} \mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = [\mathit{\boldsymbol{Ax}}\left( t \right) + \sum\limits_{i = 1}^N {{\mathit{\boldsymbol{B}}_i}{\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}\mathit{t}} + \\ \;\;\;\;[\mathit{\boldsymbol{Cx}}\left( t \right) + \sum\limits_{i = 1}^N {{\mathit{\boldsymbol{D}}_i}{\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}\mathit{W}\left( t \right)}, \\ \mathit{\boldsymbol{x}}\left( 0 \right) = {\mathit{\boldsymbol{x}}_0}. \end{array} \right. $ (15)

Here $E$ is a given $n\times n$ constant matrix with $\mathrm{rank}(E)=r\le n$; $A$, $B_i$, $C$, $D_i$, $Q_i$, $R_i$, $i=1,\dots,N$, are constant matrices of appropriate dimensions, and $W(\cdot)$ is a standard Brownian motion on the complete probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge 0},P)$. Define

$L_{\mathcal F}^2(\mathbf{R}^{m_i})=\{u_i(\cdot):[0,\infty)\to\mathbf{R}^{m_i}\mid u_i(\cdot)$ is $\mathcal{F}_t$-adapted and $\mathbf{E}\int_0^\infty\|u_i(t)\|^2\,\mathrm{d}t<\infty\}$, $i=1,\dots,N$,

$\mathcal{U}[0,\infty)=\{(u_1(\cdot),\cdots,u_N(\cdot))\in L_{\mathcal F}^2(\mathbf{R}^{m_1})\times\cdots\times L_{\mathcal F}^2(\mathbf{R}^{m_N})\mid$ the solution of the state equation (15) corresponding to $(u_1(\cdot),\dots,u_N(\cdot))$ satisfies $x(\cdot)\in L_{\mathcal F}^2(\mathbf{R}^n)\}$.

The problem is to find a Nash equilibrium $(u_1^*(\cdot),\cdots,u_N^*(\cdot))\in\mathcal{U}[0,\infty)$ such that

$ \begin{array}{l} \;\;\;\;\;\mathit{J}_i^\infty \left( {\mathit{\boldsymbol{u}}_1^*\left( \cdot \right), \cdots, \mathit{u}_\mathit{N}^*\left( \cdot \right)} \right) \le \mathit{\boldsymbol{J}}_i^\infty {\rm{(}}\mathit{\boldsymbol{u}}_1^*\left( \cdot \right), \cdots, \\ \mathit{\boldsymbol{u}}_{\mathit{i- }{\rm{1}}}^*\left( \cdot \right), {\mathit{\boldsymbol{u}}_i}\left( \cdot \right), \mathit{\boldsymbol{u}}_{i + 1}^*\left( \cdot \right), \cdots, \mathit{\boldsymbol{u}}_\mathit{N}^*\left( \cdot \right)), \\ \forall {\mathit{\boldsymbol{u}}_i}\left( \cdot \right) \in {\cal U}[0, \infty ). \end{array} $ (16)

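Because the problem is posed on an infinite horizon, a notion of mean-square stabilizability is needed.

Definition 1[9]  The stochastic controlled system described by the Itô equation $E\mathrm{d}x=(Ax+Bu)\mathrm{d}t+(Cx+Du)\mathrm{d}W$, $x(0)=x_0$, is said to be mean-square stabilizable if, for every initial value $x_0$, there exists a feedback control $u=Kx$, where $K$ is a constant matrix of appropriate dimensions, such that the closed-loop system $E\mathrm{d}x=(A+BK)x\,\mathrm{d}t+(C+DK)x\,\mathrm{d}W$, $x(0)=x_0$, is asymptotically mean-square stable, i.e. $\lim\limits_{t\to\infty}\mathbf{E}\|x(t)\|^2=0$.

Assumption 1  System (15) is mean-square stabilizable.

3.2 Main results

The main result for the infinite-horizon case is stated in Theorem 2 below (the proof is similar to that of Theorem 1 and is omitted).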

Theorem 2  Suppose Assumption 1 holds. If the following coupled algebraic Riccati equations (17) admit symmetric $n\times n$ solutions $P_i$, $i=1,\dots,N$,

$ \left\{ \begin{array}{l} \bar A_{-i}^{\mathrm T}P_iE + E^{\mathrm T}P_i\bar A_{-i} + \bar C_{-i}^{\mathrm T}P_i\bar C_{-i} + Q_i - L_i^{\mathrm T}K_i^{-1}L_i = 0, \\ K_i = R_i + D_i^{\mathrm T}P_iD_i > 0, \end{array} \right. $ (17)

where $\bar A_{-i}=A-\sum\limits_{j=1,j\ne i}^N B_jK_j^{-1}L_j$, $\bar C_{-i}=C-\sum\limits_{j=1,j\ne i}^N D_jK_j^{-1}L_j$, and $L_i=B_i^{\mathrm T}P_iE+D_i^{\mathrm T}P_i\bar C_{-i}$, $i=1,\dots,N$, then the $N$-tuple $(u_1^*(t),\dots,u_N^*(t))$ with $u_i^*(t)=-K_i^{-1}L_ix(t)$ is a Nash equilibrium of the Nash differential game (14)-(16), and the optimal cost is $J_i^\infty(u_1^*(\cdot),\cdots,u_N^*(\cdot))=x_0^{\mathrm T}E^{\mathrm T}P_iEx_0$.
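Because each $L_i$ depends on the other players' gains through $\bar C_{-i}$, the equations in Theorem 2 are implicitly coupled, and it is convenient to have a way of checking a candidate solution. The sketch below is such a checker, not a solver: given candidate matrices $P_i$ and $L_i$ it returns the largest residual of (17) and of the defining relation for $L_i$. It reads the closed-loop matrices as $\bar A_{-i}=A-\sum_{j\ne i}B_jK_j^{-1}L_j$ and $\bar C_{-i}=C-\sum_{j\ne i}D_jK_j^{-1}L_j$, which is what substituting $u_j^*=-K_j^{-1}L_jx$ into (15) produces. The function name and the scalar two-player test data are assumptions made for illustration.

```python
import numpy as np

def coupled_are_residual(E, A, C, B, D, Q, R, P, L):
    """Largest residual norm of (17) and of L_i = B_i^T P_i E + D_i^T P_i C_bar_{-i}."""
    N = len(B)
    K = [R[i] + D[i].T @ P[i] @ D[i] for i in range(N)]
    G = [np.linalg.solve(K[i], L[i]) for i in range(N)]  # feedback gains K_i^{-1} L_i
    worst = 0.0
    for i in range(N):
        # Closed-loop drift and diffusion matrices seen by player i.
        A_bar = A - sum(B[j] @ G[j] for j in range(N) if j != i)
        C_bar = C - sum(D[j] @ G[j] for j in range(N) if j != i)
        L_def = B[i].T @ P[i] @ E + D[i].T @ P[i] @ C_bar
        are = (A_bar.T @ P[i] @ E + E.T @ P[i] @ A_bar
               + C_bar.T @ P[i] @ C_bar + Q[i] - L[i].T @ G[i])
        worst = max(worst, float(np.linalg.norm(are)), float(np.linalg.norm(L[i] - L_def)))
    return worst

# Hypothetical symmetric two-player scalar game without noise (E = 1, C = D_i = 0):
# equation (17) reduces to 2(a - b^2 p)p + q - b^2 p^2 = 0, solved by p = 1
# when a = -1, b = 1, q = 5.
one, zero = np.array([[1.0]]), np.array([[0.0]])
E, A, C = one, -one, zero
B, D = [one, one], [zero, zero]
Q, R = [5.0 * one, 5.0 * one], [one, one]

res_exact = coupled_are_residual(E, A, C, B, D, Q, R, P=[one, one], L=[one, one])
res_wrong = coupled_are_residual(E, A, C, B, D, Q, R, P=[2.0 * one, 2.0 * one], L=[one, one])
print(res_exact, res_wrong)
```

A checker of this kind can be wrapped in a root-finding or fixed-point iteration, but convergence of such iterations is not guaranteed in general and is not addressed here.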

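4 Application of Nash games to stochastic $H_2/H_\infty$ control

In recent years, $H_2/H_\infty$ control problems for deterministic and stochastic systems have been studied extensively and applied successfully in many fields. Among the methods for handling $H_2/H_\infty$ control problems, the Nash game approach has proved effective. Limebeer et al.[17] used the Nash game approach to study mixed $H_2/H_\infty$ control of linear systems and showed that a necessary and sufficient condition for the existence of a solution is that the corresponding coupled Riccati equations have a solution. Chen and Zhang[18] used the completion-of-squares method to extend this result to stochastic $H_2/H_\infty$ control of linear Itô systems and showed that a necessary and sufficient condition for the existence of a solution is that four cross-coupled Hamilton-Jacobi equations have a solution. A main contribution of that paper is to view stochastic $H_2/H_\infty$ control as a two-player nonzero-sum Nash game, so that the stochastic $H_2/H_\infty$ control strategy is obtained by solving for the Nash equilibrium $(u^*, v^*)$. For $H_2/H_\infty$ control of deterministic singular linear systems, the most recent results can be found in [19].

Building on this earlier work, this section uses the Nash game results obtained above to study stochastic $H_2/H_\infty$ control of singular linear systems with multiple decision makers.

To simplify notation, only the infinite-horizon stochastic $H_2/H_\infty$ control problem is discussed; the finite-horizon analysis is similar.

Consider the following controlled system: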

$ \left\{ \begin{array}{l} \mathit{\boldsymbol{E}}{\rm{d}}\mathit{\boldsymbol{x}}\left( t \right) = [\mathit{\boldsymbol{Ax}}\left( t \right) + {\mathit{\boldsymbol{B}}_0}\mathit{\boldsymbol{v}}\left( t \right) + \\ \;\;\;\;\sum\limits_{i = 1}^N {{\mathit{\boldsymbol{B}}_i}{\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}\mathit{t}} + \;[\mathit{\boldsymbol{Cx}}\left( t \right) + {\mathit{\boldsymbol{D}}_0}\mathit{\boldsymbol{v}}\left( {\bf{t}} \right) + \\ \;\;\;\;\sum\limits_{{\bf{i}} = 1}^N {{\mathit{\boldsymbol{D}}_i}{\mathit{\boldsymbol{u}}_i}\left( t \right)]{\rm{d}}\mathit{W}\left( t \right)} ,\\ \mathit{\boldsymbol{x}}\left( 0 \right) = {\mathit{\boldsymbol{x}}_0}. \end{array} \right. $ (18)

The controlled output is the vector

$ z\left( t \right) = \left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{Mx}}\left( t \right)}\\ {\begin{array}{*{20}{l}} {{\mathit{\boldsymbol{G}}_1}{\mathit{\boldsymbol{u}}_1}\left( t \right)}\\ {\;\;\;\; \vdots } \end{array}}\\ {{\mathit{\boldsymbol{G}}_N}{\mathit{\boldsymbol{u}}_N}\left( t \right)} \end{array}} \right],\mathit{\boldsymbol{G}}_i^{\rm{T}}{\mathit{\boldsymbol{G}}_i} = {\mathit{\boldsymbol{I}}_{{m_i}}}. $ (19)

式(18) 中,x(·)∈Rn为系统状态,ui(·)∈Rmi为第i个控制输入,v(·)∈Rnv表示外界干扰,W(·)∈R是一维标准布朗运动,系数矩阵ACB0D0MBiDiGi(i=1, …, N)是具有适当维数的常数矩阵.性能指标定义为

$ J_0(\boldsymbol{u}_1(\cdot), \cdots, \boldsymbol{u}_N(\cdot), \boldsymbol{v}(\cdot)) = \boldsymbol{E}\left\{ \int_0^\infty \left[ \left\| \boldsymbol{z}(t) \right\|^2 - \gamma^2\left\| \boldsymbol{v}(t) \right\|^2 \right]{\rm d}t \right\}, $ (20)
$ J_i(\boldsymbol{u}_1(\cdot), \cdots, \boldsymbol{u}_N(\cdot), \boldsymbol{v}(\cdot)) = \boldsymbol{E}\left\{ \int_0^\infty \left[ \boldsymbol{x}^{\rm T}(t)\boldsymbol{Q}\boldsymbol{x}(t) + \left\| \boldsymbol{u}_i(t) \right\|^2 \right]{\rm d}t \right\}, $ (21)

where $ i = 1, \cdots, N $ and $ \boldsymbol{Q} = \boldsymbol{M}^{\rm T}\boldsymbol{M} $.
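As a quick check of how the output (19) enters these indices, and assuming only the normalization $ \boldsymbol{G}_i^{\rm T}\boldsymbol{G}_i = \boldsymbol{I}_{m_i} $ stated in (19), the output energy splits into a state term and the individual control energies:

$ \left\| \boldsymbol{z}(t) \right\|^2 = \boldsymbol{x}^{\rm T}(t)\boldsymbol{M}^{\rm T}\boldsymbol{M}\boldsymbol{x}(t) + \sum\limits_{i = 1}^N \boldsymbol{u}_i^{\rm T}(t)\boldsymbol{G}_i^{\rm T}\boldsymbol{G}_i\boldsymbol{u}_i(t) = \boldsymbol{x}^{\rm T}(t)\boldsymbol{Q}\boldsymbol{x}(t) + \sum\limits_{i = 1}^N \left\| \boldsymbol{u}_i(t) \right\|^2. $

Read this way, $J_0$ in (20) accounts for the energy of every control input, whereas $J_i$ in (21) retains only the state term and the energy of the $i$-th control.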

The infinite-horizon stochastic $H_2/H_\infty$ control problem is defined as follows:

Definition 2[20]  Given a disturbance attenuation level $ \gamma > 0 $, find controls $ \boldsymbol{u}_i^*(t) \in L_{\cal F}^2({\bf R}^{m_i}) $ and $ \boldsymbol{v}^*(t) \in L_{\cal F}^2({\bf R}^{n_v}) $ such that $ \boldsymbol{u}_i(t) = \boldsymbol{u}_i^*(t) = \boldsymbol{K}_i\boldsymbol{x}(t) $ and $ \boldsymbol{v}(t) = \boldsymbol{v}^*(t) = \boldsymbol{F}\boldsymbol{x}(t) $, $ i = 1, \cdots, N $, satisfy the following conditions:

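(1) $ \boldsymbol{u}_i^*(t) $ renders system (18) asymptotically mean-square admissible; that is, when $ \boldsymbol{v}(t) = 0 $ and $ \boldsymbol{u}_i(t) = \boldsymbol{u}_i^*(t) $, the closed-loop system corresponding to (18) is regular, impulse-free and asymptotically mean-square stable;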

(2) For every nonzero $ \boldsymbol{v}(t) \in L_{\cal F}^2({\bf R}^{n_v}) $, the state of the closed-loop system (18) with initial state $ \boldsymbol{x}_0 = 0 $ satisfies

$ \boldsymbol{E}\int_0^\infty \left\| \boldsymbol{z}(t) \right\|^2{\rm d}t = \boldsymbol{E}\int_0^\infty \Big[ \boldsymbol{x}^{\rm T}(t)\boldsymbol{Q}\boldsymbol{x}(t) + \sum\limits_{i = 1}^N \boldsymbol{u}_i^{\rm T}(t)\boldsymbol{u}_i(t) \Big]{\rm d}t \le \gamma^2\boldsymbol{E}\int_0^\infty \left\| \boldsymbol{v}(t) \right\|^2{\rm d}t. $

(3) When the worst-case external disturbance $ \boldsymbol{v}^*(t) \in L_{\cal F}^2({\bf R}^{n_v}) $ exists and is substituted into system (18), $ \boldsymbol{u}_i^*(t) $ minimizes the output energy

$ J_i(\boldsymbol{u}_1(\cdot), \cdots, \boldsymbol{u}_N(\cdot), \boldsymbol{v}(\cdot)) = \boldsymbol{E}\left\{ \int_0^\infty \left[ \boldsymbol{x}^{\rm T}(t)\boldsymbol{Q}\boldsymbol{x}(t) + \left\| \boldsymbol{u}_i(t) \right\|^2 \right]{\rm d}t \right\}. $

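If the $i$-th control input $ \boldsymbol{u}_i(t) $ is regarded as the strategy of player $i$ and the external disturbance $ \boldsymbol{v}(t) $ as the strategy of a fictitious player, "nature", then the stochastic $H_2/H_\infty$ control problem above is equivalent to finding a Nash equilibrium $ (\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*) $ defined by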

$ J_0(\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*) \ge J_0(\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}),\quad \forall \boldsymbol{v} \in L_{\cal F}^2({\bf R}^{n_v}), $ (22)
$ J_i(\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*) \le J_i(\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_{i - 1}^*, \boldsymbol{u}_i, \boldsymbol{u}_{i + 1}^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*),\quad \forall \boldsymbol{u}_i \in L_{\cal F}^2({\bf R}^{m_i}). $ (23)

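Inequality (22) concerns the $H_\infty$ performance and inequality (23) the $H_2$ performance. If the tuple $ (\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*) $ of Definition 2 exists, the stochastic $H_2/H_\infty$ control problem is said to admit the solution $ (\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}^*) $, where the $ \boldsymbol{u}_i^*(t) $ are the stochastic $H_2/H_\infty$ controllers sought in this paper and $ \boldsymbol{v}^*(t) $ is the worst-case disturbance.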
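Condition (2) of Definition 2 can be restated directly in terms of $J_0$. The following is only a rewriting of (20) under the assumptions $ \boldsymbol{x}_0 = 0 $ and $ \boldsymbol{u}_i = \boldsymbol{u}_i^* $, with $ \mathbb{E} $ written for the expectation to keep it apart from the matrix $ \boldsymbol{E} $:

$ \mathbb{E}\int_0^\infty \left\| \boldsymbol{z}(t) \right\|^2{\rm d}t \le \gamma^2\,\mathbb{E}\int_0^\infty \left\| \boldsymbol{v}(t) \right\|^2{\rm d}t \;\Longleftrightarrow\; J_0(\boldsymbol{u}_1^*, \cdots, \boldsymbol{u}_N^*, \boldsymbol{v}) \le 0,\quad \forall \boldsymbol{v} \in L_{\cal F}^2({\bf R}^{n_v}). $

Equivalently, for every nonzero $ \boldsymbol{v} $ the ratio $ \big( \mathbb{E}\int_0^\infty \| \boldsymbol{z} \|^2{\rm d}t \big)^{1/2} \big/ \big( \mathbb{E}\int_0^\infty \| \boldsymbol{v} \|^2{\rm d}t \big)^{1/2} $ does not exceed $ \gamma $, so the closed loop attenuates disturbance energy to at most the level $ \gamma $.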

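The following theorem extends the result of [19] to the stochastic case in which the noise depends on $ (\boldsymbol{x}, \boldsymbol{u}, \boldsymbol{v}) $ and there are multiple decision makers.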

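Theorem 3  For system (18), suppose that the following coupled algebraic Riccati equations admit solutions $ \boldsymbol{P}_i > 0 $, $ i = 0, 1, \cdots, N $, together with $ \boldsymbol{K}_i $ and $ \boldsymbol{F} $: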

$ \boldsymbol{E}^{\rm T}\boldsymbol{P}_i\bar{\boldsymbol{A}}_{-i} + \bar{\boldsymbol{A}}_{-i}^{\rm T}\boldsymbol{P}_i\boldsymbol{E} + \bar{\boldsymbol{C}}_{-i}^{\rm T}\boldsymbol{P}_i\bar{\boldsymbol{C}}_{-i} + \boldsymbol{Q} - \boldsymbol{L}_i^{\rm T}\boldsymbol{R}_i^{-1}\boldsymbol{L}_i = 0, $ (24)
$ \boldsymbol{E}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{A}} + \hat{\boldsymbol{A}}^{\rm T}\boldsymbol{P}_0\boldsymbol{E} + \hat{\boldsymbol{C}}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{C}} + \hat{\boldsymbol{L}}_0^{\rm T}\boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0 + \hat{\boldsymbol{Q}} = 0, $ (25)

where

$ \begin{array}{l} i = 1, \cdots, N,\quad \bar{\boldsymbol{A}}_{-i} = \boldsymbol{A} + \boldsymbol{B}_0\boldsymbol{F} + \sum\limits_{j = 1, j \ne i}^N \boldsymbol{B}_j\boldsymbol{K}_j,\quad \bar{\boldsymbol{C}}_{-i} = \boldsymbol{C} + \boldsymbol{D}_0\boldsymbol{F} + \sum\limits_{j = 1, j \ne i}^N \boldsymbol{D}_j\boldsymbol{K}_j,\\ \boldsymbol{R}_i = \boldsymbol{I}_{m_i} + \boldsymbol{D}_i^{\rm T}\boldsymbol{P}_i\boldsymbol{D}_i,\quad \boldsymbol{L}_i = \boldsymbol{B}_i^{\rm T}\boldsymbol{P}_i\boldsymbol{E} + \boldsymbol{D}_i^{\rm T}\boldsymbol{P}_i\bar{\boldsymbol{C}}_{-i},\\ \hat{\boldsymbol{A}} = \boldsymbol{A} + \sum\limits_{i = 1}^N \boldsymbol{B}_i\boldsymbol{K}_i,\quad \hat{\boldsymbol{C}} = \boldsymbol{C} + \sum\limits_{i = 1}^N \boldsymbol{D}_i\boldsymbol{K}_i,\quad \boldsymbol{R}_0 = \gamma^2\boldsymbol{I}_{n_v} - \boldsymbol{D}_0^{\rm T}\boldsymbol{P}_0\boldsymbol{D}_0,\\ \hat{\boldsymbol{L}}_0 = \boldsymbol{B}_0^{\rm T}\boldsymbol{P}_0\boldsymbol{E} + \boldsymbol{D}_0^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{C}},\quad \hat{\boldsymbol{Q}} = \boldsymbol{Q} + \sum\limits_{i = 1}^N \boldsymbol{K}_i^{\rm T}\boldsymbol{K}_i, \end{array} $

then the stochastic $H_2/H_\infty$ control problem admits a solution of the form

$ \begin{array}{l} \boldsymbol{u}_i(t) = \boldsymbol{u}_i^*(t) = \boldsymbol{K}_i\boldsymbol{x}(t) = -\boldsymbol{R}_i^{-1}\boldsymbol{L}_i\boldsymbol{x}(t),\\ \boldsymbol{v}(t) = \boldsymbol{v}^*(t) = \boldsymbol{F}\boldsymbol{x}(t) = \boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0\boldsymbol{x}(t). \end{array} $ (26)

Proof: Theorem 3 follows readily from Theorem 2 on the infinite-horizon Nash differential game in the previous section; the detailed proof is omitted here.
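Although the proof is omitted, the structure of (25) and of the disturbance gain can be recovered by a completion-of-squares calculation. What follows is a formal sketch rather than the authors' proof. It assumes that $ \boldsymbol{u}_i = \boldsymbol{K}_i\boldsymbol{x} $ is applied, that Itô's formula may be used on $ V(\boldsymbol{x}) = \boldsymbol{x}^{\rm T}\boldsymbol{E}^{\rm T}\boldsymbol{P}_0\boldsymbol{E}\boldsymbol{x} $, and that $ \mathbb{E}[V(\boldsymbol{x}(T))] \to 0 $ as $ T \to \infty $; $ \mathbb{E} $ denotes the expectation, to keep it apart from the matrix $ \boldsymbol{E} $.

$ \begin{array}{l} \boldsymbol{E}\,{\rm d}\boldsymbol{x} = (\hat{\boldsymbol{A}}\boldsymbol{x} + \boldsymbol{B}_0\boldsymbol{v}){\rm d}t + (\hat{\boldsymbol{C}}\boldsymbol{x} + \boldsymbol{D}_0\boldsymbol{v}){\rm d}W,\qquad \left\| \boldsymbol{z} \right\|^2 = \boldsymbol{x}^{\rm T}\hat{\boldsymbol{Q}}\boldsymbol{x},\\ \dfrac{{\rm d}}{{\rm d}t}\mathbb{E}[V] = \mathbb{E}\left[ \boldsymbol{x}^{\rm T}(\boldsymbol{E}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{A}} + \hat{\boldsymbol{A}}^{\rm T}\boldsymbol{P}_0\boldsymbol{E} + \hat{\boldsymbol{C}}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{C}})\boldsymbol{x} + 2\boldsymbol{v}^{\rm T}\hat{\boldsymbol{L}}_0\boldsymbol{x} + \boldsymbol{v}^{\rm T}\boldsymbol{D}_0^{\rm T}\boldsymbol{P}_0\boldsymbol{D}_0\boldsymbol{v} \right],\\ \mathbb{E}\left[ \left\| \boldsymbol{z} \right\|^2 - \gamma^2\left\| \boldsymbol{v} \right\|^2 \right] + \dfrac{{\rm d}}{{\rm d}t}\mathbb{E}[V] = \mathbb{E}\left[ \boldsymbol{x}^{\rm T}(\boldsymbol{E}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{A}} + \hat{\boldsymbol{A}}^{\rm T}\boldsymbol{P}_0\boldsymbol{E} + \hat{\boldsymbol{C}}^{\rm T}\boldsymbol{P}_0\hat{\boldsymbol{C}} + \hat{\boldsymbol{Q}} + \hat{\boldsymbol{L}}_0^{\rm T}\boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0)\boldsymbol{x} \right]\\ \qquad - \mathbb{E}\left[ (\boldsymbol{v} - \boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0\boldsymbol{x})^{\rm T}\boldsymbol{R}_0(\boldsymbol{v} - \boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0\boldsymbol{x}) \right]. \end{array} $

Under (25) the first bracket on the right vanishes, so integrating over $ [0, \infty) $ gives $ J_0 = \boldsymbol{x}_0^{\rm T}\boldsymbol{E}^{\rm T}\boldsymbol{P}_0\boldsymbol{E}\boldsymbol{x}_0 - \mathbb{E}\int_0^\infty (\boldsymbol{v} - \boldsymbol{v}^*)^{\rm T}\boldsymbol{R}_0(\boldsymbol{v} - \boldsymbol{v}^*){\rm d}t $ with $ \boldsymbol{v}^* = \boldsymbol{R}_0^{-1}\hat{\boldsymbol{L}}_0\boldsymbol{x} $. If in addition $ \boldsymbol{R}_0 > 0 $, setting $ \boldsymbol{x}_0 = 0 $ yields $ J_0 \le 0 $, which is condition (2) of Definition 2. The same computation with $ V(\boldsymbol{x}) = \boldsymbol{x}^{\rm T}\boldsymbol{E}^{\rm T}\boldsymbol{P}_i\boldsymbol{E}\boldsymbol{x} $ and the index $J_i$ leads to the form of (24) and to the gain $ -\boldsymbol{R}_i^{-1}\boldsymbol{L}_i $.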

Remark 1: Riccati equations of the form (24), (25) can be solved with the LMI-based semidefinite programming method proposed in [20].

5 Conclusions

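This paper studied the linear-quadratic multi-player Nash differential game for singular stochastic systems whose noise depends on both the state and the controls. After introducing the stability concepts for singular stochastic systems, a stability condition was first given by means of an LMI. Results on the stochastic LQ problem were then used to study the finite-horizon Nash differential game, and existence conditions for a finite-horizon Nash equilibrium were obtained through a set of coupled differential Riccati equations. Next, the finite-horizon game was extended to the infinite-horizon case, where the existence of a Nash equilibrium was shown to be equivalent to the solvability of a set of coupled algebraic Riccati equations. Finally, the results were applied to the stochastic $H_2/H_\infty$ control problem of modern robust control, yielding existence conditions and explicit expressions for the robust control strategies. These results add to the theory of differential games.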

References
[1] XU H, MIZUKAMI K. Two-person two-criteria decision making problems for descriptor systems[J]. Journal of Optimization Theory and Applications, 1993, 78(1): 163-173. DOI: 10.1007/BF00940706.
[2] XU H, MIZUKAMI K. New sufficient conditions for linear feedback closed-loop stackelberg strategy of descriptor system[J]. IEEE Transactions on Automatic Control, 1994, 39(5): 1097-1102. DOI: 10.1109/9.284902.
[3] XU H, MIZUKAMI K. Linear-quadratic zero-sum differential games for generalized state space systems[J]. IEEE Transactions on Automatic Control, 1994, 39(1): 143-147. DOI: 10.1109/9.273352.
[4] XU H, MIZUKAMI K. The linear quadratic dynamic game for discrete-time descriptor systems[J]. International Game Theory Review, 2003, 5(4): 361-374. DOI: 10.1142/S0219198903001100.
[5] ENGWERDA J C, SALMAH Y. The open-loop linear quadratic differential game for index one descriptor systems[J]. Automatica, 2009, 45(2): 585-592. DOI: 10.1016/j.automatica.2008.09.012.
[6] ENGWERDA J C, SALMAH Y. Feedback Nash equilibria for linear quadratic descriptor differential games[J]. Automatica, 2012, 48(4): 625-631. DOI: 10.1016/j.automatica.2012.01.004.
[7] ZHANG C K. Wavelet approximation method for linear-quadratic differential saddle-point game in singular systems[J]. Systems Engineering and Electronics, 2003, 25(6): 707-711. (in Chinese)
[8] MUKAIDANI H. Efficient numerical procedures for solving closed-loop Stackelberg strategies with small singular perturbation parameter[J]. Applied Mathematics and Computation, 2007, 188(2): 1173-1183. DOI: 10.1016/j.amc.2006.10.068.
[9] MUKAIDANI H. Soft-constrained stochastic Nash games for weakly coupled large-scale systems[J]. Automatica, 2009, 45(5): 1272-1279. DOI: 10.1016/j.automatica.2008.12.020.
[10] MUKAIDANI H, XU H. Pareto optimal strategy for stochastic weakly coupled large scale systems with state dependent system noise[J]. IEEE Transactions on Automatic Control, 2009, 54(9): 2244-2250. DOI: 10.1109/TAC.2009.2026854.
[11] ZHOU H, ZHU H, ZHANG C. Linear Quadratic Nash Differential Games of Stochastic Singular Systems[J]. Journal of Systems Science and Information, 2014, 2(6): 553-560.
[12] ZHOU H Y, ZHANG C K, ZHU H N. Finite-time nonzero-sum games for stochastic singular systems[J]. Journal of Guangdong University of Technology, 2014, 31(2): 32-35. (in Chinese)
[13] QIAN L, GAJIC Z. Variance minimization stochastic power control in CDMA systems[J]. IEEE Transactions on Wireless Communications, 2006, 5(1): 193-202. DOI: 10.1109/TWC.2006.1576543.
[14] ØKSENDAL B. Stochastic differential equations: an introduction with applications[M]. 5th ed. New York: Springer, 1998: 236-239.
[15] ZHANG W, ZHAO Y, SHENG L. Some remarks on stability of stochastic singular systems with state-dependent noise[J]. Automatica, 2015, 51(1): 273-277.
[16] XU S, VAN DOOREN P, STEFAN R, et al. Robust stability and stabilization for singular systems with state delay and parameter uncertainty[J]. IEEE Transactions on Automatic Control, 2002, 47(7): 1122-1128. DOI: 10.1109/TAC.2002.800651.
[17] LIMEBEER D J N, ANDERSON B D O, HENDEL B. A Nash game approach to mixed $H_2/H_\infty$ control[J]. IEEE Transactions on Automatic Control, 1994, 39(1): 69-82. DOI: 10.1109/9.273340.
[18] CHEN B S, ZHANG W. Stochastic $H_2/H_\infty$ control with state-dependent noise[J]. IEEE Transactions on Automatic Control, 2004, 49(1): 45-57. DOI: 10.1109/TAC.2003.821400.
[19] YAN Z, ZHANG G, WANG J. Infinite horizon H-two/H-infinity control for descriptor systems: Nash game approach[J]. Journal of Control Theory and Applications, 2012, 10(2): 159-165. DOI: 10.1007/s11768-012-0038-6.
[20] MUKAIDANI H, XU H, YAMAMOTO T, et al. Static output feedback $H_2/H_\infty$ control of infinite horizon Markov jump linear stochastic systems with multiple decision makers[C]//2012 IEEE 51st Annual Conference on Decision and Control. Maui, HI: IEEE, 2012: 6003-6008.