广东工业大学学报  2016, Vol. 33Issue (5): 1-4, 8.  DOI: 10.3969/j.issn.1007-7162.2016.05.001.
0

引用本文 

苗晴, 唐颂, 凌永权. 一种关于序列二次规划和lp罚函数的推论与证明[J]. 广东工业大学学报, 2016, 33(5): 1-4, 8. DOI: 10.3969/j.issn.1007-7162.2016.05.001.
Miao Qing, Tang Song, Ling Bingo Wing-Kuen. A Corollary and Proof about Sequential Quadratic Programming and lp Penalty Function[J]. Journal of Guangdong University of Technology, 2016, 33(5): 1-4, 8. DOI: 10.3969/j.issn.1007-7162.2016.05.001.

基金项目:

国家自然科学基金资助项目(61372173)

作者简介:

苗晴(1982-), 女,讲师,博士研究生,主要研究方向为优化方法与信号处理等。

通信作者

凌永权(1973-), 男,国家“青年千人计划”入选者,广东工业大学“百人计划”特聘教授,博士生导师,主要研究方向为最优化信号处理与时频分析等.E-mail:yongquanling@gdut.edu.cn

文章历史

收稿日期:2016-04-18
一种关于序列二次规划和lp罚函数的推论与证明
苗晴1,2, 唐颂3, 凌永权1    
1. 广东工业大学 信息工程学院,广东 广州 510006;
2. 佛山科学技术学院 数学与大数据学院, 广东 佛山 528000;
3. 南开大学 数学学院, 天津 300071
摘要: 针对一般的含有不等式和等式约束的非线性优化问题,给出了一个关于序列二次规划和lp罚函数的推论与证明.推导了当取相应的二次规划子问题的解作为搜索方向时,则lp罚函数沿该搜索方向的方向导数满足一定的不等式条件;同时通过确定罚参数的取值范围,证明了该搜索方向是lp罚函数在原问题处的下降方向.
关键词: 非线性约束优化    序列二次规划    罚函数    方向导数    
A Corollary and Proof about Sequential Quadratic Programming and lp Penalty Function
Miao Qing1,2, Tang Song3, Ling Bingo Wing-Kuen1    
1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China;
2. School of Mathematics and Big Data, Foshan University, Foshan 528000, China;
3. School of Mathematical Sciences, Nankai University, Tianjin 300071, China
Abstract: For a general nonlinear optimization problem with both inequality and equality constraints, a corollary on the optimization problem with a lp penalty function is studied and proved based on the sequential quadratic programming approach. In particular, the corollary defines the inequality condition for the search direction of the optimization problem. Moreover, it proves that the derived search direction is a descent direction of the penalty function in the original problem by determining the range of penalty parameter.
Key words: nonlinearly constrained optimization    sequential quadratic programming    penalty function    directional derivative    

最优化理论与方法是一门应用性很强的学科,在国防经济、工农业生产、交通运输、工程设计和金融贸易等领域有着广泛的应用.非线性约束优化是最优化方法中重要的组成部分,它的一般形式可表述为[1-3].

$\begin{array}{*{20}{l}} {\mathop {\min }\limits_x f\left( \mathit{\boldsymbol{x}} \right).}\\ {{\rm{s}}.{\rm{t}}.\left\{ {\begin{array}{*{20}{l}} {{g_i}\left( \mathit{\boldsymbol{x}} \right) \ge 0,i \in I = \left\{ {1,2, \cdot \cdot \cdot ,{m_e}} \right\};}\\ {{h_j}\left( \mathit{\boldsymbol{x}} \right) = 0,j \in J = \left\{ {{m_e} + 1,{m_e} + 2, \cdot \cdot \cdot ,m} \right\}.} \end{array}} \right.} \end{array}$ (1)

其中,$ f\left( \mathit{\boldsymbol{{x}}} \right), {g_i}\left( \mathit{\boldsymbol{{x}}} \right) \ge 0, i \in I = \left\{ {1, 2, \cdot \cdot \cdot, {m_e}} \right\} $$ {h_j}\left( \mathit{\boldsymbol{{x}}} \right) = 0, j \in J = \left\{ {{m_e} + 1, {m_e} + 2, \cdot \cdot \cdot, m} \right\}; $R nR至少一个为非线性函数,这里xRn, m > me.

常见的求解非线性约束优化问题的方法有Lagrange乘子法[4]、罚函数法[5]和序列二次规划方法[6]等.其中序列二次规划方法(SQP方法)是一类重要的求解非线性约束优化问题的方法,它是在Wilson-Han-Powell(WHP)方法的基础上发展的[1].随后SQP方法引起了许多学者的极大兴趣,得到了进一步的发展并取得大量的研究成果,该类算法的研究也一直是非线性领域的一个热点[7-16].

在非线性约束优化中,WHP方法把求解上述的非线性约束优化问题转化为相应的二次规划子问题,用l1精确罚函数作为效益函数来确定步长,并通过对罚函数中参数的选取及二次规划目标函数矩阵的修正,建立了一个快速有效的算法[1, 5].算法中证明了当罚参数满足一定条件时,用二次规划子问题的解作为原问题变量在迭代过程中的搜索方向,则它是该l1罚函数的下降方向.本文受WHP方法的启发,在基于l1精确罚函数的SQP方法的基础上,把l1罚函数给予推广,构造了lp罚函数作为效益函数(这里的p满足p > 1),并给出关于SQP算法中lp罚函数的相关的推论和严格证明.

1 SQP算法的l1精确罚函数

对于一般的非线性约束优化问题(1),假设$ f\left( \mathit{\boldsymbol{{x}}} \right), {g_i}\left( \mathit{\boldsymbol{{x}}} \right) \ge 0, i \in I = \left\{ {1, 2, \cdot \cdot \cdot, {m_e}} \right\} $$ {h_j}\left( \mathit{\boldsymbol{{x}}} \right) = 0, j \in J = \left\{ {{m_e} + 1, {m_e} + 2, \cdot \cdot \cdot, m} \right\} $(这里xR n, m > me)二次连续可微,把它转化为相应的二次规划子问题形式如下:

$ \mathop {\min }\limits_d \frac{1}{2}2{\mathit{\boldsymbol{d}}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}\mathit{\boldsymbol{d}} + \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( {\bf{k}} \right)}}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}.$
${\rm{s}}.{\rm{t}}.\left\{ {\begin{array}{*{20}{l}} {\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}\mathit{\boldsymbol{d}} + {\mathit{\boldsymbol{g}}_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \ge 0,i \in I;}\\ {\nabla {h_j}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}\mathit{\boldsymbol{d}}{\rm{ + }}{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = 0,j \in J.} \end{array}} \right. $ (2)

其中$ I = \left\{ {1, 2, \cdot \cdot \cdot, {m_e}} \right\}, {J = }\left\{ {{m_e} + 1, {m_e} + 2, \cdot \cdot \cdot, m} \right\} $dR nB kR n×n是问题(1) Lagrange函数的Hessian阵的近似.

d (k)为子问题(2) 的解,则SQP方法是用d(k)作为搜索方向,记λ(k+1)为问题(2) 的Lagrange乘子,故由KKT条件,得到:

$ \begin{array}{*{20}{l}} {{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \nabla f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = \nabla {g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right){\mathit{\boldsymbol{\lambda }}_I}^{\left( {k + 1} \right)} + }\\ {\nabla {h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right){\mathit{\boldsymbol{\lambda }}_J}^{\left( {k + 1} \right)}.} \end{array} $ (3)
$ {\mathit{\boldsymbol{\lambda }}_i}^{\left( {k + 1} \right)} \ge 0\;且\;{\mathit{\boldsymbol{\lambda }}_i}^{\left( {k + 1} \right)}\left[{\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right] = 0, i \in I. $ (4)
$ \nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}{\rm{ + }}{\mathit{h}_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = 0, j \in J.$ (5)
$ \nabla {g_i}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \ge 0, i \in I.$ (6)

l1罚函数

$ \begin{array}{*{20}{l}} {{\varphi _1}\left( {\mathit{\boldsymbol{x}},\mu } \right) = f\left( \mathit{\boldsymbol{x}} \right) + }\\ {\mu \left[ {{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right\|}_1} + {{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_1}} \right] = f\left( \mathit{\boldsymbol{x}} \right) + \;}\\ {\mu \left[ {\sum\limits_{i \in I} {\left| {\min \left\{ {{g_i}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right| + \sum\limits_{j \in J} {\left| {{h_j}\left( \mathit{\boldsymbol{x}} \right)} \right|} } } \right].} \end{array}$ (7)

同时记罚函数φ1(x (k), μ)沿d (k)方向的方向导数为D(φ1(x (k), μ); d (k)),则它满足

$ \begin{array}{l} D\left( {{\varphi _1}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}\mathit{, }\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le- {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}- \\ \mu \left[{{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right), 0} \right\}} \right\|}_1} + {{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_1}} \right] -\\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) -\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right). \end{array}$ (8)

并且,若d (k) T B k d (k) > 0,μ满足μ > ‖ λ (k+1),则d (k)是罚函数在x (k)处的下降方向.

2 SQP算法的lp罚函数

自从基于l1罚函数的SQP方法提出以来,随着算法自身理论的不断完善,它已成为求解中小规模的约束优化问题中最受欢迎的方法之一.本文在基于l1精确罚函数的基础上,把SQP算法中的l1罚函数给予推广,构造了SQP算法的lp罚函数作为效益函数(这里的p满足p > 1即可).在lp罚函数中得出,若采用相应的二次规划子问题的解作为搜索方向时,则罚函数沿该搜索方向的方向导数满足一定的条件;当lp罚函数的参数取值范围确定时,该搜索方向是lp罚函数在原问题处的下降方向.推导与证明如下:

设(d (k), λ (k+1))是子问题(2) 的解,记lp罚函数(p > 1) 为

$ \begin{array}{*{20}{l}} {{\varphi _p}\left( {\mathit{\boldsymbol{x}},\mu } \right) = f\left( \mathit{\boldsymbol{x}} \right) + \mu \left[ {{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right\|}_p}} \right. + }\\ {\left. {{{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_p}} \right] = f\left( \mathit{\boldsymbol{x}} \right) + \mu \left[ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} + {{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( \mathit{\boldsymbol{x}} \right)} \right|}^{\frac{1}{p}}}} } \right)}^{\frac{1}{p}}}} \right].} \end{array}$ (9)

则可以得到罚函数φp(x (k), μ)沿d (k)方向的方向导数满足

$ \begin{align} & D\left( {{\varphi }_{p}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}}, \mu \right);{{\mathit{\boldsymbol{d}}}^{\left( k \right)}} \right)\le-{{\mathit{\boldsymbol{d}}}^{\left( k \right)\rm{T}}}{{B}_{k}}{{\mathit{\boldsymbol{d}}}^{\left( k \right)}}- \\ & \mu \left[{{\left\| \min \left\{ {{g}_{I}}\left( \mathit{\boldsymbol{x}} \right), 0 \right\} \right\|}_{p}}+{{\left\| {{h}_{J}}\left( \mathit{\boldsymbol{x}} \right) \right\|}_{p}} \right]-\\ & \lambda _{I}^{\left( k+1 \right)\rm{T}}{{g}_{I}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}} \right)-\lambda _{J}^{\left( k+1 \right)\rm{T}}{{h}_{J}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}} \right). \\ \end{align}$ (10)

并且,若d (k)T Bkd(k) > 0,且罚参数μ若满足$ \mu \ge {\left\| {\mathit{\boldsymbol{\lambda }}{^{\left( {k + 1} \right)}}} \right\|_q} $(这里的q > 1且满足$ \frac{1}{p}+\frac{1}{q}=1 $$ {\left\| {{\mathit{\boldsymbol{\lambda }}^{\left( {k + 1} \right)}}} \right\|_q} = {\left( {\sum\limits_{i = 1}^m {{{\left| {{\mathit{\boldsymbol{\lambda }}_i}} \right|}^q}} } \right)^{\frac{1}{q}}}$),则d (k)lp罚函数在x (k)处的下降方向.

证明  记罚函数φp(x(k), μ)沿d (k)方向的方向导数为D(φp(x (k), μ); d (k)),则得

$\begin{array}{*{20}{l}} {D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right){\rm{ = }}}\\ {\mathop {\lim }\limits_{t \to 0 + } \frac{{{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},\mu } \right) - {\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right)}}{t} = }\\ {\mathop {\lim }\limits_{t \to 0 + } \frac{{f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},\mu } \right) - f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right)}}{t} + }\\ {\mu \cdot \mathop {\lim }\limits_{t \to 0 + } \left\{ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - } \right.}\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p} - }\\ {\left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right\}/t. = \nabla f{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \mu \times }\\ {{{\mathop {\lim }\limits_{t \to 0 + } }^{ - 1}}\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - } \right.}\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p} - }\\ {\left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right).} \end{array}$ (11)

由KKT条件(3),可得到

$ \begin{array}{l} \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} =-{\mathit{\boldsymbol{d}}^{\left( k \right)}}^{\rm{T}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}\nabla {g_I}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \lambda _J^{\left( {k + 1} \right){\rm{T}}}\nabla {h_J}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}. \end{array}$ (12)

又由KKT条件(4) 和(5),则式(12) 即可化为

$ \begin{array}{l} \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} =-{\mathit{\boldsymbol{d}}^{\left( k \right)}}^{\rm{T}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}-\\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right). \end{array}$ (13)

对于∀ t∈(0, δ), 0 < δ < 1,iI,由中值定理得

$ \begin{array}{*{20}{l}} {min\left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\} = }\\ {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\nabla {g_i}{{({\mathit{\boldsymbol{x}}^{\left( k \right)}})}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right\} + o\left( t \right) = }\\ {\min \left\{ {\left( {1 - t} \right)} \right.{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\left( {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right) + }\\ {\left. {\nabla {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right)} \right\} + o\left( t \right) \ge \left( {1 - } \right.}\\ {\left. t \right)\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\} + t\min \left\{ {{g_i}} \right.\left( {{{\bf{x}}^{\left( k \right)}}} \right) + }\\ {\left. {\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right\} + o\left( t \right) = }\\ {\left( {1 - t} \right)\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\} + o\left( t \right).} \end{array}$ (14)
$\begin{array}{*{20}{l}} {由此可得\;\;{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right)} \right\}} \right|}^p} \le \left( {1 - } \right.}\\ {{{\left. t \right)}^p}{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p},\left( {p > 1} \right).} \end{array} $ (15)

所以可得

$ \begin{array}{*{20}{l}} {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - }\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} = }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} - }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} \le }\\ {{{\left( {\sum\limits_{i \in I} {{{\left( {1 - t} \right)}^p}{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} - }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} = }\\ {\left( {1 - t - 1} \right) \cdot \sum\limits_{i \in I} {{{\left( {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} \right)}^{\frac{1}{p}}}} = }\\ {\begin{array}{*{20}{l}} { - t \cdot {{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p}.}\\ {} \end{array}} \end{array} $ (16)

同理,对于∀ t∈(0, δ), 0 < δ < 1,jJ,由中值定理得

$ \begin{array}{l} {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) = {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ o\left( t \right) = {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-t{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \\ t\nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + o\left( t \right) = \left( {1-t} \right){h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \\ t\left( {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \nabla {h_j}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) + o\left( t \right) = \left( {1-} \right.\\ \left. t \right){h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + o\left( t \right). \end{array} $ (17)

所以可得

$\begin{array}{l} {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p}-{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p} = \\ {\left( {\sum\limits_{j \in J} {{{\left( {1-t} \right)}^p}{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}}-{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} = \\ \left( {1 - t - 1} \right) \cdot {\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} = \\ - t \cdot {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p}. \end{array} $ (18)

由式(16) 和(18) 得

$ \begin{array}{l} {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right), 0} \right\}} \right\|_p}-\\ {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|_p} + {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p}-\\ {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p} \le-t \cdot {\left( {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|} \right._p} + \\ \left. {{{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ (19)

从而,由式(11) 得,方向导数

$ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) = \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ \mu \cdot \mathop {\lim }\limits_{t \to 0 + } {t^{ - 1}}\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p}} \right. - \\ {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|_p} + {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p} - \\ \left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right) \le \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \mu \times {\lim _{t \to 0 + }}{t^{ - 1}}\\ \left( { - t} \right)\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) = \\ \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \mu \cdot {\left( {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|} \right._p} + \\ \left. {{{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ (20)

再由式(13),可得

$ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \\ \mathit{\boldsymbol{\lambda }}_I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mathit{\boldsymbol{\lambda }}_J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right), \end{array} $

即证不等式(10) 成立.

进一步,由于对于iI$ \mathit{\boldsymbol{\lambda }}_i^{\left( {k + 1} \right)} \ge 0 $,所以,

$ \begin{array}{l}-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = \\-\sum\limits_{i \in I} {\lambda _i^{\left( {k + 1} \right){\rm{T}}}{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} - \sum\limits_{j \in J} {\lambda _j^{\left( {k + 1} \right){\rm{T}}}{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \le \\ \sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right){}}} \right| \cdot } \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right| + \\ \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right){}}} \right|\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} . \end{array} $ (21)

p > 1,q > 1且满足$ \frac{1}{p}+\frac{1}{q}=1 $,则由Höder不等式知

$ \begin{array}{*{20}{l}} {\sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} \cdot \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right| \le }\\ {{{\left( {{{\sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} }^q}} \right)}^{\frac{1}{q}}} \cdot {{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}},} \end{array}$ (22)
$ \begin{array}{l} \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right| \cdot \left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} \le \\ {\left( {{{\sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right|} }^q}} \right)^{\frac{1}{q}}}{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}}, \end{array}$ (23)

因此,由(21) 式可得

$\begin{array}{l}-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \le \\ \sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} \cdot \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right| + \\ \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right| \cdot \left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} \le {\left( {\sum\limits_{i \in I} {{{\left| {\lambda _i^{\left( {k + 1} \right)}} \right|}^q}} } \right)^{\frac{1}{q}}} \times \\ {\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right|}^p}} } \right)^{\frac{1}{p}}} + {\left( {\sum\limits_{j \in J} {{{\left| {\lambda _j^{\left( {k + 1} \right)}} \right|}^q}} } \right)^{\frac{1}{q}}} \times \\ {\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} \le {\left\| {{\lambda ^{\left( {k + 1} \right)}}} \right\|_q} \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right).\\ \end{array} $ (24)

从而,方向导数

$ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}, \mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le-{\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}-\\ \mathit{\boldsymbol{\lambda }}_I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\mathit{\boldsymbol{\lambda }}_J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) \le \\ - {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {\left\| {{\mathit{\boldsymbol{\lambda }}^{\left( {k + 1} \right)}}} \right\|_q} \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) = \\ - {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \left( {\mu - {{\left\| {{\mathit{\boldsymbol{\lambda }}^{\left( {k + 1} \right)}}} \right\|}_q}} \right) \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ ((25))

所以,若d(k) T B kd (k) > 0,且当μ ≥ ‖ λ (k+1)q时,有D(φp(x (k), μ); d (k)) < 0,即证d (k)lp罚函数在x (k)处的下降方向.

3 结束语

本文针对一般的含有不等式和等式约束的非线性优化问题,在Wilson-Han-Powell方法的基础上,结合SQP算法和罚函数的思想,把SQP方法中的l1罚函数给予推广,构造了lp罚函数作为效益函数.严谨推导了若取相应的二次规划子问题的解d(k)作为搜索方向时,则lp罚函数沿该搜索方向的方向导数满足一定的不等式条件;同时通过对lp罚函数中罚参数μ的选取范围的确定,严格证明了该搜索方向是对应的lp罚函数在原问题x(k)处的下降方向.

参考文献
[1] 袁亚湘, 孙文瑜. 最优化理论与方法[M]. 北京: 科学出版社, 1997.
[2] 孙文瑜, 徐成贤, 朱德通. 最优化方法[M]. 2版. 北京: 高等教育出版社, 2010.
[3] 李董辉, 童小娇, 万中. 数值最优化算法与理论[M]. 2版. 北京: 科学出版社, 2010.
[4] DIPILLO G, GRIPPO L. A new augmented Lagrangian function for inequality constraints in nonlinear programming problems[J]. Journal of Optimization Theory and Applications, 1982, 36(4): 495-519. DOI: 10.1007/BF00940544.
[5] HANSP S P, MANGASARIANOL O L. Exact penalty functions in nonlinear programming[J]. Mathematical Programming, 1979, 17(1): 251-269. DOI: 10.1007/BF01588250.
[6] POWELLMJ D, YUAN Y. A recursive quadratic programming algorithm that uses differentiable exact penalty functions[J]. Mathematical Programming, 1986, 35(3): 265-278. DOI: 10.1007/BF01580880.
[7] HANSP S P. A globally convergent method for nonlinear programming[J]. Journal of Optimization Theory and Applications, 1977, 22(3): 297-309. DOI: 10.1007/BF00932858.
[8] PANTOJAJFAD O, MAYNEDQ D Q. Exact penalty function algorithm with simple updating of the penalty parameter[J]. Journal of Optimization Theory and Applications, 1991, 69(3): 441-467. DOI: 10.1007/BF00940684.
[9] FLETCHER R, LEYFFER S. Nonlinear programming without a penalty function[J]. Mathematical Programming, 2002, 91(2): 239-269. DOI: 10.1007/s101070100244.
[10] GILLPE P E, MURRAY W, SAUNDERSMA M A. SNOPT: An SQP algorithm for large-scale constrained optimization[J]. SIAM Journal on Optimization, 2002, 12(4): 979-1006. DOI: 10.1137/S1052623499350013.
[11] JIANJB J B. New sequential quadratically-constrained quadratic programming method of feasible directions and its convergence rate[J]. Journal of Optimization Theory and Applications, 2006, 129(1): 109-130. DOI: 10.1007/s10957-006-9042-7.
[12] JIN Z, WANGYQ Y Q. A simple feasible SQP method for inequality constrained optimization with global and superlinear convergence[J]. Journal of Computational and Applied Mathematics, 2010, 233(11): 3060-3073. DOI: 10.1016/j.cam.2009.11.061.
[13] TANGCM C M, JIANJB J B, LI G Y. A working set SQCQP algorithm with simple nonmonotone penalty parameters[J]. Journal of Computational and Applied Mathematics, 2011, 236(6): 1382-1398. DOI: 10.1016/j.cam.2011.09.002.
[14] JIANJB J B, Li J, ZHENGHY H Y, LI J L. A superlinearly convergent norm-relaxed method of quasi-strongly sub-feasible direction for inequality constrained minimax problems[J]. Applied Mathematics and Computation, 2014, 226: 673-690. DOI: 10.1016/j.amc.2013.10.082.
[15] JIANJB J B, CHENQF Q F, HUANGZW Z W. A superlinearly convergent SQP method without boundedness assumptions on any of the iterative sequences[J]. Journal of Computational and Applied Mathematics, 2014, 263: 115-128. DOI: 10.1016/j.cam.2013.12.001.
[16] JIANJB J B, GUOCH C H, TANGCM C M, BAIYQ Y Q. A new superlinearly convergent algorithm of combining QP subproblem with system of linear equations for nonlinear optimization[J]. Journal of Computational and Applied Mathematics, 2015, 273: 88-102. DOI: 10.1016/j.cam.2014.06.009.