2. 佛山科学技术学院 数学与大数据学院, 广东 佛山 528000;
3. 南开大学 数学学院, 天津 300071
2. School of Mathematics and Big Data, Foshan University, Foshan 528000, China;
3. School of Mathematical Sciences, Nankai University, Tianjin 300071, China
最优化理论与方法是一门应用性很强的学科,在国防经济、工农业生产、交通运输、工程设计和金融贸易等领域有着广泛的应用.非线性约束优化是最优化方法中重要的组成部分,它的一般形式可表述为[1-3].
| $\begin{array}{*{20}{l}} {\mathop {\min }\limits_x f\left( \mathit{\boldsymbol{x}} \right).}\\ {{\rm{s}}.{\rm{t}}.\left\{ {\begin{array}{*{20}{l}} {{g_i}\left( \mathit{\boldsymbol{x}} \right) \ge 0,i \in I = \left\{ {1,2, \cdot \cdot \cdot ,{m_e}} \right\};}\\ {{h_j}\left( \mathit{\boldsymbol{x}} \right) = 0,j \in J = \left\{ {{m_e} + 1,{m_e} + 2, \cdot \cdot \cdot ,m} \right\}.} \end{array}} \right.} \end{array}$ | (1) |
其中,
常见的求解非线性约束优化问题的方法有Lagrange乘子法[4]、罚函数法[5]和序列二次规划方法[6]等.其中序列二次规划方法(SQP方法)是一类重要的求解非线性约束优化问题的方法,它是在Wilson-Han-Powell(WHP)方法的基础上发展的[1].随后SQP方法引起了许多学者的极大兴趣,得到了进一步的发展并取得大量的研究成果,该类算法的研究也一直是非线性领域的一个热点[7-16].
在非线性约束优化中,WHP方法把求解上述的非线性约束优化问题转化为相应的二次规划子问题,用l1精确罚函数作为效益函数来确定步长,并通过对罚函数中参数的选取及二次规划目标函数矩阵的修正,建立了一个快速有效的算法[1, 5].算法中证明了当罚参数满足一定条件时,用二次规划子问题的解作为原问题变量在迭代过程中的搜索方向,则它是该l1罚函数的下降方向.本文受WHP方法的启发,在基于l1精确罚函数的SQP方法的基础上,把l1罚函数给予推广,构造了lp罚函数作为效益函数(这里的p满足p > 1),并给出关于SQP算法中lp罚函数的相关的推论和严格证明.
1 SQP算法的l1精确罚函数对于一般的非线性约束优化问题(1),假设
| $ \mathop {\min }\limits_d \frac{1}{2}2{\mathit{\boldsymbol{d}}^{\rm{T}}}{\mathit{\boldsymbol{B}}_k}\mathit{\boldsymbol{d}} + \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( {\bf{k}} \right)}}} \right)^{\rm{T}}}\mathit{\boldsymbol{d}}.$ |
| ${\rm{s}}.{\rm{t}}.\left\{ {\begin{array}{*{20}{l}} {\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}\mathit{\boldsymbol{d}} + {\mathit{\boldsymbol{g}}_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \ge 0,i \in I;}\\ {\nabla {h_j}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}\mathit{\boldsymbol{d}}{\rm{ + }}{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = 0,j \in J.} \end{array}} \right. $ | (2) |
其中
记d (k)为子问题(2) 的解,则SQP方法是用d(k)作为搜索方向,记λ(k+1)为问题(2) 的Lagrange乘子,故由KKT条件,得到:
| $ \begin{array}{*{20}{l}} {{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \nabla f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = \nabla {g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right){\mathit{\boldsymbol{\lambda }}_I}^{\left( {k + 1} \right)} + }\\ {\nabla {h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right){\mathit{\boldsymbol{\lambda }}_J}^{\left( {k + 1} \right)}.} \end{array} $ | (3) |
| $ {\mathit{\boldsymbol{\lambda }}_i}^{\left( {k + 1} \right)} \ge 0\;且\;{\mathit{\boldsymbol{\lambda }}_i}^{\left( {k + 1} \right)}\left[{\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right] = 0, i \in I. $ | (4) |
| $ \nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}{\rm{ + }}{\mathit{h}_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = 0, j \in J.$ | (5) |
| $ \nabla {g_i}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \ge 0, i \in I.$ | (6) |
记l1罚函数
| $ \begin{array}{*{20}{l}} {{\varphi _1}\left( {\mathit{\boldsymbol{x}},\mu } \right) = f\left( \mathit{\boldsymbol{x}} \right) + }\\ {\mu \left[ {{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right\|}_1} + {{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_1}} \right] = f\left( \mathit{\boldsymbol{x}} \right) + \;}\\ {\mu \left[ {\sum\limits_{i \in I} {\left| {\min \left\{ {{g_i}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right| + \sum\limits_{j \in J} {\left| {{h_j}\left( \mathit{\boldsymbol{x}} \right)} \right|} } } \right].} \end{array}$ | (7) |
同时记罚函数φ1(x (k), μ)沿d (k)方向的方向导数为D(φ1(x (k), μ); d (k)),则它满足
| $ \begin{array}{l} D\left( {{\varphi _1}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}\mathit{, }\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le- {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}- \\ \mu \left[{{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right), 0} \right\}} \right\|}_1} + {{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_1}} \right] -\\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) -\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right). \end{array}$ | (8) |
并且,若d (k) T B k d (k) > 0,μ满足μ > ‖ λ (k+1) ‖∞,则d (k)是罚函数在x (k)处的下降方向.
2 SQP算法的lp罚函数自从基于l1罚函数的SQP方法提出以来,随着算法自身理论的不断完善,它已成为求解中小规模的约束优化问题中最受欢迎的方法之一.本文在基于l1精确罚函数的基础上,把SQP算法中的l1罚函数给予推广,构造了SQP算法的lp罚函数作为效益函数(这里的p满足p > 1即可).在lp罚函数中得出,若采用相应的二次规划子问题的解作为搜索方向时,则罚函数沿该搜索方向的方向导数满足一定的条件;当lp罚函数的参数取值范围确定时,该搜索方向是lp罚函数在原问题处的下降方向.推导与证明如下:
设(d (k), λ (k+1))是子问题(2) 的解,记lp罚函数(p > 1) 为
| $ \begin{array}{*{20}{l}} {{\varphi _p}\left( {\mathit{\boldsymbol{x}},\mu } \right) = f\left( \mathit{\boldsymbol{x}} \right) + \mu \left[ {{{\left\| {\min \left\{ {{g_I}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right\|}_p}} \right. + }\\ {\left. {{{\left\| {{h_J}\left( \mathit{\boldsymbol{x}} \right)} \right\|}_p}} \right] = f\left( \mathit{\boldsymbol{x}} \right) + \mu \left[ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( \mathit{\boldsymbol{x}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} + {{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( \mathit{\boldsymbol{x}} \right)} \right|}^{\frac{1}{p}}}} } \right)}^{\frac{1}{p}}}} \right].} \end{array}$ | (9) |
则可以得到罚函数φp(x (k), μ)沿d (k)方向的方向导数满足
| $ \begin{align} & D\left( {{\varphi }_{p}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}}, \mu \right);{{\mathit{\boldsymbol{d}}}^{\left( k \right)}} \right)\le-{{\mathit{\boldsymbol{d}}}^{\left( k \right)\rm{T}}}{{B}_{k}}{{\mathit{\boldsymbol{d}}}^{\left( k \right)}}- \\ & \mu \left[{{\left\| \min \left\{ {{g}_{I}}\left( \mathit{\boldsymbol{x}} \right), 0 \right\} \right\|}_{p}}+{{\left\| {{h}_{J}}\left( \mathit{\boldsymbol{x}} \right) \right\|}_{p}} \right]-\\ & \lambda _{I}^{\left( k+1 \right)\rm{T}}{{g}_{I}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}} \right)-\lambda _{J}^{\left( k+1 \right)\rm{T}}{{h}_{J}}\left( {{\mathit{\boldsymbol{x}}}^{\left( k \right)}} \right). \\ \end{align}$ | (10) |
并且,若d (k)T Bkd(k) > 0,且罚参数μ若满足
证明 记罚函数φp(x(k), μ)沿d (k)方向的方向导数为D(φp(x (k), μ); d (k)),则得
| $\begin{array}{*{20}{l}} {D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right){\rm{ = }}}\\ {\mathop {\lim }\limits_{t \to 0 + } \frac{{{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},\mu } \right) - {\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right)}}{t} = }\\ {\mathop {\lim }\limits_{t \to 0 + } \frac{{f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},\mu } \right) - f\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right)}}{t} + }\\ {\mu \cdot \mathop {\lim }\limits_{t \to 0 + } \left\{ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - } \right.}\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p} - }\\ {\left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right\}/t. = \nabla f{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \mu \times }\\ {{{\mathop {\lim }\limits_{t \to 0 + } }^{ - 1}}\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - } \right.}\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p} - }\\ {\left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right).} \end{array}$ | (11) |
由KKT条件(3),可得到
| $ \begin{array}{l} \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} =-{\mathit{\boldsymbol{d}}^{\left( k \right)}}^{\rm{T}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}\nabla {g_I}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \lambda _J^{\left( {k + 1} \right){\rm{T}}}\nabla {h_J}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}. \end{array}$ | (12) |
又由KKT条件(4) 和(5),则式(12) 即可化为
| $ \begin{array}{l} \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} =-{\mathit{\boldsymbol{d}}^{\left( k \right)}}^{\rm{T}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}-\\ \lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right). \end{array}$ | (13) |
对于∀ t∈(0, δ), 0 < δ < 1,i∈I,由中值定理得
| $ \begin{array}{*{20}{l}} {min\left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\} = }\\ {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\nabla {g_i}{{({\mathit{\boldsymbol{x}}^{\left( k \right)}})}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right\} + o\left( t \right) = }\\ {\min \left\{ {\left( {1 - t} \right)} \right.{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\left( {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right) + }\\ {\left. {\nabla {g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right)} \right\} + o\left( t \right) \ge \left( {1 - } \right.}\\ {\left. t \right)\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\} + t\min \left\{ {{g_i}} \right.\left( {{{\bf{x}}^{\left( k \right)}}} \right) + }\\ {\left. {\nabla {g_i}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right\} + o\left( t \right) = }\\ {\left( {1 - t} \right)\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\} + o\left( t \right).} \end{array}$ | (14) |
| $\begin{array}{*{20}{l}} {由此可得\;\;{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}},0} \right)} \right\}} \right|}^p} \le \left( {1 - } \right.}\\ {{{\left. t \right)}^p}{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p},\left( {p > 1} \right).} \end{array} $ | (15) |
所以可得
| $ \begin{array}{*{20}{l}} {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} - }\\ {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} = }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} - }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} \le }\\ {{{\left( {\sum\limits_{i \in I} {{{\left( {1 - t} \right)}^p}{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} - }\\ {{{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}} = }\\ {\left( {1 - t - 1} \right) \cdot \sum\limits_{i \in I} {{{\left( {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} \right)}^{\frac{1}{p}}}} = }\\ {\begin{array}{*{20}{l}} { - t \cdot {{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p}.}\\ {} \end{array}} \end{array} $ | (16) |
同理,对于∀ t∈(0, δ), 0 < δ < 1,j∈J,由中值定理得
| $ \begin{array}{l} {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) = {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t\nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ o\left( t \right) = {h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-t{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + t{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \\ t\nabla {h_j}{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + o\left( t \right) = \left( {1-t} \right){h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \\ t\left( {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + \nabla {h_j}{{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)}^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) + o\left( t \right) = \left( {1-} \right.\\ \left. t \right){h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) + o\left( t \right). \end{array} $ | (17) |
所以可得
| $\begin{array}{l} {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p}-{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p} = \\ {\left( {\sum\limits_{j \in J} {{{\left( {1-t} \right)}^p}{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}}-{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} = \\ \left( {1 - t - 1} \right) \cdot {\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} = \\ - t \cdot {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p}. \end{array} $ | (18) |
由式(16) 和(18) 得
| $ \begin{array}{l} {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right), 0} \right\}} \right\|_p}-\\ {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|_p} + {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p}-\\ {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|_p} \le-t \cdot {\left( {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|} \right._p} + \\ \left. {{{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ | (19) |
从而,由式(11) 得,方向导数
| $ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) = \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \\ \mu \cdot \mathop {\lim }\limits_{t \to 0 + } {t^{ - 1}}\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p}} \right. - \\ {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|_p} + {\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|_p} - \\ \left. {{{\left\| {{h_J}{{\left( \mathit{\boldsymbol{x}} \right)}^{\left( k \right)}}} \right\|}_p}} \right) \le \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + \mu \times {\lim _{t \to 0 + }}{t^{ - 1}}\\ \left( { - t} \right)\left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}} + t{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) = \\ \nabla f{\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)^{\rm{T}}}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \mu \cdot {\left( {\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|} \right._p} + \\ \left. {{{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ | (20) |
再由式(13),可得
| $ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}},\mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \\ \mathit{\boldsymbol{\lambda }}_I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mathit{\boldsymbol{\lambda }}_J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right), \end{array} $ |
即证不等式(10) 成立.
进一步,由于对于i∈I,
| $ \begin{array}{l}-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) = \\-\sum\limits_{i \in I} {\lambda _i^{\left( {k + 1} \right){\rm{T}}}{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} - \sum\limits_{j \in J} {\lambda _j^{\left( {k + 1} \right){\rm{T}}}{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \le \\ \sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right){}}} \right| \cdot } \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right| + \\ \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right){}}} \right|\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} . \end{array} $ | (21) |
设p > 1,q > 1且满足
| $ \begin{array}{*{20}{l}} {\sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} \cdot \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right| \le }\\ {{{\left( {{{\sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} }^q}} \right)}^{\frac{1}{q}}} \cdot {{\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right),0} \right\}} \right|}^p}} } \right)}^{\frac{1}{p}}},} \end{array}$ | (22) |
| $ \begin{array}{l} \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right| \cdot \left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} \le \\ {\left( {{{\sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right|} }^q}} \right)^{\frac{1}{q}}}{\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}}, \end{array}$ | (23) |
因此,由(21) 式可得
| $\begin{array}{l}-\lambda _I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\lambda _J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) \le \\ \sum\limits_{i \in I} {\left| {\lambda _i^{\left( {k + 1} \right)}} \right|} \cdot \left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right| + \\ \sum\limits_{j \in J} {\left| {\lambda _j^{\left( {k + 1} \right)}} \right| \cdot \left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|} \le {\left( {\sum\limits_{i \in I} {{{\left| {\lambda _i^{\left( {k + 1} \right)}} \right|}^q}} } \right)^{\frac{1}{q}}} \times \\ {\left( {\sum\limits_{i \in I} {{{\left| {\min \left\{ {{g_i}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right|}^p}} } \right)^{\frac{1}{p}}} + {\left( {\sum\limits_{j \in J} {{{\left| {\lambda _j^{\left( {k + 1} \right)}} \right|}^q}} } \right)^{\frac{1}{q}}} \times \\ {\left( {\sum\limits_{j \in J} {{{\left| {{h_j}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right|}^p}} } \right)^{\frac{1}{p}}} \le {\left\| {{\lambda ^{\left( {k + 1} \right)}}} \right\|_q} \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right).\\ \end{array} $ | (24) |
从而,方向导数
| $ \begin{array}{l} D\left( {{\varphi _p}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}, \mu } \right);{\mathit{\boldsymbol{d}}^{\left( k \right)}}} \right) \le-{\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}}-\\ \mathit{\boldsymbol{\lambda }}_I^{\left( {k + 1} \right){\rm{T}}}{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)-\mathit{\boldsymbol{\lambda }}_J^{\left( {k + 1} \right){\rm{T}}}{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) \le \\ - {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} + {\left\| {{\mathit{\boldsymbol{\lambda }}^{\left( {k + 1} \right)}}} \right\|_q} \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) - \mu \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right) = \\ - {\mathit{\boldsymbol{d}}^{\left( k \right){\rm{T}}}}{\mathit{\boldsymbol{B}}_k}{\mathit{\boldsymbol{d}}^{\left( k \right)}} - \left( {\mu - {{\left\| {{\mathit{\boldsymbol{\lambda }}^{\left( {k + 1} \right)}}} \right\|}_q}} \right) \times \\ \left( {{{\left\| {\min \left\{ {{g_I}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right), 0} \right\}} \right\|}_p} + {{\left\| {{h_J}\left( {{\mathit{\boldsymbol{x}}^{\left( k \right)}}} \right)} \right\|}_p}} \right). \end{array} $ | ((25)) |
所以,若d(k) T B kd (k) > 0,且当μ ≥ ‖ λ (k+1)‖q时,有D(φp(x (k), μ); d (k)) < 0,即证d (k)是lp罚函数在x (k)处的下降方向.
3 结束语本文针对一般的含有不等式和等式约束的非线性优化问题,在Wilson-Han-Powell方法的基础上,结合SQP算法和罚函数的思想,把SQP方法中的l1罚函数给予推广,构造了lp罚函数作为效益函数.严谨推导了若取相应的二次规划子问题的解d(k)作为搜索方向时,则lp罚函数沿该搜索方向的方向导数满足一定的不等式条件;同时通过对lp罚函数中罚参数μ的选取范围的确定,严格证明了该搜索方向是对应的lp罚函数在原问题x(k)处的下降方向.
| [1] | 袁亚湘, 孙文瑜. 最优化理论与方法[M]. 北京: 科学出版社, 1997. |
| [2] | 孙文瑜, 徐成贤, 朱德通. 最优化方法[M]. 2版. 北京: 高等教育出版社, 2010. |
| [3] | 李董辉, 童小娇, 万中. 数值最优化算法与理论[M]. 2版. 北京: 科学出版社, 2010. |
| [4] | DIPILLO G, GRIPPO L. A new augmented Lagrangian function for inequality constraints in nonlinear programming problems[J]. Journal of Optimization Theory and Applications, 1982, 36(4): 495-519. DOI: 10.1007/BF00940544. |
| [5] | HANSP S P, MANGASARIANOL O L. Exact penalty functions in nonlinear programming[J]. Mathematical Programming, 1979, 17(1): 251-269. DOI: 10.1007/BF01588250. |
| [6] | POWELLMJ D, YUAN Y. A recursive quadratic programming algorithm that uses differentiable exact penalty functions[J]. Mathematical Programming, 1986, 35(3): 265-278. DOI: 10.1007/BF01580880. |
| [7] | HANSP S P. A globally convergent method for nonlinear programming[J]. Journal of Optimization Theory and Applications, 1977, 22(3): 297-309. DOI: 10.1007/BF00932858. |
| [8] | PANTOJAJFAD O, MAYNEDQ D Q. Exact penalty function algorithm with simple updating of the penalty parameter[J]. Journal of Optimization Theory and Applications, 1991, 69(3): 441-467. DOI: 10.1007/BF00940684. |
| [9] | FLETCHER R, LEYFFER S. Nonlinear programming without a penalty function[J]. Mathematical Programming, 2002, 91(2): 239-269. DOI: 10.1007/s101070100244. |
| [10] | GILLPE P E, MURRAY W, SAUNDERSMA M A. SNOPT: An SQP algorithm for large-scale constrained optimization[J]. SIAM Journal on Optimization, 2002, 12(4): 979-1006. DOI: 10.1137/S1052623499350013. |
| [11] | JIANJB J B. New sequential quadratically-constrained quadratic programming method of feasible directions and its convergence rate[J]. Journal of Optimization Theory and Applications, 2006, 129(1): 109-130. DOI: 10.1007/s10957-006-9042-7. |
| [12] | JIN Z, WANGYQ Y Q. A simple feasible SQP method for inequality constrained optimization with global and superlinear convergence[J]. Journal of Computational and Applied Mathematics, 2010, 233(11): 3060-3073. DOI: 10.1016/j.cam.2009.11.061. |
| [13] | TANGCM C M, JIANJB J B, LI G Y. A working set SQCQP algorithm with simple nonmonotone penalty parameters[J]. Journal of Computational and Applied Mathematics, 2011, 236(6): 1382-1398. DOI: 10.1016/j.cam.2011.09.002. |
| [14] | JIANJB J B, Li J, ZHENGHY H Y, LI J L. A superlinearly convergent norm-relaxed method of quasi-strongly sub-feasible direction for inequality constrained minimax problems[J]. Applied Mathematics and Computation, 2014, 226: 673-690. DOI: 10.1016/j.amc.2013.10.082. |
| [15] | JIANJB J B, CHENQF Q F, HUANGZW Z W. A superlinearly convergent SQP method without boundedness assumptions on any of the iterative sequences[J]. Journal of Computational and Applied Mathematics, 2014, 263: 115-128. DOI: 10.1016/j.cam.2013.12.001. |
| [16] | JIANJB J B, GUOCH C H, TANGCM C M, BAIYQ Y Q. A new superlinearly convergent algorithm of combining QP subproblem with system of linear equations for nonlinear optimization[J]. Journal of Computational and Applied Mathematics, 2015, 273: 88-102. DOI: 10.1016/j.cam.2014.06.009. |
2016, Vol. 33

