
 CAAI Transactions on Intelligent Systems (智能系统学报), 2021, Vol. 16, Issue (3): 416-424. DOI: 10.11992/tis.201911028

### Cite this article

XU Ziwei, CHEN Xiuhong. Sparse optimal mean principal component analysis based on self-paced learning[J]. CAAI Transactions on Intelligent Systems, 2021, 16(3): 416-424. DOI: 10.11992/tis.201911028.

Sparse optimal mean principal component analysis based on self-paced learning
XU Ziwei , CHEN Xiuhong
School of Digital Media, Jiangnan University, Wuxi 214122, China
Abstract: Principal component analysis (PCA) is a classical unsupervised dimensionality reduction approach. However, existing PCA-based methods treat all samples equally and cannot jointly extract the important features of the data, which limits their performance. To address these problems, we propose a sparse optimal-mean PCA algorithm based on self-paced learning. In our model, the loss function is defined by the ${L_{{\rm{2,1}}}}$ norm, the projection matrix is regularized by the ${L_{{\rm{2,1}}}}$ norm, and the mean is treated as a variable to be optimized during the iterations. In this way, important features can be selected consistently, and the robustness of the method to outliers is improved. Considering the differences among training samples, we use a self-paced learning mechanism that admits training samples from "simple" to "complex", which effectively reduces the influence of outliers. Theoretical analysis and empirical study show that the proposed method can effectively reduce the influence of noise and outliers on classification and thus improve classification performance.
Key words: image processing    principal component analysis    unsupervised learning    data dimensionality reduction    sparsity    optimal mean    self-paced learning    face recognition

1 Related work

1.1 PCA model

 $\mathop {{\rm{min}}}\limits_{{W}} ||{{X}} - {{W}}{{{W}}^{\rm{T}}}{{X}}||_F^2,{\rm{s}}.{\rm{t}}.,{{{W}}^{\rm{T}}}{{W}} = {{I}}$ (1)
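Problem (1) has a classical closed-form solution: the columns of $W$ are the top-$k$ left singular vectors of $X$. A minimal NumPy sketch (matrix sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))  # m x n, columns are samples
k = 5

# Minimizer of Eq. (1): top-k left singular vectors of X.
U, _, _ = np.linalg.svd(X, full_matrices=False)
W = U[:, :k]
assert np.allclose(W.T @ W, np.eye(k))  # orthonormality constraint W^T W = I

err_opt = np.linalg.norm(X - W @ W.T @ X, "fro") ** 2

# Any other orthonormal basis gives a reconstruction error at least as large.
R, _ = np.linalg.qr(rng.normal(size=(20, k)))
err_other = np.linalg.norm(X - R @ R.T @ X, "fro") ** 2
```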

 $\mathop {\min }\limits_{{{b}},{{W}}} ||{{X}} - {{b}}{{\bf{1}}^{\rm{T}}} - {{W}}{{{W}}^{\rm{T}}}\left( {{{X}} - {{b}}{{\bf{1}}^{\rm{T}}}} \right)||_F^2,\quad {\rm{s}}{\rm{.t}}{\rm{.}}\;{{{W}}^{\rm{T}}}{{W}} = {{I}}$

 $\mathop {\min }\limits_{{{Q}},{{P}}} ||{{X}} - {{P}}{{{Q}}^{\rm{T}}}{{X}}|{|_{2,1}} + \lambda ||{{Q}}|{|_{2,1}}$

1.2 Self-paced learning (SPL) mechanism

 $\begin{split} & \min \displaystyle\sum\limits_{i = 1}^n {{v_i}} L({{{y}}_i},g({{{x}}_i},{{w}})) + \alpha \phi \left( {{w}} \right) + \displaystyle\sum\limits_{i = 1}^n {f({{{v}}_i},k)} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}{v_i} \in [0,1],i = 1,2, \cdots ,n \end{split}$
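With the common hard regularizer $f(v;\lambda ) = - \lambda v$ (one instance of $f({{{v}}_i},k)$ above), the weight subproblem has a well-known closed form: $v_i = 1$ iff $L_i < \lambda$, so low-loss ("easy") samples are selected first and the threshold is relaxed between rounds. A sketch, with an illustrative function name:

```python
import numpy as np

def spl_hard_weights(losses, lam):
    """Hard self-paced weights for f(v; lam) = -lam * v:
    v_i = 1 if loss_i < lam ("easy" sample, selected), else 0."""
    return (np.asarray(losses) < lam).astype(float)

losses = np.array([0.1, 0.5, 2.0, 0.3])
v = spl_hard_weights(losses, lam=0.4)  # -> [1., 0., 0., 1.]
```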

2 Self-paced sparse optimal-mean PCA model (SPL-OMSPCA)

2.1 Model formulation

The basic idea of SPL-OMSPCA is to use the ${L}_{{2},{1}}$ norm for both the loss function and the sparsity constraint, and to select the samples that participate in training via a self-paced learning scheme. Here the loss is the error incurred when a training sample is reconstructed from the low-dimensional transformed space, and it contains the mean vector to be optimized. This yields the following optimization model:

 $\begin{aligned} & \mathop {\min }\limits_{{{v}},{{b}},{{P}},{{Q}}} \displaystyle\sum\limits_{i = 1}^n {{v_i}} ||{{{x}}_i} - {{b}} - {{P}}{{{Q}}^{\rm{T}}}\left( {{{{x}}_i} - {{b}}} \right)|{|_2} + \alpha ||{{Q}}|{|_{2,1}} + \displaystyle\sum\limits_{i = 1}^n {f({v_i},k)} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}\;{{{P}}^{\rm{T}}}{{P}} = {{I}},\;{v_i} \in [0,1],\;i = 1,2, \cdots ,n \end{aligned}$ (2)

 $\begin{aligned} & \mathop {\min }\limits_{{{v}},{{b}},{{P}},{{Q}}} \displaystyle\sum\limits_{i = 1}^n {{v_i}} ||{{{x}}_i} - {{b}} - {{P}}{{{Q}}^{\rm{T}}}\left( {{{{x}}_i} - {{b}}} \right)|{|_2} + \alpha ||{{Q}}|{|_{2,1}} + \displaystyle\sum\limits_{i = 1}^n {\frac{{{\beta ^2}}}{{{v_i} + \beta k}}} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}\;{{{P}}^{\rm{T}}}{{P}} = {{I}},\;{v_i} \in [0,1],\;i = 1,2, \cdots ,n \end{aligned}$ (3)

 $\begin{aligned} & \mathop {\min }\limits_{{{v}},{{b}},{{P}},{{Q}}} ||({{X}} - {{b1}^{\rm{T}}} - {{P}}{{{Q}}^{\rm{T}}}({{X}} - {{b1}^{\rm{T}}}))\sqrt {{V}} \sqrt {{D}} ||_F^2 + \alpha ||\sqrt {{H}} {{Q}}||_F^2 + \displaystyle\sum\limits_{i = 1}^n {\frac{{{\beta ^2}}}{{{v_i} + \beta k}}} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}\;{{{P}}^{\rm{T}}}{{P}} = {{I}},\;{v_i} \in [0,1],\;i = 1,2, \cdots ,n \end{aligned}$ (4)

 ${d_{ii}} = \left\{ {\begin{split} &{{1 / {||{{({{A}} - {{P}}{{{Q}}^{\rm{T}}}{{A}})}^i}|{|_2}}}},\quad{||{{({{A}} - {{P}}{{{Q}}^{\rm{T}}}{{A}})}^i}|{|_2} \ne 0} \\ &{1 / \delta },\quad{\text{otherwise}}\end{split}} \right.$ (5)

 ${h_{ii}} = \left\{ {\begin{split} &{{1 / {||{{{Q}}^j}|{|_2}}}},\quad{||{{{Q}}^j}|{|_2} \ne 0}\\ &{1 / \delta },\quad{\text{otherwise}} \end{split}} \right.$ (6)
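Both diagonal matrices follow the same reweighting pattern (inverse ${L_2}$ norms with a $1/\delta$ guard), which can be sketched as one helper; the function name and the value of $\delta$ are illustrative:

```python
import numpy as np

def reweight_diag(M, axis, delta=1e-8):
    """Diagonal entries for D (Eq. (5): norms of residual columns, axis=0)
    or H (Eq. (6): norms of the rows of Q, axis=1): 1/||.||_2, with 1/delta
    guarding against zero norms."""
    norms = np.linalg.norm(M, axis=axis)
    return np.where(norms != 0, 1.0 / np.maximum(norms, 1e-300), 1.0 / delta)

Q = np.array([[3.0, 4.0],
              [0.0, 0.0]])
h = reweight_diag(Q, axis=1)  # row norms 5 and 0 -> [0.2, 1/delta]
```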

2.2 Model optimization

2.2.1 Solving for the self-paced weights ${{v}}$ with ${{P}}$, ${{Q}}$, and ${{b}}$ fixed

With ${{P}}$, ${{Q}}$, and ${{b}}$ fixed, Eq. (4) reduces to

 $\begin{split} & \mathop {\min }\limits_{{v}} \displaystyle\sum\limits_{i = 1}^n {{v_i}} ||{{{x}}_i} - {{b}} - {{P}}{{{Q}}^{\rm{T}}}\left( {{{{x}}_i} - {{b}}} \right)|{|_2} + \displaystyle\sum\limits_{i = 1}^n {\frac{{{\beta ^2}}}{{{v_i} + \beta k}}} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}\;{v_i} \in [0,1],\;i = 1,2, \cdots ,n \end{split}$

Let ${L_i} = ||{{{x}}_i} - {{b}} - {{P}}{{{Q}}^{\rm{T}}}\left( {{{{x}}_i} - {{b}}} \right)|{|_2}$; then the above problem can be written as

 \begin{aligned} & \mathop {\min }\limits_{{v}} \displaystyle\sum\limits_{i = 1}^n {{v_i}} {L_i} + \displaystyle\sum\limits_{i = 1}^n {\frac{{{\beta ^2}}}{{{v_i} + \beta k}}} \\ & {\rm{s}}{\rm{.t}}{\rm{.}}{{{v}}_i} \in [0,1],i = 1,2,\cdots,n \end{aligned} (7)

 ${{{v}}_i} = \left\{ {\begin{split} &{\rm{1}},\quad{{L_i} \leqslant 1/{{\left( {k + 1/\beta } \right)}^2}}\\ &{\rm{0}},\quad{{L_i} \geqslant 1/{k^2}}\\ &{\beta \left( {\frac{{\rm{1}}}{{\sqrt {{L_i}} }} - k} \right)},\quad{\text{otherwise}} \end{split}}\right.$ (8)
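Eq. (8) can be evaluated in vectorized form; a sketch (the function name is illustrative):

```python
import numpy as np

def spl_weights(L, k, beta):
    """Closed-form minimizer of Eq. (7), i.e. Eq. (8): v_i = 1 for easy
    samples, 0 for hard ones, beta*(1/sqrt(L_i) - k) in between."""
    L = np.asarray(L, dtype=float)
    v = beta * (1.0 / np.sqrt(np.maximum(L, 1e-300)) - k)
    v = np.where(L <= 1.0 / (k + 1.0 / beta) ** 2, 1.0, v)
    v = np.where(L >= 1.0 / k ** 2, 0.0, v)
    return np.clip(v, 0.0, 1.0)

# With k = 1, beta = 1: v = 1 for L <= 0.25 and v = 0 for L >= 1.
v = spl_weights([0.1, 0.5, 2.0], k=1.0, beta=1.0)
```

At the two thresholds the branches agree: $L_i = 1/(k+1/\beta)^2$ gives $\beta(1/\sqrt{L_i} - k) = 1$ and $L_i = 1/k^2$ gives $0$, so the weight decreases continuously with the loss.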
2.2.2 Solving for the optimal mean ${{b}}$ with ${{P}}$, ${{Q}}$, and ${{v}}$ fixed

 $\mathop {\min }\limits_{{b}} \left\| {({{X}} - {{b1}^{\rm{T}}} - {{P}}{{{Q}}^{\rm{T}}}({{X}} - {{b1}^{\rm{T}}}))\sqrt {{V}} \sqrt {{D}} } \right\|_F^2$ (9)

 $({{I}} - {{P}}{{{Q}}^{\rm{T}}} - {{Q}}{{{P}}^{\rm{T}}} + {{Q}}{{{Q}}^{\rm{T}}})({{b1}^{\rm{T}}} - {{X}}){{VD1}} = 0$ (10)

Let ${{c}} = ({{b1}^{\rm{T}}} - {{X}}){{VD1}}$; then Eq. (10) becomes

 $({{I}} - {{P}}{{{Q}}^{\rm{T}}} - {{Q}}{{{P}}^{\rm{T}}} + {{Q}}{{{Q}}^{\rm{T}}}){{c}} = 0$ (11)

If ${{I}} - {{P}}{{{Q}}^{\rm{T}}} - {{Q}}{{{P}}^{\rm{T}}} + {{Q}}{{{Q}}^{\rm{T}}}$ is non-singular, the solution of Eq. (11) is ${{c}} = {{0}}$, and hence

 ${{b}} = \frac{{{{XVD1}}}}{{{{\bf{1}}^{\rm{T}}}{{VD1}}}}$ (12)

If ${{I}} - {{P}}{{{Q}}^{\rm{T}}} - {{Q}}{{{P}}^{\rm{T}}} + {{Q}}{{{Q}}^{\rm{T}}}$ is singular, let its singular value decomposition be ${{{E}}_{\rm{1}}}{{{W}}_{\rm{1}}}{{{U}}_{\rm{1}}}^{\rm{T}}$, where ${{{W}}_1} = {\rm{diag}}({\sigma _1},{\sigma _2}, \cdots ,{\sigma _s},0, \cdots ,0)$ and ${\sigma _1} \geqslant {\sigma _2} \geqslant \cdots \geqslant {\sigma _s} > 0$; then the solution of Eq. (11) is

 ${{c}} = {{{U}}_{\rm{1}}}{{z}}$

 ${{b}} = \frac{{{{({{{U}}_{\rm{1}}})}_{{{s}} + {\rm{1}}}} + {{XVD1}}}}{{{{\bf{1}}^{\rm{T}}}{{VD1}}}}$ (13)
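In the (usual) non-singular case, the update of Eq. (12) is simply a weighted sample mean with weights ${v_i}{d_{ii}}$; a sketch:

```python
import numpy as np

def optimal_mean(X, v, d):
    """Eq. (12): b = X V D 1 / (1^T V D 1), with V = diag(v), D = diag(d)
    (non-singular case). Columns of X are samples."""
    w = np.asarray(v) * np.asarray(d)   # diagonal of V D
    return (X * w).sum(axis=1) / w.sum()

X = np.array([[1.0, 3.0],
              [2.0, 4.0]])
b = optimal_mean(X, v=[1.0, 1.0], d=[1.0, 3.0])  # -> [2.5, 3.5]
```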
2.2.3 Solving for the projection matrix ${{Q}}$ with ${{P}}$, ${{v}}$, and ${{b}}$ fixed

 $\mathop {\min }\limits_{{Q}} \left\| {({{X}} - {{b1}^{\rm{T}}} - {{P}}{{{Q}}^{\rm{T}}}({{X}} - {{b1}^{\rm{T}}}))\sqrt {{V}} \sqrt {{D}} } \right\|_F^2 + \alpha ||\sqrt {{H}} {{Q}}||_F^2$ (14)

Let $\left( {{{X}} - {{b1}^{\rm{T}}}} \right)\sqrt {{V}} \sqrt {{D}} = {{G}}$; then Eq. (14) becomes

 $\mathop {\min }\limits_{{Q}} ||{{G}} - {{P}}{{{Q}}^{\rm{T}}}{{G}}||_F^2 + \alpha ||\sqrt {{H}} {{Q}}||_F^2$ (15)

 $\left( {{{G}}{{{G}}^{\rm{T}}} + \alpha {{H}}} \right){{Q}} = {{G}}{{{G}}^{\rm{T}}}{{P}}$

 ${{Q}} = {\left( {{{G}}{{{G}}^{\rm{T}}} + \alpha {{H}}} \right)^{ - 1}}{{G}}{{{G}}^{\rm{T}}}{{P}}$ (16)
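Eq. (16) is a linear system in ${{Q}}$; solving it directly is preferable to forming the explicit inverse. A sketch (the helper name is illustrative):

```python
import numpy as np

def update_Q(G, P, h, alpha):
    """Eq. (16): Q = (G G^T + alpha H)^{-1} G G^T P with H = diag(h),
    computed via a linear solve instead of an explicit inverse."""
    GGt = G @ G.T
    return np.linalg.solve(GGt + alpha * np.diag(h), GGt @ P)

rng = np.random.default_rng(1)
G = rng.normal(size=(5, 30))
P, _ = np.linalg.qr(rng.normal(size=(5, 2)))
Q = update_Q(G, P, h=np.ones(5), alpha=0.0)  # alpha = 0 gives Q = P
```

With $\alpha = 0$ and ${{G}}{{{G}}^{\rm{T}}}$ invertible, the update reduces to ${{Q}} = {{P}}$, a quick sanity check on the formula.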
2.2.4 Solving for the recovery matrix ${{P}}$ with ${{Q}}$, ${{v}}$, and ${{b}}$ fixed

 $\begin{split} & \mathop {\min }\limits_{{P}} \left\| {\left( {\left( {{{X}} - {{b1}^{\rm{T}}}} \right) - {{P}}{{{Q}}^{\rm{T}}}\left( {{{X}} - {{b1}^{\rm{T}}}} \right)} \right)\sqrt {{V}} \sqrt {{D}} } \right\|_F^2 \\ & {\rm{s.t.}}\;{{{P}}^{\rm{T}}}{{P}} = {{I}} \end{split}$

 $\begin{split} & \mathop {\min }\limits_{{P}} \left\| {\sqrt {{D}} \sqrt {{V}} {{({{X}} - {{b1}^{\rm{T}}})}^{\rm{T}}} - \sqrt {{D}} \sqrt {{V}} {{({{X}} - {{b1}^{\rm{T}}})}^{\rm{T}}}{{Q}}{{{P}}^{\rm{T}}}} \right\|_F^2 \\ & {\rm{s.t.}}\;{{{P}}^{\rm{T}}}{{P}} = {{I}} \end{split}$ (17)

 ${{P}} = {{{E}}_2}{{{U}}_2^{\rm{T}}}$ (18)
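The matrices ${{{E}}_2}$ and ${{{U}}_2}$ are not defined in this excerpt; by the standard orthogonal-Procrustes argument, since ${{{P}}^{\rm{T}}}{{P}} = {{I}}$ makes the quadratic term of Eq. (17) constant, the problem reduces to maximizing ${\rm{tr}}({{P}}{{{Q}}^{\rm{T}}}{{G}}{{{G}}^{\rm{T}}})$, whose maximizer is ${{P}} = {{{E}}_2}{{U}}_2^{\rm{T}}$ with ${{{E}}_2}{{{W}}_2}{{U}}_2^{\rm{T}}$ the thin SVD of ${{G}}{{{G}}^{\rm{T}}}{{Q}}$ (an assumption consistent with the surrounding derivation). A sketch:

```python
import numpy as np

def update_P(G, Q):
    """Eq. (18) under the assumed definition: thin SVD G G^T Q = E2 W2 U2^T,
    then P = E2 U2^T, which has orthonormal columns by construction."""
    E2, _, U2t = np.linalg.svd(G @ G.T @ Q, full_matrices=False)
    return E2 @ U2t

rng = np.random.default_rng(2)
G = rng.normal(size=(6, 40))
Q = rng.normal(size=(6, 3))
P = update_P(G, Q)
assert np.allclose(P.T @ P, np.eye(3))  # constraint P^T P = I holds
```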

1) Compute ${{v}}$ by Eq. (8);

2) Compute ${{b}}$ by Eqs. (12) and (13);

3) Compute ${{Q}}$ by Eq. (16);

4) Compute ${{P}}$ by Eq. (18);

5) Compute ${{D}}$ by Eq. (5);

6) Compute ${{H}}$ by Eq. (6);

7) Update $k$: $k = {k / \mu }$.
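Under the assumptions already noted (the non-singular mean update of Eq. (12), and ${{{E}}_2}$, ${{{U}}_2}$ taken from the thin SVD of ${{G}}{{{G}}^{\rm{T}}}{{Q}}$), steps 1)–7) can be sketched as one alternating loop; every name and default value here is illustrative:

```python
import numpy as np

def spl_omspca(X, dim, alpha=0.1, k=0.1, beta=1.0, mu=1.1,
               n_iter=10, delta=1e-8):
    """Sketch of the alternating scheme above: steps 1)-7) per iteration.
    X: m x n with samples as columns; dim: subspace dimension."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    P, _ = np.linalg.qr(rng.normal(size=(m, dim)))
    Q = P.copy()
    d = np.ones(n)  # diagonal of D, Eq. (5)
    h = np.ones(m)  # diagonal of H, Eq. (6)
    b = X.mean(axis=1)
    for _ in range(n_iter):
        A = X - b[:, None]
        # 1) self-paced weights v, Eq. (8)
        L = np.linalg.norm(A - P @ (Q.T @ A), axis=0)
        v = beta * (1 / np.sqrt(np.maximum(L, delta)) - k)
        v = np.where(L <= 1 / (k + 1 / beta) ** 2, 1.0, v)
        v = np.clip(np.where(L >= 1 / k ** 2, 0.0, v), 0.0, 1.0)
        # 2) optimal mean b, Eq. (12) (non-singular case)
        w = v * d
        b = (X * w).sum(axis=1) / w.sum()
        # 3) Q by Eq. (16) and 4) P by Eq. (18)
        G = (X - b[:, None]) * np.sqrt(w)
        GGt = G @ G.T
        Q = np.linalg.solve(GGt + alpha * np.diag(h), GGt @ P)
        E2, _, U2t = np.linalg.svd(GGt @ Q, full_matrices=False)
        P = E2 @ U2t
        # 5) D by Eq. (5) and 6) H by Eq. (6)
        A = X - b[:, None]
        res = np.linalg.norm(A - P @ (Q.T @ A), axis=0)
        d = np.where(res != 0, 1 / np.maximum(res, delta), 1 / delta)
        qn = np.linalg.norm(Q, axis=1)
        h = np.where(qn != 0, 1 / np.maximum(qn, delta), 1 / delta)
        # 7) relax the pace parameter so harder samples enter training
        k = k / mu
    return P, Q, b, v
```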

 $c = {{\rm{argmin}} _{{c_i}}}{\left(\displaystyle\sum\limits_{j = 1}^m {{{(X_i^j - {y_j})}^2}} \right)^{1/2}}$ (19)
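Eq. (19) reads as a nearest-neighbour rule: a projected test sample $y$ receives the label of the training sample at the smallest Euclidean distance. A sketch (names are illustrative):

```python
import numpy as np

def classify_nearest(train_feats, train_labels, y):
    """Eq. (19): pick the training sample minimizing the Euclidean distance
    (sum_j (X_i^j - y_j)^2)^(1/2) and return its label."""
    dists = np.linalg.norm(train_feats - y, axis=1)
    return train_labels[int(np.argmin(dists))]

feats = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
pred = classify_nearest(feats, labels, np.array([0.9, 1.2]))  # -> 1
```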

3 Experiments

3.1 Datasets

 Fig. 1 Visualization of some datasets with noise
3.2 Experimental results and analysis

 Fig. 2 Classification accuracies of different algorithms in different subspace dimensions on six datasets

 Fig. 4 Reconstructed images of partial JAFFE datasets by different algorithms

3.3 Parameter settings

 ${L_m} = 1/{\left( {k + 1/\beta } \right)^2}$ (20)

 $k = \frac{1}{{2\sqrt {{L_m}} }},\quad \beta = 2\sqrt {{L_m}}$ (21)
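Given a target loss level $L_m$ (Eq. (20)), the rule of Eq. (21) makes the easy-sample threshold of Eq. (8) equal exactly $L_m$: $1/(k + 1/\beta)^2 = L_m$, so precisely the samples with $L_i \leqslant L_m$ start with $v_i = 1$. A sketch (choosing $L_m$ as a loss quantile is an assumption for illustration, not from the paper):

```python
import numpy as np

def init_pace(losses, q=0.6):
    """Eq. (21): k = 1/(2 sqrt(L_m)), beta = 2 sqrt(L_m), with L_m chosen
    here (heuristically) as the q-quantile of the initial losses."""
    Lm = np.quantile(losses, q)
    return 1.0 / (2.0 * np.sqrt(Lm)), 2.0 * np.sqrt(Lm)

k, beta = init_pace([1.0, 4.0, 9.0])
# Consistency with Eq. (20): 1/(k + 1/beta)^2 recovers L_m.
```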