
CAAI Transactions on Intelligent Systems, 2019, Vol. 14, Issue (5): 905-914. DOI: 10.11992/tis.201809018

### Cite this article

ZHANG Xiongtao, HU Wenjun, WANG Shitong. Ensemble deep belief network based on fuzzy partitioning and fuzzy weighting[J]. CAAI Transactions on Intelligent Systems, 2019, 14(5): 905-914. DOI: 10.11992/tis.201809018.


Ensemble deep belief network based on fuzzy partitioning and fuzzy weighting
ZHANG Xiongtao 1,2, HU Wenjun 2, WANG Shitong 1
1. School of Digital Media, Jiangnan University, Wuxi 214122, China;
2. School of Information Engineering, Huzhou University, Huzhou 313000, China
Abstract: To address the high training-time complexity and the tendency to overfit of the deep belief network (DBN), and inspired by fuzzy theory, an ensemble deep belief network based on fuzzy partitioning and fuzzy weighting, FE-DBN (ensemble deep belief network with fuzzy partitioning and fuzzy weighting), is proposed for the classification of large-scale data. The training data are first divided into several subsets by the fuzzy clustering algorithm FCM; DBNs with different structures are then trained in parallel, one on each subset; finally, the outputs of the individual classifiers are combined by fuzzy weighting. Experiments on artificial datasets and UCI datasets show that the proposed FE-DBN outperforms DBN in both accuracy and running time.
Key words: ensemble; deep belief network; fuzzy partition; fuzzy weighting; running time; fuzzy clustering algorithm (FCM); fuzzy theory

1 RBM and DBN

An RBM is an energy-based model whose energy function is defined as

 ${E} ({{v}},{{h}}|{{\theta }}) = - \sum\limits_{i = 1}^n {b_i}{v_i} - \sum\limits_{j = 1}^m {c_j}{h_j} - \sum\limits_{i = 1}^n \sum\limits_{j = 1}^m {v_i}{W_{ij}}{h_j}$

 $\left\{ \begin{split} & P ({{v}},{{h}},{{\theta }}) = \frac{{{{\rm e}^{ - E({{v}},{{h}},{{\theta }})}}}}{Z} \\ & Z = \sum\limits_{\tilde v} {\sum\limits_{\tilde h} {{{\rm e}^{ - E(\tilde v,\tilde h,\theta )}}} } \end{split} \right.$ (1)
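As a sanity check on these definitions, the energy and the joint distribution (1) can be evaluated by brute force for a tiny binary RBM. The sketch below is mine, not from the paper, and is feasible only for a handful of units, since the partition function $Z$ enumerates all $2^n \cdot 2^m$ states:

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """Energy E(v, h) = -b.v - c.h - v.W.h of a binary RBM."""
    return -b @ v - c @ h - v @ W @ h

def joint_probability(v, h, W, b, c):
    """Exact joint P(v, h) via the brute-force partition function Z in eq. (1)."""
    n, m = len(b), len(c)
    Z = 0.0
    for vi in range(2 ** n):                 # enumerate all visible states
        vv = np.array([(vi >> k) & 1 for k in range(n)], dtype=float)
        for hi in range(2 ** m):             # enumerate all hidden states
            hh = np.array([(hi >> k) & 1 for k in range(m)], dtype=float)
            Z += np.exp(-rbm_energy(vv, hh, W, b, c))
    return np.exp(-rbm_energy(v, h, W, b, c)) / Z
```

For any parameter setting, the probabilities of all joint states must sum to one, which is a quick way to confirm the normalization by $Z$.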

2 Ensemble of DBN classifiers based on fuzzy partitioning and fuzzy weighting

2.1 Architecture of FE-DBN

The architecture of FE-DBN is shown in Fig. 3. The fuzzy clustering algorithm FCM first divides the training dataset into K subsets, and each subset is modeled by a DBN with a different structure (the number of hidden nodes per layer differs among the sub-models, yielding K DBN models); the models are trained independently and in parallel. The results of the individual models are finally combined by fuzzy weighting to form the output, with the weights computed from Gaussian-type membership functions. Because each training subset is far smaller than the original dataset, each DBN sub-model needs fewer hidden nodes, so the parallel training of the sub-models in FE-DBN takes less time.

2.2 Implementation

 $\begin{split} & \mathop {\min }\limits_{\mu ,\upsilon } J = \sum\limits_{i = 1}^K \sum\limits_{j = 1}^N \mu _{ij}^m {\left\| {{x_j} - {\upsilon _i}} \right\|^2} \\ & {\rm{s.t.}}\;\;\sum\limits_{i = 1}^K \mu _{ij} = 1,\;\forall j = 1,2, \cdots ,N \end{split}$

 ${\mu _{ij}} = \dfrac{1}{{\displaystyle\sum\limits_{k = 1}^K {{\left(\dfrac{{\left\| {{x_j} - {\upsilon _i}} \right\|^2}}{{\left\| {{x_j} - {\upsilon _k}} \right\|^2}}\right)}^{1/(m - 1)}}}}$ (2)
 ${\upsilon _i} = {\sum\limits_{j = 1}^N {\mu _{ij}^m} {x_j}}\Big/{\sum\limits_{j = 1}^N {\mu _{ij}^m} },\quad 1 \leqslant i \leqslant K$ (3)
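Updates (2) and (3) are alternated until convergence. A minimal NumPy sketch of this loop (the function name and initialization scheme are mine, not from the paper):

```python
import numpy as np

def fcm(X, K, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means: alternate the membership update (2) and center update (3)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((K, N))
    U /= U.sum(axis=0)                                # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)  # eq. (3): fuzzily weighted centers
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=-1)
        d2 = np.maximum(d2, 1e-12)                    # guard against zero distances
        inv = 1.0 / d2 ** (1.0 / (m - 1))
        U = inv / inv.sum(axis=0)                     # eq. (2) in normalized form
    return U, V
```

Normalizing `inv` column-wise is algebraically identical to eq. (2), since dividing by the column sum reproduces the ratio of squared distances raised to $1/(m-1)$.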

 $\gamma _j^s = \sqrt {\dfrac{{\sum\limits_{i = 1}^N {\mu _{ji}^m {{(x_i^s - \upsilon _j^s)}^2}} }}{{\sum\limits_{i = 1}^N {\mu _{ji}^m} }}}$

 $\begin{split} & {\vartheta _j} = \left\{ {\left( {{{{x}}_i},{y_i}} \right)\left| {\upsilon _j^s - \xi \cdot \gamma _j^s \leqslant x_i^s \leqslant \upsilon _j^s + \xi \cdot \gamma _j^s} \right.} \right\} \\ & \qquad\; s = 1,2, \cdots ,q;{\rm{ }}j{\rm{ = 1,2,}} \cdots {\rm{,}}K \end{split}$ (4)
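Given the FCM memberships and centers, the cluster widths and the overlapping subsets of eq. (4) can be formed as follows. This is a sketch under my own naming; `xi` stands for the overlap factor $\xi$:

```python
import numpy as np

def fuzzy_widths(X, U, V, m=2.0):
    """Per-cluster, per-dimension widths gamma_j^s from the FCM memberships."""
    Um = U ** m                                        # (K, N)
    diff2 = (X[None, :, :] - V[:, None, :]) ** 2       # (K, N, q)
    return np.sqrt((Um[:, :, None] * diff2).sum(axis=1) / Um.sum(axis=1)[:, None])

def partition(X, y, V, gamma, xi=1.5):
    """Eq. (4): sample i joins subset j if every coordinate lies within
    xi * gamma_j^s of the center; boundary samples fall into several subsets."""
    subsets = []
    for vj, gj in zip(V, gamma):
        mask = np.all(np.abs(X - vj) <= xi * gj, axis=1)
        subsets.append((X[mask], y[mask]))
    return subsets
```

Because the box around each center is scaled by $\xi$, samples near a cluster boundary satisfy the condition for more than one cluster, which is what makes the subsets overlap.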

 $P({h_j} = 1|{{v}},{{\theta }}) = \sigma ({c_j} + \sum\limits_{i = 1}^n {{v_i}} {W_{ij}})$ (5)

 $P({v_i} = 1|{{h}},{{\theta }}) = \sigma ({b_i} + \sum\limits_{j = 1}^m {{W_{ij}}} {h_j})$ (6)

The RBM parameters are learned with the CD-k (contrastive divergence) algorithm proposed by Hinton, who showed that when ${{{v}}^{\left( 0 \right)}}$ is initialized with a training sample, only a small number of sampling steps (typically k = 1) is needed to obtain a good approximation. With CD-k, the parameters are updated as follows[3]:

 $\begin{split} & \varDelta {w_{ij}} = \varepsilon (\langle {v_i}{h_j}\rangle _{\rm data} - \langle {v_i}{h_j}\rangle _{\rm recon}) \\ & \varDelta {b_i} = \varepsilon (\langle {v_i}\rangle _{\rm data} - \langle {v_i}\rangle _{\rm recon}) \\ & \varDelta {c_j} = \varepsilon (\langle {h_j}\rangle _{\rm data} - \langle {h_j}\rangle _{\rm recon}) \end{split}$ (7)
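One CD-1 parameter update, combining the Gibbs steps (5)-(6) with the gradient estimates (7), can be sketched as follows (a batch version; the learning rate $\varepsilon$ is `lr`, and using the hidden probabilities rather than samples for the statistics is a common choice, not something the paper specifies):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1, rng=None):
    """One CD-1 update for a batch of binary visible vectors v0 (shape B x n)."""
    rng = np.random.default_rng(0) if rng is None else rng
    ph0 = sigmoid(c + v0 @ W)                    # eq. (5): P(h_j = 1 | v)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(b + h0 @ W.T)                  # eq. (6): P(v_i = 1 | h)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(c + v1 @ W)                    # hidden probs of the reconstruction
    B = v0.shape[0]
    dW = (v0.T @ ph0 - v1.T @ ph1) / B           # eq. (7): <v h>_data - <v h>_recon
    db = (v0 - v1).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)
    return W + lr * dW, b + lr * db, c + lr * dc
```

Each call returns updated copies of W, b, and c; stacking such RBM layers and repeating this step per layer is what the DBN pre-training loop in step 3) below amounts to.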

 ${{{\omega }}_k}({{{x}}_i}) = \omega _k^1({{x}}_i^1) \cdot \omega _k^2({{x}}_i^2) \cdots \omega _k^q({{x}}_i^q)$ (8)
 $\omega _k^s({{x}}_i^s) = \max\left\{ {\min \left(\dfrac{{x_i^s - (\upsilon _k^s - \xi \cdot \gamma _k^s)}}{{\xi \cdot \gamma _k^s}},\dfrac{{(\upsilon _k^s + \xi \cdot \gamma _k^s) - x_i^s}}{{\xi \cdot \gamma _k^s}}\right),0} \right\}$ (9)

 ${\hat y}({{{x}}_{{i}}}) = \frac{{\sum\limits_{k = 1}^K {{{{\omega }}_k}({{{x}}_{{i}}}){\rm{LCM}}_{{\rm{DBN}}}^k({{{x}}_{{i}}})} }}{{\sum\limits_{k = 1}^K {{{{\omega }}_k}({{{x}}_{{i}}})} }}$ (10)
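At test time, eqs. (8)-(10) reduce to the following computation. This is a sketch with names of my own choosing; the fallback to equal weights when a sample lies outside every subset is my addition, not something the paper states:

```python
import numpy as np

def membership(x, v, gamma, xi=1.5):
    """Eq. (9): per-dimension membership, peaking at 1 at the center v and
    falling linearly to 0 at distance xi * gamma."""
    half = xi * gamma
    w = np.minimum((x - (v - half)) / half, ((v + half) - x) / half)
    return np.maximum(w, 0.0)

def ensemble_predict(x, centers, gammas, sub_model_outputs, xi=1.5):
    """Eqs. (8) and (10): the weight of sub-model k is the product of its
    per-dimension memberships; the final output is the weight-normalized
    combination of the K sub-model outputs."""
    weights = np.array([membership(x, v, g, xi).prod()
                        for v, g in zip(centers, gammas)])
    if weights.sum() == 0:          # x outside all subsets: fall back to equal weights
        weights = np.ones_like(weights)
    return (weights @ np.asarray(sub_model_outputs)) / weights.sum()
```

A sample near one cluster center gets weight close to 1 for that sub-model and 0 for the others, so the ensemble smoothly interpolates between sub-models only in the overlap regions.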

The FE-DBN algorithm proceeds as follows:

1) Initialization. Set the number of subsets K and the overlap factor $\xi$ , the number of hidden nodes and the number of training epochs of each DBN sub-model; initialize the values of W, b, and c and the learning rate $\varepsilon$ .

2) Subset partitioning. Use the fuzzy clustering algorithm FCM to obtain the center and width of each cluster, and divide the source dataset into K subsets according to Eq. (4).

3) Train the sub-models DBN1~DBNK in parallel. For all hidden units, compute $P({h_j} = 1|{{v}},{{\theta }})$ with Eq. (5) and sample ${h_j} \in \left\{ {0,1} \right\}$ ; for all visible units, compute $P({v_i} = 1|{{h}},{{\theta }})$ with Eq. (6) and sample ${v_i} \in \left\{ {0,1} \right\}$ ; then update the RBM parameters W, b, and c with Eq. (7), i.e.

 ${{W}} = {{W}} + \varDelta {{W}}, \;{{b}} = {{b}} + \varDelta {{b}}, \;{{c}} = {{c}} + \varDelta {{c}}$

4) Use Eqs. (8) and (9) to compute the membership of each test sample to each subset, feed the test sample into the K sub-models obtained in step 3) to produce K classification results, and combine them with Eq. (10) to obtain the final output.

3 Experiments and analysis

3.1 Experimental setup

3.1.1 Datasets

3.1.2 Parameter settings and runtime environment

3.2 Experimental results and analysis

3.2.1 Artificial datasets

3.2.2 UCI datasets