Acta Automatica Sinica (自动化学报)  2018, Vol. 44, Issue 5: 804-810

ZHANG Long, ZHAO Jie-Yu, YE Xu-Lun, DONG Wei
Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211
Manuscript received: September 7, 2017; accepted: February 7, 2018.
Foundation Item: Supported by National Natural Science Foundation of China (61571247), Natural Science Foundation of Zhejiang Province (LZ16F030001), and International Cooperation Projects of Zhejiang Province (2013C24027)
Corresponding author: ZHAO Jie-Yu, Professor at Ningbo University. His research interests cover computer image processing, machine learning, and neural networks.
Abstract: Generative adversarial nets (GANs) combine a generative model with a discriminative model. Trained without supervision, the two models improve each other through an adversarial process, which has set off a new wave of machine learning research in academia. The ultimate goal of GAN training is to fit an arbitrary real-world data distribution. In practice, however, real-world data distributions are difficult to estimate. The major problem is mode collapse, which may lead to redundancy and non-convergence. To improve the unsupervised generator and eliminate the risk of mode collapse, this paper proposes a novel co-operative network structure for GANs. Multiple generative models are constructed with a co-operative mechanism that helps them work together and learn from each other during training. In this way, the fitting ability of the generators is greatly enhanced and the quality of the generated data improves. Experiments are conducted on three different types of benchmark datasets. The results show that the new model significantly improves image generation, especially for human face pictures. In addition, the co-operative mechanism speeds up convergence, improves the network's learning efficiency, and reduces noise in the loss function. It also helps in 3D model generation and suppresses mode collapse. To resolve the inconsistency between the generative model and the discriminative model, a dynamic learning method is developed that adjusts the learning frequency dynamically, which ultimately reduces unnecessary gradient penalties.
Key words: generative adversarial nets (GANs), co-operative, mode collapse, generative model, unsupervised learning

1 Related Work

2 Co-operative Generative Adversarial Nets

2.1 Generative Adversarial Nets

3 Experimental Results

3.1 MNIST Handwritten Digits

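The MNIST handwritten digit dataset contains 70 000 images of handwritten digits in 10 classes, from 0 to 9 [23-24]. The training results are shown in Figure 4. The co-operative factor disturbs the results early in training, but after 1 000 iterations the co-operative GANs gradually overtake the standard GANs, and they begin to converge after 2 000 iterations. This indicates that the proposed network structure not only improves the quality of the generated images but also speeds up model convergence.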

Figure 4 Training results on the MNIST handwritten digits dataset (upper row: standard GANs; lower row: co-operative GANs)

3.2 CelebA Human Faces

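The CelebA dataset contains 202 599 face photographs with varied poses and cluttered backgrounds [25-26]. We build a deep convolutional generative adversarial net (DCGAN) [9] in which the generator and the discriminator each have 5 layers. The input is a 100-dimensional vector sampled at random from a uniform distribution. The numbers of convolution filters in the successive layers are 1 024, 512, 256, 128, and 3; the kernel size is 4 $\times$ 4 and the stride is 2. The generator outputs face images at a resolution of 64 $\times$ 64. During training the mini-batch size is set to 64, so one epoch consists of 3 166 batches.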
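The layer sizes and batch count quoted above can be checked with a little arithmetic. The sketch below is not the authors' code. It assumes the common DCGAN layout in which the first of the five layers projects the 100-dimensional input to a 4 $\times$ 4 $\times$ 1 024 feature map and the remaining four are transposed convolutions with padding 1. The padding, the 4 $\times$ 4 starting size, and the sampling range [-1, 1] are assumptions; the text gives only the filter counts, kernel size, and stride. The helper names `deconv_out`, `generator_shapes`, `batches_per_epoch`, and `sample_latent` are hypothetical.

```python
import math
import random


def deconv_out(size, kernel=4, stride=2, padding=1):
    """Output width of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def generator_shapes(filters=(1024, 512, 256, 128, 3), start=4):
    """Return (height, width, channels) after each generator layer.

    Assumption: layer 1 projects the latent vector to
    start x start x filters[0]; every later layer is a 4 x 4,
    stride-2 transposed convolution.
    """
    size = start
    shapes = [(size, size, filters[0])]
    for channels in filters[1:]:
        size = deconv_out(size)
        shapes.append((size, size, channels))
    return shapes


def batches_per_epoch(num_images=202599, batch_size=64):
    """Number of mini-batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


def sample_latent(batch_size=64, dim=100, rng=random):
    """Draw a batch of latent vectors from a uniform distribution.

    Assumption: the range [-1, 1] is illustrative; the text does not state it.
    """
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(batch_size)]


if __name__ == "__main__":
    for shape in generator_shapes():
        print(shape)
    print(batches_per_epoch())
```

Under these assumptions the last shape is (64, 64, 3), which matches the stated output resolution, and 202 599 images at 64 per batch give 3 166 batches per epoch, the last of which is only partly filled.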

Figure 5 Training results on the CelebA human faces dataset (left: DCGAN; right: co-operative GANs; (a) after 500 iterations; (b) after 1 000 iterations; (c) $\sim$ (h) after 1 $\sim$ 6 epochs)

Figure 6 Comparison of generated results on the CelebA dataset

Figure 7 Changes in the loss values of the discriminator and generator models

3.3 ModelNet40 3D Models

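ModelNet [27-28] is a well-known large-scale dataset in the 3D domain; it contains 127 915 3D CAD models. ModelNet40 is a subset containing 12 312 labeled 3D models in 40 classes. To verify that the new model also applies to 3D object generation, we first voxelize the 3D mesh models in ModelNet40 and then modify the network structure of Section 3.2 so that it can process 3D voxel data. The specific parameters follow 3DGAN [29]: the input is a 200-dimensional vector sampled at random from a uniform distribution, the generator outputs a 64 $\times$ 64 $\times$ 64 voxel model, and the mini-batch size is set to 5 (smaller values give better results but slower training).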
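As a rough illustration of the voxelization step, the sketch below maps mesh vertices into a 64 $\times$ 64 $\times$ 64 occupancy grid. It is a simplification and not the authors' pipeline: a real voxelizer would also fill the faces between vertices. The function name `voxelize_points`, the uniform scaling by the longest bounding-box side, and the sparse set-of-indices representation are all assumptions made for this example.

```python
def voxelize_points(points, resolution=64):
    """Map 3D points to the set of occupied (i, j, k) voxel indices.

    Assumption: the points are scaled uniformly so that the longest side of
    their bounding box spans the grid; only vertices are marked, faces are not.
    """
    if not points:
        return set()
    mins = [min(p[axis] for p in points) for axis in range(3)]
    maxs = [max(p[axis] for p in points) for axis in range(3)]
    # Fall back to 1.0 when all points coincide, to avoid dividing by zero.
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    occupied = set()
    for p in points:
        index = []
        for axis in range(3):
            scaled = (p[axis] - mins[axis]) / extent * resolution
            # Clamp so a point on the far boundary lands in the last voxel.
            index.append(min(int(scaled), resolution - 1))
        occupied.add(tuple(index))
    return occupied


if __name__ == "__main__":
    # Corners of a unit cube, standing in for mesh vertices.
    cube = [(x, y, z)
            for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
    voxels = voxelize_points(cube)
    print(len(voxels), "occupied voxels out of", 64 ** 3)
```

A dense 64 $\times$ 64 $\times$ 64 grid has 262 144 cells per model, which is one plausible reason for keeping the mini-batch small in the 3D setting; the sparse set used here only stores the occupied cells.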

Figure 8 Results of co-operative GANs on the ModelNet40 dataset

4 Conclusion