CAAI Transactions on Intelligent Systems, 2019, Vol. 14, Issue 6: 1144-1151. DOI: 10.11992/tis.201905041

SHAN Yi, YANG Jinfu, WU Suishuo, et al. Skip feature pyramid network with a global receptive field for small object detection[J]. CAAI Transactions on Intelligent Systems, 2019, 14(6): 1144-1151. DOI: 10.11992/tis.201905041.


Skip feature pyramid network with a global receptive field for small object detection
SHAN Yi 1,2, YANG Jinfu 1,2, WU Suishuo 1,2, XU Bingbing 1,2
1. Beijing University of Technology, Faculty of Information Technology, Beijing 100124, China;
2. Beijing Key Laboratory of Computational Intelligence and Intelligence System, Beijing 100124, China
Abstract: With the development of deep learning, objects can be detected with high accuracy and efficiency. However, the detection of small objects remains challenging, mainly because the relationship between high-level semantic information and low-level feature maps is not fully utilized. To solve this problem, we propose a novel detection framework, called the skip feature pyramid network with a global receptive field, to improve the ability to detect small objects. Unlike previous detection architectures, the skip feature pyramid architecture fuses high-level semantic information with low-level feature maps to obtain detailed information. To extract global information from the network, we apply a global receptive field (GRF) module that uses convolution kernels of different sizes and dilated convolutions with different dilation rates. The experimental results on the PASCAL VOC and MS COCO datasets show that the proposed approach achieves significant improvements over other comparable detection models.
Key words: skip feature pyramid network; global receptive field; object detection; deep learning; feature extraction; convolutional neural network; dilated convolution; image processing

1 Algorithm model

Fig. 1 Skip feature pyramid network with a global receptive field for small object detection

1.1 Skip feature pyramid

 $o = \left\lfloor \frac{i - f + 2p}{s} \right\rfloor + 1$ (1)
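
Eq. (1) has the form of the standard output-size relation for a convolution layer. The following is a minimal sketch, not the authors' code. It assumes that $i$ is the input size, $f$ the kernel size, $p$ the padding, and $s$ the stride along one spatial dimension, and that the bracketed quotient is rounded down; the text defining these symbols is not available here. The function name and the 320-pixel input are hypothetical and chosen only for illustration.

```python
def conv_output_size(i: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output size o of Eq. (1) along one spatial dimension.

    Assumed meanings: i = input size, f = kernel size,
    p = padding on each side, s = stride.
    """
    if s <= 0:
        raise ValueError("stride must be positive")
    # Integer division implements the rounding-down of the quotient.
    return (i - f + 2 * p) // s + 1


# A 3x3 kernel with padding 1 and stride 1 keeps the side length,
# while the same kernel with stride 2 halves it.
same = conv_output_size(320, f=3, p=1, s=1)
halved = conv_output_size(320, f=3, p=1, s=2)
```

Under these assumptions, stacking stride-2 layers produces the progressively smaller feature maps that a pyramid-style detector fuses across levels.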

Fig. 3 Detailed structure of the skip feature pyramid
1.2 Global receptive field module

Fig. 4 Structure of the global receptive field
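
The abstract describes the GRF as combining convolution kernels of different sizes with dilated convolutions. As a rough illustration of why this enlarges the receptive field, the sketch below computes the spatial extent covered by a dilated kernel using the standard relation $k_{\rm eff} = k + (k - 1)(d - 1)$. The branch list is a hypothetical configuration chosen for illustration; it is not the configuration shown in Fig. 4, and the function names are assumptions.

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Spatial extent covered by a k x k kernel with dilation rate d."""
    if k < 1 or d < 1:
        raise ValueError("kernel size and dilation rate must be >= 1")
    return k + (k - 1) * (d - 1)


def same_padding(k: int, d: int) -> int:
    """Padding that preserves the spatial size at stride 1 (odd k)."""
    return (effective_kernel_size(k, d) - 1) // 2


# Hypothetical (kernel size, dilation rate) pairs, NOT taken from the paper.
HYPOTHETICAL_BRANCHES = [(1, 1), (3, 1), (3, 3), (5, 2)]

# One row per branch: (k, d, effective kernel size, size-preserving padding).
summary = [
    (k, d, effective_kernel_size(k, d), same_padding(k, d))
    for k, d in HYPOTHETICAL_BRANCHES
]
```

A 3x3 kernel with dilation rate 3 covers a 7x7 window with only nine weights, which is why parallel branches with different dilation rates can gather wider context at little parameter cost.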
1.3 Bounding box settings

1.4 Loss function

 $L(\{p_i\},\{x_i\},\{c_i\},\{t_i\}) = \frac{1}{N_{\rm conv}}\left(\sum\limits_i l_b(p_i,[l_i^* \geqslant 1]) + \sum\limits_i [l_i^* \geqslant 1]\, l_r(x_i,g_i^*)\right) + \frac{1}{N_p}\left(\sum\limits_i l_m(c_i,l_i^*) + \sum\limits_i [l_i^* \geqslant 1]\, l_r(t_i,g_i^*)\right)$ (2)

 $l_r(x,g^*,l^*) = \sum\limits_{i \in {\rm Pos}}^N \sum\limits_{m \in \{cx,cy,w,h\}} [l^* \geqslant 1]\,{\rm smooth}_{L1}(x_i^m - \widehat g_j^m)$ (3)
 $\widehat g_j^{cx} = (g_j^{cx} - d_i^{cx})/d_i^w,\quad \widehat g_j^{cy} = (g_j^{cy} - d_i^{cy})/d_i^h$ (4)
 $\widehat g_j^w = \log\left(\frac{g_j^w}{d_i^w}\right),\quad \widehat g_j^h = \log\left(\frac{g_j^h}{d_i^h}\right)$ (5)
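
Eqs. (3)-(5) can be sketched for a single matched pair as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: $d_i$ is taken to be a default (anchor) box and $g_j$ its matched ground-truth box, both as (cx, cy, w, h); $x_i$ is the predicted offset vector; and smooth L1 uses the common definition with a threshold of 1. None of these definitions is given in the text available here, and all function names are hypothetical.

```python
import math


def smooth_l1(z: float) -> float:
    """Common smooth L1: quadratic for |z| < 1, linear otherwise (assumed)."""
    a = abs(z)
    return 0.5 * z * z if a < 1.0 else a - 0.5


def encode_box(g, d):
    """Eqs. (4)-(5): encode box g relative to box d, both (cx, cy, w, h)."""
    g_cx, g_cy, g_w, g_h = g
    d_cx, d_cy, d_w, d_h = d
    return (
        (g_cx - d_cx) / d_w,   # centre offset, scaled by the default width
        (g_cy - d_cy) / d_h,   # centre offset, scaled by the default height
        math.log(g_w / d_w),   # log ratio of widths
        math.log(g_h / d_h),   # log ratio of heights
    )


def regression_loss(x, g, d):
    """Inner sum of Eq. (3) over m in {cx, cy, w, h} for one positive pair."""
    return sum(smooth_l1(xm - gm) for xm, gm in zip(x, encode_box(g, d)))


# Illustrative boxes in normalized coordinates (made-up values).
default_box = (0.5, 0.5, 0.2, 0.2)
truth_box = (0.6, 0.5, 0.4, 0.2)
target = encode_box(truth_box, default_box)
loss_if_zero_prediction = regression_loss((0.0, 0.0, 0.0, 0.0), truth_box, default_box)
```

Encoding the targets relative to the default box makes the regression scale-invariant: a fixed pixel error weighs more for a small box than for a large one, which matters for small objects.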

2 Experimental results and analysis

2.1 PASCAL VOC

Fig. 5 Visual comparison of experimental results on the VOC2007 test set
2.2 MS COCO

3 Conclusions

