
智能系统学报 (CAAI Transactions on Intelligent Systems), 2018, Vol. 13, Issue (3): 388-394. DOI: 10.11992/tis.201612040

### Cite this article

WANG Hongxiang, LIU Peizhong, LUO Yanmin, et al. Convolutional neural network tracking algorithm accelerated by Gaussian kernel function[J]. CAAI Transactions on Intelligent Systems, 2018, 13(3): 388-394. DOI: 10.11992/tis.201612040.


Convolutional neural network tracking algorithm accelerated by Gaussian kernel function
WANG Hongxiang1, LIU Peizhong1, LUO Yanmin2, DU Yongzhao1, CHEN Zhi1
1. College of Engineering, Huaqiao University, Quanzhou 362021, China;
2. College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
Abstract: To address shortcomings of deep-learning tracking algorithms, such as the lack of training samples, heavy time consumption, and high complexity, this paper proposes a simplified convolutional neural network tracking algorithm that requires no training; moreover, a Gaussian kernel function is applied to significantly reduce the computing time. First, the target in the initial frame is normalized and clustered to extract a set of initial filter banks. During tracking, the background information of the target and the candidate foreground targets are convolved to extract simple, abstract features of the target; finally, all the convolutions of the simple layer are superposed to form a deep-level feature representation. The Gaussian kernel function is used to speed up the convolution operations, the local structural feature information of the target is used to update the filters at every stage of the network, and tracking is realized within a particle-filter framework. Experimental results on the CVPR2013 tracking datasets show that the proposed method avoids the typically cumbersome operating environment of deep learning, handles partial occlusion and deformation of the target at low resolution, and improves tracking efficiency against complex backgrounds.
Key words: visual tracking    deep learning    convolutional neural network (CNN)    Gaussian kernel function    foreground object    background information    template matching    particle filter

1) Target appearance modeling

2) Tracking strategy

1 Related work

Since 2013, deep-learning algorithms have made great progress in the tracking field. Deep-learning methods such as deep neural networks and convolutional neural networks can mine multi-level representations of the data, and the higher-level representations better reflect the deeper nature of the data. Compared with traditional shallow learned features, tracking algorithms based on high-level features can therefore improve tracking efficiency[16].

1.1 CNN feature-extraction structure

1.2 Tracking algorithms based on deep learning

2 Gaussian-kernel convolutional neural network tracking algorithm

2.1 Kernel-function convolution

$k(\boldsymbol{x}, \boldsymbol{x}') = \exp\left( - \dfrac{1}{\sigma^2}\left( \left\| \boldsymbol{x} \right\|^2 + \left\| \boldsymbol{x}' \right\|^2 - 2F^{-1}\Big( \sum\limits_d \hat{\boldsymbol{x}}^{*} \odot \hat{\boldsymbol{x}}' \Big) \right) \right)$ (1)

$\boldsymbol{\alpha} = (\boldsymbol{K} + \lambda \boldsymbol{I})^{-1} \boldsymbol{y}$ (2)

$\hat{\boldsymbol{\alpha}}^{*} = \hat{\boldsymbol{y}} \times (\hat{\boldsymbol{k}}^{xx'} + \lambda)^{-1}$ (3)
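As a concrete illustration, the kernel correlation of Eq. (1) and the Fourier-domain ridge-regression solution of Eq. (3) can be sketched in NumPy as follows. The per-element normalization of the squared distance and the `sigma` default are implementation conventions borrowed from common KCF implementations, not values specified in the text, and the conjugate on $\hat{\boldsymbol{y}}$ is omitted because the label map is real:

```python
import numpy as np

def gaussian_kernel_correlation(x, xp, sigma=0.5):
    """Kernel correlation k(x, x') of Eq. (1), evaluated in the Fourier
    domain so the cost drops from a dense convolution to FFTs.
    x, xp: 2-D patches, or stacks of d channels along axis 0."""
    if x.ndim == 2:                        # promote a single channel to d = 1
        x, xp = x[None], xp[None]
    xf, xpf = np.fft.fft2(x), np.fft.fft2(xp)
    # F^{-1}( sum_d  conj(x_hat) . x'_hat )  -- the cross term of Eq. (1)
    cross = np.fft.ifft2(np.sum(np.conj(xf) * xpf, axis=0)).real
    dist2 = np.sum(x ** 2) + np.sum(xp ** 2) - 2.0 * cross
    # clamp tiny negative values from floating-point error; normalize by size
    return np.exp(-np.maximum(dist2, 0) / (sigma ** 2 * x.size))

def train_alpha_hat(k_xx, y, lam=1e-4):
    """Closed-form solution of Eq. (3): element-wise division of the
    Fourier-transformed labels by the kernel autocorrelation plus lambda."""
    return np.fft.fft2(y) / (np.fft.fft2(k_xx) + lam)
```

Because every entry of the kernel matrix of Eq. (2) is produced by one FFT pass, the $O(n^3)$ matrix inversion collapses to the element-wise division of Eq. (3).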

2.2 Feature extraction

$\boldsymbol{S}_i^o = \boldsymbol{F}_i^o \otimes \boldsymbol{I}, \quad \boldsymbol{S}_i^o \in \mathbb{R}^{(n - w + 1)^2}$ (4)

$\boldsymbol{F}_l = \left\{ \boldsymbol{F}_1^b, \boldsymbol{F}_2^b, \cdots, \boldsymbol{F}_l^b \right\}$ (5)

$\boldsymbol{S}_i = \boldsymbol{S}_i^o - \boldsymbol{S}_i^b = (\boldsymbol{F}_i^o - \boldsymbol{F}_i^b) \otimes \boldsymbol{I}, \quad i \in \{1, 2, \cdots, d\}$ (6)

$\boldsymbol{C} \in \mathbb{R}^{(n - w + 1) \times (n - w + 1) \times d}$ (7)
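The feature-extraction stage of Eqs. (4)–(7) can be sketched as follows. `conv_valid` is a direct (naive) implementation of the valid-mode convolution $\otimes$, written out for clarity; an actual implementation would use the Gaussian-kernel FFT acceleration of section 2.1:

```python
import numpy as np

def conv_valid(img, filt):
    """Valid-mode 2-D convolution of an n x n image with a w x w filter,
    giving the (n - w + 1) x (n - w + 1) response map of Eq. (4)."""
    n, w = img.shape[0], filt.shape[0]
    out = np.empty((n - w + 1, n - w + 1))
    for r in range(n - w + 1):
        for c in range(n - w + 1):
            out[r, c] = np.sum(img[r:r + w, c:c + w] * filt)
    return out

def extract_features(img, fg_filters, bg_filters):
    """Stack the d background-suppressed simple-cell maps of Eq. (6),
    (F_i^o - F_i^b) (x) I, into the deep representation C of Eq. (7),
    of shape (n - w + 1, n - w + 1, d)."""
    maps = [conv_valid(img, fo - fb) for fo, fb in zip(fg_filters, bg_filters)]
    return np.stack(maps, axis=-1)
```

By linearity of convolution, subtracting the two filter banks before convolving (as in Eq. (6)) halves the number of convolutions compared with computing $\boldsymbol{S}_i^o$ and $\boldsymbol{S}_i^b$ separately.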

2.3 Particle filter

$p(\boldsymbol{S}_t \mid \boldsymbol{Z}_t) \propto p(\boldsymbol{Z}_t \mid \boldsymbol{S}_t) \int p(\boldsymbol{S}_t \mid \boldsymbol{S}_{t - 1}) \, p(\boldsymbol{S}_{t - 1} \mid \boldsymbol{Z}_{t - 1}) \, {\rm{d}} \boldsymbol{S}_{t - 1}$ (8)

$p(\boldsymbol{S}_t \mid \boldsymbol{S}_{t - 1}) = N(\boldsymbol{S}_t \mid \boldsymbol{S}_{t - 1}, \Sigma)$ (9)

$p(\boldsymbol{Z}_t \mid \boldsymbol{S}_t^i) \propto {\rm{e}}^{ - \left\| {\rm{vec}}(\boldsymbol{C}_t) - {\rm{vec}}(\boldsymbol{C}_t^i) \right\|_2}$ (10)

$\hat{\boldsymbol{S}}_t = \arg \max_{\{ \boldsymbol{S}_t^i \}_{i = 1}^{N}} p(\boldsymbol{Z}_t \mid \boldsymbol{S}_t^i) \, p(\boldsymbol{S}_t^i)$ (11)
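A minimal sketch of one particle-filter iteration, Eqs. (8)–(11), under simplifying assumptions: `feat_fn` stands in for the convolutional feature extraction of section 2.2, the motion model of Eq. (9) is taken as an isotropic random walk, and a uniform prior over particles is assumed so the MAP estimate of Eq. (11) reduces to the highest-likelihood particle:

```python
import numpy as np

def particle_filter_step(particles, template_feat, feat_fn, sigma_motion, rng):
    """One tracking update: propagate states with the Gaussian motion
    model of Eq. (9), weight each particle with the likelihood of
    Eq. (10), and return the MAP particle of Eq. (11)."""
    # Eq. (9): Gaussian random-walk propagation of the particle states
    particles = particles + rng.normal(0.0, sigma_motion, particles.shape)
    # Eq. (10): likelihood from the L2 distance between the vectorized
    # feature map of each candidate and that of the template
    dists = np.array([np.linalg.norm(template_feat.ravel()
                                     - feat_fn(s).ravel())
                      for s in particles])
    weights = np.exp(-dists)
    weights /= weights.sum()
    # Eq. (11): MAP estimate over the particle set (uniform prior)
    return particles[np.argmax(weights)], particles, weights
```

With `feat_fn` set to the real feature extractor, the returned weights approximate the posterior of Eq. (8) over the sampled states.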
2.4 Tracking algorithm

1) Input: input the video sequence and specify the tracking target.

2) Initialization: set the normalization, particle-filter, network-scale, sample-size, and other parameters.

3) Initial filter extraction: from the target in the first frame, extract an initial filter bank by sliding windows and K-means clustering; this bank serves as the filters of the subsequent network.

4) Convolutional feature extraction: extract the deep abstract features of each candidate sample with the convolutional network structure described above, accelerating the convolutions with the Gaussian kernel function.

5) Particle filtering: following the particle-filter algorithm, generate a normalized candidate image-sample set of the prescribed size and perform target recognition and matching.

6) Network update: a fixed threshold is used; when the highest confidence among all particles falls below it, the target appearance is considered to have changed substantially and the current network can no longer adapt, so it must be updated. A weighted average of the initial filter bank and the foreground filter bank obtained during tracking yields the new convolutional filters.

7) Template update: sample equal-sized patches within ±1 pixel of the target center in the first frame to form the positive-sample set, and sample at near and far distances from the current target to form the negative-sample set. To reduce drift during tracking, a preset update threshold f = 5 is used, i.e., the target template is updated every 5 frames.
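The update rules of steps 6) and 7) above can be sketched as follows; the blending weight `beta` is a hypothetical parameter, since the text specifies only a weighted average of the two filter banks:

```python
import numpy as np

def update_filters(init_bank, fg_bank, beta=0.85):
    """Step 6: when the best particle's confidence falls below the
    threshold, blend the initial filter bank with the foreground
    filters collected during tracking (weighted average)."""
    return [beta * f0 + (1.0 - beta) * ff
            for f0, ff in zip(init_bank, fg_bank)]

def should_update_template(frame_idx, f=5):
    """Step 7: the target template is refreshed every f = 5 frames
    to reduce drift."""
    return frame_idx % f == 0
```

Keeping a large weight on the initial bank anchors the filters to the first-frame appearance, which is the drift-resistance rationale given in step 7).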

3 Experimental results and analysis

3.1 Qualitative analysis

Fig. 3 Examples of the tracking results on video sequences
3.2 Quantitative analysis