Sonar image target recognition based on deep learning
Ship Science and Technology, 2020, Vol. 42, Issue 12: 133-136. DOI: 10.3404/j.issn.1672-7649.2020.12.026

ZHANG Jia-ming, DING Ying-ying
Jiangsu Automation Research Institute, Lianyungang 222061, China
Abstract: Sonar is the main equipment that uses sound waves to probe the ocean. Since its invention, it has served as the main tool for underwater detection, positioning, and communication. Acquired sonar data presents target information in the form of images. Owing to the influence of the ocean channel and the limitations of the receiving array, there is no fully reliable model-based method for processing sonar images. Deep learning has been widely applied to image recognition and target recognition in recent years. Based on the main characteristics of sonar images, this paper proposes a sonar image target recognition method based on a convolutional neural network. First, median filtering is applied to the sonar image, and then the Canny edge detection algorithm and the Hough transform are used to detect white lines. Targets are segmented with an adaptive thresholding image segmentation algorithm, and a Kalman filter is used to track them. Finally, a convolutional neural network classifies and recognizes the tracked targets. The method achieves high recognition accuracy for different sonar image targets.
Key words: image target recognition; median filtering; edge detection; image segmentation; target tracking
0 Introduction

1 Image preprocessing
1.1 Median filtering

1.2 White line detection

Fig. 1 Schematic diagram of Canny edge detection effect

$r = x_0\cos\theta + y_0\sin\theta.$ (1)
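
Equation (1) writes a line in normal form $(r, \theta)$, which is the parameterization the Hough transform votes over. As a minimal sketch of how two detected white lines could yield the sector apex, the snippet below solves the two normal-form equations for their intersection. The function names `hough_r` and `intersect`, the apex coordinates, and the two angles are illustrative assumptions, not values from the paper.

```python
import math

def hough_r(x0, y0, theta):
    """Equation (1): r for the line through (x0, y0) whose normal has angle theta."""
    return x0 * math.cos(theta) + y0 * math.sin(theta)

def intersect(line_a, line_b, eps=1e-9):
    """Intersection of two lines given as (r, theta); None if they are (nearly) parallel."""
    r1, t1 = line_a
    r2, t2 = line_b
    # Solve [cos t1  sin t1] [x]   [r1]
    #       [cos t2  sin t2] [y] = [r2]   with Cramer's rule.
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < eps:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y

# Hypothetical sector apex and two edge lines passing through it.
apex = (100.0, 20.0)
theta_a, theta_b = math.radians(30), math.radians(150)
line_a = (hough_r(*apex, theta_a), theta_a)
line_b = (hough_r(*apex, theta_b), theta_b)
center = intersect(line_a, line_b)
```

With more than two detected lines, one reasonable choice would be to average the pairwise intersections or solve a least-squares system; the source does not say which approach the authors used.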

2 Image segmentation

3 Target tracking

$X(k) = AX(k - 1) + BU(k) + W(k).$ (2)

$X(k|k - 1) = AX(k - 1|k - 1) + BU(k),$ (3)

$P(k|k - 1) = AP(k - 1|k - 1)A^{\rm{T}} + Q,$ (4)

$Z(k) = HX(k) + V(k),$ (5)

$X(k|k) = X(k|k - 1) + K_g(k)(Z(k) - HX(k|k - 1)),$ (6)

$K_g(k) = \frac{P(k|k - 1)H^{\rm{T}}}{HP(k|k - 1)H^{\rm{T}} + R}.$ (7)

$P(k|k) = (I - K_g(k)H)P(k|k - 1).$ (8)
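
To make the recursion in equations (3), (4), and (6)-(8) concrete, here is a minimal pure-Python sketch. It assumes a constant-velocity state (position, velocity), a scalar position measurement so that the division in equation (7) is an ordinary division, and no control input ($BU(k) = 0$). The model matrices, noise levels, and synthetic measurements are illustrative assumptions; the paper's actual state vector is not given in this excerpt.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def predict(x, P, A, Q):
    """Equations (3) and (4), with the control term B U(k) assumed to be zero."""
    x_pred = matmul(A, x)
    P_pred = add(matmul(matmul(A, P), transpose(A)), Q)
    return x_pred, P_pred

def update(x_pred, P_pred, z, H, R):
    """Equations (6)-(8) for a scalar measurement z."""
    PHt = matmul(P_pred, transpose(H))            # P H^T, shape n x 1
    S = matmul(H, PHt)[0][0] + R                  # H P H^T + R, a scalar here
    K = [[row[0] / S] for row in PHt]             # Kalman gain, eq. (7)
    innovation = z - matmul(H, x_pred)[0][0]      # Z(k) - H X(k|k-1)
    x = [[xp[0] + k[0] * innovation] for xp, k in zip(x_pred, K)]  # eq. (6)
    KH = matmul(K, H)
    n = len(P_pred)
    I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(n)]
            for i in range(n)]
    P = matmul(I_KH, P_pred)                      # eq. (8)
    return x, P

# Assumed model: state = [position, velocity], one frame per time step.
A = [[1.0, 1.0], [0.0, 1.0]]
H = [[1.0, 0.0]]                    # only position is measured
Q = [[0.01, 0.0], [0.0, 0.01]]      # assumed process noise covariance
R = 1.0                             # assumed measurement noise variance

x = [[0.0], [0.0]]                  # initial guess
P = [[100.0, 0.0], [0.0, 100.0]]    # large initial uncertainty
for k in range(1, 31):
    z = 2.0 * k                     # synthetic noise-free position, 2 units per frame
    x, P = predict(x, P, A, Q)
    x, P = update(x, P, z, H, R)

position, velocity = x[0][0], x[1][0]
```

In a tracker, `z` would come from the segmented target matched to the prediction, and a two-dimensional image position would need a vector measurement with a matrix inverse in place of the scalar division.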

Fig. 2 Kalman filter flow chart

4 Target recognition

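MobileNet is a lightweight convolutional neural network proposed by Google in 2017 for embedded and mobile devices. It is built on depthwise separable convolution, which factorizes a standard convolution into two steps: a depthwise convolution that filters each input channel separately, and a pointwise ($1\times 1$) convolution that combines the filtered channels. Computing the convolution by filtering and then combining removes the coupling between kernel size and channel count found in a standard convolution, which greatly reduces redundancy in the convolution kernels, shrinks the parameter count, and speeds up computation. MobileNet also introduces two global hyperparameters, a width multiplier and a resolution multiplier, to trade off network latency against accuracy.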
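
A short calculation shows where the savings of depthwise separable convolution come from. The sketch below counts weights (ignoring biases) for a standard convolution and for its depthwise plus pointwise factorization, and applies a width multiplier. The kernel size, channel counts, and multiplier value are illustrative assumptions.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out

def separable_conv_weights(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixing channels
    return depthwise + pointwise

def scaled(channels, alpha):
    """Apply the width multiplier alpha to a channel count (at least one channel)."""
    return max(1, int(channels * alpha))

k, c_in, c_out = 3, 32, 64
std = standard_conv_weights(k, c_in, c_out)
sep = separable_conv_weights(k, c_in, c_out)
ratio = sep / std                 # equals 1 / c_out + 1 / k**2

# Halving the width (alpha = 0.5) shrinks both channel counts.
thin = separable_conv_weights(k, scaled(c_in, 0.5), scaled(c_out, 0.5))
```

For a $3\times 3$ kernel the ratio approaches $1/9$ as the number of output channels grows, so the factorized layer needs roughly one eighth to one ninth of the weights.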

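The Inverted Residuals module of MobileNet_V2 focuses on the channel dimension of each layer in the residual block and follows an "expand, depthwise convolve, compress" scheme: a $1\times 1$ convolution first expands the feature dimension of the input, a $3\times 3$ depthwise convolution then extracts features, and a final $1\times 1$ pointwise convolution compresses the expanded features back down. This removes the restriction that feature extraction be limited by the number of input channels. The Linear Bottleneck module replaces the ReLU activation after the final $1\times 1$ compression convolution with a linear activation, while the other layers keep ReLU; using a linear rather than nonlinear activation there preserves the manifold of interest and avoids the information loss that ReLU causes in low-dimensional activation spaces. Together the two modules form the basic building block of the MobileNet_V2 network, as shown in Fig. 3.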
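
The expand, depthwise, compress sequence can be traced by following channel counts through one block. The sketch below is a bookkeeping aid, not a trainable layer: it lists each stage with its input and output channels, weight count (biases and normalization ignored), and activation. The expansion factor `t=6` and the 24-channel example are assumptions taken for illustration; the shortcut rule (stride 1 and equal input and output channels) follows the usual MobileNetV2 design.

```python
def inverted_residual(c_in, c_out, t=6, k=3, stride=1):
    """Describe one inverted residual block as (name, in_ch, out_ch, weights, activation)."""
    hidden = c_in * t                       # expanded feature dimension
    stages = [
        ("expand 1x1", c_in, hidden, c_in * hidden, "ReLU"),
        ("depthwise %dx%d" % (k, k), hidden, hidden, k * k * hidden, "ReLU"),
        ("project 1x1", hidden, c_out, hidden * c_out, "linear"),  # linear bottleneck
    ]
    # The residual shortcut only applies when input and output shapes match.
    use_shortcut = stride == 1 and c_in == c_out
    return stages, use_shortcut

stages, use_shortcut = inverted_residual(24, 24)
total_weights = sum(stage[3] for stage in stages)
for name, ch_in, ch_out, weights, act in stages:
    print(f"{name:14s} {ch_in:4d} -> {ch_out:4d}  weights={weights:5d}  act={act}")
```

Only the last stage is linear, which matches the Linear Bottleneck idea described above: the nonlinearity is dropped exactly where the representation is narrowest.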

Fig. 3 MobileNet_V2 network microstructure
5 Sonar image target recognition method based on convolutional neural network

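1) Apply median filtering to the sonar image to remove noise, run the Canny edge detection algorithm to extract the edges of the white lines, then use the Hough transform to extract straight lines and solve for the center of the circle containing the sector from the intersection of those lines;

2) Segment the preprocessed sonar image with an adaptive thresholding algorithm, use the watershed algorithm to join targets of similar gray level, and find the connected regions in the segmented image to obtain target information;

3) Match the extracted target information against the prediction made by the Kalman filter from the previous frame using a matching algorithm, and update the Kalman filter with the matched measurements to track the target;

4) Feed the tracked target image data into the convolutional neural network, which extracts features automatically and recognizes the sonar image target.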
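
The filtering and segmentation parts of steps 1) and 2) can be sketched on a toy image in pure Python. The snippet assumes a local-mean form of adaptive thresholding and 4-connectivity for region search; the window sizes, the offset, and the synthetic image are illustrative assumptions, and the watershed step is omitted. The paper's exact variants are not stated in this excerpt.

```python
from collections import deque
from statistics import median

def window(img, y, x, r):
    """Pixel values in the (2r+1) x (2r+1) neighborhood, clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]

def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k neighborhood."""
    r = k // 2
    return [[median(window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]

def adaptive_threshold(img, k=5, offset=10):
    """Foreground (1) where a pixel exceeds its local k x k mean by more than offset."""
    r = k // 2
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            values = window(img, y, x, r)
            row.append(1 if img[y][x] > sum(values) / len(values) + offset else 0)
        out.append(row)
    return out

def connected_regions(mask):
    """List of regions, each a list of (y, x) foreground pixels (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, region = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    region.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions

# Synthetic 12 x 12 image: dark background, one bright 3 x 3 target, one noise pixel.
image = [[20] * 12 for _ in range(12)]
for y in range(4, 7):
    for x in range(4, 7):
        image[y][x] = 200
image[1][9] = 255                     # isolated speckle

filtered = median_filter(image)
mask = adaptive_threshold(filtered)
regions = connected_regions(mask)
centroids = [(sum(p[0] for p in reg) / len(reg), sum(p[1] for p in reg) / len(reg))
             for reg in regions]
```

The median filter removes the isolated speckle but also rounds the corners of the small target, which illustrates the usual trade-off when choosing the window size. The region centroids are the kind of measurement that step 3) would pass to the Kalman filter.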

6 Conclusion
