Northeast China is a major commodity grain base, and protecting its cultivated land is crucial for national food security. Recognizing erosion gullies is an important means of monitoring soil erosion, and remote sensing is widely used for this purpose. However, traditional remote sensing methods rely mostly on manual interpretation, so their degree of automation and their efficiency are relatively low. To improve the accuracy and efficiency of erosion gully recognition, this study extracts multi-level features with machine learning and deep learning methods and identifies erosion gullies from these features. First, we cut the remote sensing images into fixed-size tiles and labeled them manually to create a training dataset with two categories, farmland and erosion gully. Second, we extracted spectral and textural features as low-level features, encoded SIFT features through ScSPM as middle-level features, and extracted high-level features with a CNN. Third, a linear SVM and a softmax classifier were applied to classify the images based on the multi-level features and identify those containing erosion gullies. Together, these steps form a complete workflow for feature extraction and erosion gully recognition that can support the protection of arable land in the black soil area of Northeast China. All three levels of features proved capable of identifying erosion gully images. In the test phase, recognition based on low-level features had the lowest accuracy (91.1%), whereas recognition based on middle-level features had the highest (98.5%). However, both kinds of features must be designed manually, which limits the degree of automation. By contrast, the CNN extracts high-level features automatically through "end-to-end" learning, which greatly improves the automation of erosion gully recognition. Its recognition accuracy was 95.5%, which meets the expectation of this study. Accuracy was slightly lower in the validation phase than in the test phase because images in practical applications typically contain irrelevant objects, which limits the achievable accuracy. Nevertheless, the proposed method can generally identify the erosion gullies in the images and is reasonably practical. Low-level features are simple and fast to compute, but they describe erosion gullies relatively poorly, which results in low recognition accuracy. By contrast, the methods based on middle- and high-level features identify nearly all the erosion gullies in the images, although they are time-consuming in the early training phase. The method based on high-level features, in particular, recognizes erosion gullies automatically. This study shows that deep learning has great potential in remote sensing image applications. If the sample size is further increased and the network structure expanded, the recognition accuracy for erosion gullies can be improved further.
Key words
erosion gullies, training samples, low-level features, middle-level features, high-level features, convolutional neural network
1 Introduction
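Cultivated land is the foundation of human survival and development, and its protection has long been at the core of land resource management in China (Li and Lin, 2001). The black soil region of Northeast China is the country's largest commodity grain production area, and its grain production capacity and agricultural sustainability bear on the national food security strategy (Cheng and Zhang, 2005). After long-term reclamation, however, the soil of sloping farmland in the black soil region has been eroded, causing severe soil and water loss. Erosion gullies in particular have developed at a pace rarely seen elsewhere in the country. According to their stage of development, erosion gullies can be divided into rills (细沟), incised gullies (切沟), ravines (冲沟), and dells (坳沟) (Fan et al., 2007). Rills are generally about 0.5 m wide and 0.1 to 0.4 m deep, and can be several meters long. Incised gullies have distinct edges, can reach 1 to 2 m in both width and depth, and begin to form a scarp at the gully mouth. Ravines have a distinct scarp at the gully mouth; collapses and landslides occur frequently on their side slopes, so the channel keeps widening, and they can be several meters to tens of meters deep and hundreds of meters long. Dells are shallow and wide, with the bottom filled by large amounts of debris. Recognizing erosion gullies is a key problem in soil erosion monitoring, is of great significance for protecting sloping farmland and controlling soil and water loss, and has drawn wide attention.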
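Remote sensing technology, which emerged in the 1960s, is widely used in erosion gully recognition because of its wide coverage, multi-temporal observation, and low cost (Yan et al., 2005, 2007). However, traditional methods rely mostly on manual interpretation, so their degree of automation and their efficiency are low (Mcinnes et al., 2011). Many methods have been proposed to improve the efficiency and accuracy of erosion gully recognition. Pixel-based classification is common (Metternicht and Zinck, 1998), but a single pixel loses much information, which severely limits recognition and classification (Blaschke and Strobl, 2001). By comparison, object-based classification uses the spectral, geometric, and textural information of an object and can greatly improve classification accuracy (Shruthi et al., 2011). As the resolution of remote sensing imagery keeps increasing, however, a single image contains the semantic information of multiple scenes. Scene-based classification can better reveal the spatial and structural characteristics of such images and interprets them better (Chen et al., 2011).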
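Classifying and recognizing different scenes requires feature selection. Image features can be divided into low-level, middle-level, and high-level features (Xia et al., 2016). Low-level features are the most basic descriptors of an image; they are simple to extract and have low complexity. Middle-level features are usually obtained by encoding low-level features; whereas low-level features describe local characteristics, middle-level features describe the global characteristics of an image better. High-level features are abstracted step by step, from shallow to deep, by deep learning that imitates the hierarchical way in which biological nervous systems process information. Compared with low- and middle-level features they are more abstract: the deeper the layer, the higher the degree of abstraction and the better the feature reflects the essence of the data. Low-level features include spectral, textural, and structural features and SIFT (Scale Invariant Feature Transform) features (Lowe, 2004). Combining several low-level features usually improves classification results: Luo et al. (2013) combined six low-level features into a multi-feature descriptor for the classification and retrieval of multi-resolution remote sensing images, and showed that combined features describe remote sensing images better. Among the encoding methods for middle-level features, the best known is the bag-of-words model (Yang and Newsam, 2010), but it ignores the local spatial distribution of the image. The SPM (Spatial Pyramid Matching) method (Lazebnik et al., 2006) makes up for this shortcoming, and the ScSPM method (Yu, 2013) replaces the k-means algorithm in SPM with sparse coding, further improving recognition accuracy. Among the deep learning methods for extracting high-level features, the most representative is the convolutional neural network (CNN). Typical examples include AlexNet (Krizhevsky et al., 2012), VGGNet (Simonyan and Zisserman, 2014), and GoogLeNet (Szegedy et al., 2015), all of which have achieved excellent results in image recognition.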
2 Methods and principles
2.1 Sample selection
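In image recognition, the quality of the result depends on the quality and quantity of the samples. For the specific target of erosion gullies in sloping farmland of the black soil region of Northeast China, the training samples in this study were identified and selected manually. Erosion gullies include rills (细沟), incised gullies (切沟), ravines (冲沟), and dells (坳沟). Rills are only about 0.5 m wide and are hard to distinguish even in high-resolution remote sensing images, whereas dells are too large in extent to be shown in full in small image tiles. Constrained by the image resolution and the size of the image samples, this study therefore focuses on ravines and incised gullies.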
2.2 Multi-level feature extraction
2.2.1 Low-level feature extraction
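Spectral and textural features were chosen as the low-level features. To highlight the advantages of low-level features, namely low computational cost and ease of extraction, two basic statistics were chosen as spectral features: the band mean (μ) and the band standard deviation (σ). Textural features were obtained from the gray-level co-occurrence matrix of the image (Haralick et al., 1973), from which contrast (CT), correlation (CR), energy (E), and homogeneity (H) were selected. The low-level feature vector F is defined as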
$F = \left( \mu, \sigma, CT, CR, E, H \right)$ (1)
$\mu = \frac{1}{N} \sum\limits_{i,j}^{N} p(i,j)$ (2)
$\sigma = \sqrt{\frac{1}{N - 1} \sum\limits_{i,j}^{N} \left| p(i,j) - \mu \right|^2}$ (3)
$CT = \sum\limits_{i,j} \left| i - j \right|^2 p(i,j)$ (4)
$CR = \frac{\sum\limits_{i,j} \left( i - \mu_i \right)\left( j - \mu_j \right) p(i,j)}{\sigma_i \sigma_j}$ (5)
$E = \sum\limits_{i,j} \left\{ p(i,j) \right\}^2$ (6)
$H = \sum\limits_{i,j} \frac{1}{1 + \left| i - j \right|} p(i,j)$ (7)
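where N is the number of pixels and p(i, j) is the normalized value at (i, j). In Eq. (5), μ_i, μ_j and σ_i, σ_j are the means and standard deviations of the row and column marginal distributions of the co-occurrence matrix, following the usual definition of the correlation feature.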
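Equations (2) to (7) can be illustrated for a single band with the short Python sketch below. It is not the authors' implementation. The number of gray levels, the horizontal one-pixel offset, and the symmetric co-occurrence matrix are assumptions made for illustration, since the text does not state them, and the function names `glcm` and `low_level_features` are hypothetical. The sketch reads p(i, j) as the pixel value in Eqs. (2) and (3) and as the normalized co-occurrence entry in Eqs. (4) to (7).

```python
import math


def glcm(img, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix.

    Assumptions (not given in the text): offset (dx, dy) = (1, 0) and
    symmetric counting of each pixel pair.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = img[r][c], img[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def low_level_features(img, levels):
    """Return (mu, sigma, CT, CR, E, H) for one band, following Eqs. (1)-(7)."""
    pixels = [v for row in img for v in row]
    n = len(pixels)
    mu = sum(pixels) / n                                            # Eq. (2)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in pixels) / (n - 1))  # Eq. (3)

    p = glcm(img, levels)
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    mu_i = sum(i * p[i][j] for i, j in cells)
    mu_j = sum(j * p[i][j] for i, j in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p[i][j] for i, j in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p[i][j] for i, j in cells))

    ct = sum((i - j) ** 2 * p[i][j] for i, j in cells)              # Eq. (4)
    # Eq. (5); undefined for a constant image, where sd_i * sd_j == 0.
    cr = sum((i - mu_i) * (j - mu_j) * p[i][j] for i, j in cells) / (sd_i * sd_j)
    e = sum(p[i][j] ** 2 for i, j in cells)                         # Eq. (6)
    h = sum(p[i][j] / (1 + abs(i - j)) for i, j in cells)           # Eq. (7)
    return mu, sigma, ct, cr, e, h


# Toy 4x4 band with 4 gray levels (illustrative data, not from the paper).
band = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3]]
print(low_level_features(band, levels=4))
```

For an RGB image, the same function would be applied to each band separately and the three 6-element results concatenated.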
2.2.2 Middle-level feature extraction
2.2.3 High-level feature extraction
2.3 Classification and recognition
Recognition accuracy is expressed as the overall accuracy (OA), that is, the proportion of correctly classified samples among all samples:
$OA = \sum\limits_{i = 1}^{m} \frac{x_i}{M}$ (8)
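A minimal Python sketch of Eq. (8) follows. The symbols are not defined in the surrounding text, so the sketch assumes the usual reading: m is the number of classes, x_i is the number of correctly classified samples of class i, and M is the total number of samples. The function name `overall_accuracy` and the toy labels are illustrative only.

```python
def overall_accuracy(y_true, y_pred):
    """Overall accuracy: sum over classes of correct counts x_i, divided by M."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    classes = sorted(set(y_true))
    # x_i: samples of class i that were predicted as class i
    correct_per_class = [
        sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        for c in classes
    ]
    return sum(correct_per_class) / len(y_true)


# Toy labels: 0 = farmland, 1 = erosion gully (illustrative data only).
truth = [0, 0, 0, 1, 1, 1, 1, 1]
pred = [0, 0, 1, 1, 1, 1, 0, 1]
print(overall_accuracy(truth, pred))  # 6 of 8 samples correct -> 0.75
```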
3 Experimental results and analysis
3.1 Experimental data and sample selection
3.1.1 Experimental data
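Morphological analysis shows that erosion gullies in the black soil region of Northeast China are generally narrow. In the early stage of development in particular, they are less than 1 m wide and hard to distinguish in low-resolution imagery, so sample selection requires imagery with high spatial resolution. In addition, farmland changes with the seasons, so its features vary over time: farmland covered by crops, for example, differs considerably from bare farmland after harvest. Multi-temporal imagery should therefore be selected to avoid homogeneous samples.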
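Google Earth imagery consists of RGB color images and is widely used in remote sensing because it is easy to obtain and has high spatial resolution. Taking these considerations together, multi-temporal Google Earth imagery of the black soil region of Northeast China with a spatial resolution of 0.4 m was selected as the experimental data.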
3.2 Feature extraction and network design
3.2.1 Low-level features
$F_{\rm{rgb}} = \left( \mu_{\rm{rgb}}, \sigma_{\rm{rgb}}, CT_{\rm{rgb}}, CR_{\rm{rgb}}, E_{\rm{rgb}}, H_{\rm{rgb}} \right)$
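Table 1 Low-level features

Feature type | Name | Symbols | Count |
Spectral features | Mean | μr, μg, μb | 3 |
Spectral features | Standard deviation | σr, σg, σb | 3 |
Textural features | Contrast | CTr, CTg, CTb | 3 |
Textural features | Correlation | CRr, CRg, CRb | 3 |
Textural features | Energy | Er, Eg, Eb | 3 |
Textural features | Homogeneity | Hr, Hg, Hb | 3 |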
Table 2 Statistics of low-level features

Statistic | μr | μg | μb | σr | σg | σb |
Mean 1 | 78.19 | 94.76 | 99.32 | 13.07 | 12.65 | 9.89 |
Mean 2 | 75.08 | 91.58 | 95.36 | 18.04 | 16.99 | 14.48 |
Difference | 3.11 | 3.18 | 3.96 | -4.97 | -4.36 | -4.59 |
Standard deviation 1 | 33.56 | 24.25 | 18.98 | 6.73 | 5.62 | 4.28 |
Standard deviation 2 | 16.89 | 12.26 | 11.21 | 7.47 | 7.05 | 6.55 |
Difference | 16.67 | 11.99 | 7.77 | -0.74 | -1.43 | -2.27 |
Statistic | CTr | CTg | CTb | CRr | CRg | CRb |
Mean 1 | 0.12 | 0.09 | 0.12 | 0.72 | 0.71 | 0.63 |
Mean 2 | 0.14 | 0.13 | 0.14 | 0.82 | 0.80 | 0.75 |
Difference | -0.02 | -0.04 | -0.02 | -0.10 | -0.09 | -0.12 |
Standard deviation 1 | 0.05 | 0.05 | 0.06 | 0.13 | 0.15 | 0.12 |
Standard deviation 2 | 0.05 | 0.06 | 0.07 | 0.06 | 0.07 | 0.07 |
Difference | 0 | -0.01 | -0.01 | 0.07 | 0.08 | 0.05 |
Statistic | Er | Eg | Eb | Hr | Hg | Hb |
Mean 1 | 0.49 | 0.59 | 0.55 | 0.94 | 0.95 | 0.94 |
Mean 2 | 0.39 | 0.47 | 0.48 | 0.93 | 0.94 | 0.93 |
Difference | 0.10 | 0.12 | 0.07 | 0.01 | 0.01 | 0.01 |
Standard deviation 1 | 0.19 | 0.20 | 0.21 | 0.02 | 0.02 | 0.03 |
Standard deviation 2 | 0.14 | 0.17 | 0.20 | 0.02 | 0.03 | 0.03 |
Difference | 0.05 | 0.03 | 0.01 | 0 | -0.01 | 0 |
3.2.2 Middle-level features
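Middle-level features were extracted with a program written in Matlab. SIFT features were extracted from the images first, which took 10 min. The SIFT features were then sparse coded, a lengthy step that took about 1953 min. After max pooling, each image yielded a 21504-dimensional vector, which is the middle-level feature.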
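The pooling step can be sketched in Python as below. The dimension 21504 equals 21 × 1024, which is consistent with a three-level spatial pyramid (1 + 4 + 16 = 21 cells) and a 1024-atom dictionary, but the text states neither value, so both are assumptions here. The function name `pyramid_max_pool`, the pyramid levels, and the toy codes are hypothetical. Pooling the absolute values of the sparse codes follows the usual ScSPM formulation and may differ from the Matlab program used in the study.

```python
def pyramid_max_pool(points, codes, width, height, levels=(1, 2, 4)):
    """Max-pool sparse codes over a spatial pyramid and concatenate the cells.

    points: (x, y) location of each local descriptor in the image
    codes:  sparse code of each descriptor (all of the same length)
    levels: grid sizes per pyramid level; (1, 2, 4) is an assumption
    """
    dim = len(codes[0])
    feature = []
    for g in levels:
        cells = [[0.0] * dim for _ in range(g * g)]
        for (x, y), code in zip(points, codes):
            # Grid cell containing the descriptor, clamped to the last cell.
            cx = min(int(x * g / width), g - 1)
            cy = min(int(y * g / height), g - 1)
            cell = cells[cy * g + cx]
            for k, v in enumerate(code):
                cell[k] = max(cell[k], abs(v))
        for cell in cells:
            feature.extend(cell)
    return feature


# Toy example: two descriptors with 2-dimensional codes in a 256 x 256 image.
toy = pyramid_max_pool([(10, 10), (200, 200)],
                       [[0.5, 0.0], [-0.2, 0.9]],
                       width=256, height=256)
print(len(toy))  # 21 cells x 2 code dimensions = 42
```

With 1024-dimensional codes, the same function returns a vector of length 21 × 1024 = 21504.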
3.2.3 High-level features
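A 5-layer convolutional neural network was designed to extract and store the high-level features. The input image size is 256×256×3. The first layer, C1, consists of 32 convolution kernels of size 11×11×3, is activated by the ReLU function, and is followed by max pooling. The second layer, C2, consists of 96 kernels of size 5×5×32, is activated by ReLU, and is followed by max pooling. The third layer, C3, consists of 128 kernels of size 3×3×96, is activated by ReLU, and is followed by max pooling. The fourth layer, F1, is a fully connected layer with 2048 units. The final output layer is a Softmax layer whose output is [1, 0] or [0, 1], representing the farmland and erosion gully categories, respectively. The network structure is shown in Figure 5. The network was built in the Tensorflow framework and trained on one GPU for 50000 iterations. It had converged by the end of training, and the process took about 450 min. Figure 6 shows some of the feature maps output by the convolutional layers for the two classes of samples. The CNN extracts salient features of erosion gullies that differ clearly from those of farmland.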
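As a rough check on the size of this architecture, the Python sketch below counts the convolutional parameters from the kernel shapes given above. One bias per kernel is an assumption. The strides, padding, and pooling window are not stated in the text, so `spatial_size_after` uses purely hypothetical settings ('same' padding, stride-1 convolution, non-overlapping 2×2 pooling) to show how the feature map size entering F1 would be obtained. All names in the block are illustrative.

```python
# (name, number of kernels, kernel size, input channels), from the text.
CONV_LAYERS = [("C1", 32, 11, 3), ("C2", 96, 5, 32), ("C3", 128, 3, 96)]


def conv_params(n_kernels, size, in_channels):
    """Weights plus one bias per kernel (the bias term is an assumption)."""
    return n_kernels * (size * size * in_channels + 1)


def spatial_size_after(input_size, n_blocks, pool=2):
    """Feature map side length after n_blocks of convolution + max pooling.

    Hypothetical settings: 'same' padding and stride-1 convolution, so only
    the non-overlapping pool x pool max pooling shrinks the map.
    """
    size = input_size
    for _ in range(n_blocks):
        size //= pool
    return size


for name, n, k, c in CONV_LAYERS:
    print(name, conv_params(n, k, c))

side = spatial_size_after(256, n_blocks=3)
print(side, side * side * 128)  # map side and flattened length fed to F1
```

Under these hypothetical settings the flattened input to F1 would have 32 × 32 × 128 = 131072 values. The actual value depends on the stride and pooling choices shown in Figure 5.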
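3.3 Analysis of classification and recognition results
3.3.1 Analysis of recognition results based on low-level features
3.3.2 Analysis of recognition results based on middle-level features
3.3.3 Analysis of recognition results based on high-level features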
Table 3 The recognition accuracy based on multi-level features

Feature | Test samples (10%) | Test samples (20%) | Test samples (30%) |
Low-level features: spectral | 0.887(±0.04) | 0.877(±0.03) | 0.883(±0.04) |
Low-level features: textural | 0.864(±0.05) | 0.859(±0.04) | 0.862(±0.03) |
Low-level features: spectral + textural | 0.909(±0.04) | 0.911(±0.03) | 0.902(±0.02) |
Middle-level features | 0.968(±0.01) | 0.973(±0.01) | 0.985(±0.02) |
High-level features | 0.910(±0.01) | 0.955(±0.05) | 0.902(±0.04) |
3.3.4 Case analysis
4 Conclusions
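(1) Compared with the test phase, the recognition accuracy of all three methods decreased. The reason is that the selected training samples were very pure, whereas practical applications involve interference from other ground objects. Farmland containing field paths, for example, is easily misclassified as erosion gully, which limits the recognition accuracy to some extent. This also shows that improving recognition accuracy depends on large and diverse training sets.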
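(2) Using deep learning to recognize erosion gully images showed that the method has certain advantages but also faces some problems. In deep learning, the more layers a network has, the more abstract its features are and the better they describe an image, but a multi-layer network needs a large number of samples for training. Because this study had only a small sample set, a CNN with few layers was chosen. The results nonetheless indicate that deep learning has great potential in remote sensing image recognition: with a larger sample set and an expanded network structure, deeper features could be obtained and the recognition accuracy for erosion gully images could be improved further.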
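(3) The Google Earth imagery used in this study contains three bands (R, G, and B), and fairly satisfactory results were obtained on this basis. The method can therefore be extended to other remote sensing imagery, where more bands and a larger feature dimension are expected to yield higher recognition accuracy.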
