A Fast Dense Matching Method Based on Guided Filter

Cite this article

XU Shu, TAN Aihong, QIN Zijie. A Fast Dense Matching Method Based on Guided Filter[J]. Journal of Geomatics, 2022, 47(3): 123-127. DOI:10.14188/j.2095-6045.2019163


XU Shu¹,², TAN Aihong³, QIN Zijie¹

1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China;
2. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China;
3. School of Mechanical Engineering, Yangzhou Polytechnic College, Yangzhou 225009, China

Received: 2020-04-30

Funding: Strategic Priority Research Program of the Chinese Academy of Sciences (Category A) (XDA19000000)

First author: XU Shu, master's student, specializes in dense matching of aerial images. E-mail: XuShu@whu.edu.cn

Abstract: We present a dense matching algorithm that is fast and performs well in depth-discontinuity areas. First, sparse matching results and reliably matched grid points are used to infer the disparity distribution of the image and obtain a candidate disparity sequence. Second, the Census cost is computed pixel by pixel for each candidate disparity, and the guided filter is used to aggregate the costs. Finally, for each pixel, the disparity with the lowest matching cost among all candidates is selected as the final disparity. The results show that the algorithm performs well among local matching methods. Its matching quality is roughly comparable to that of semi-global matching methods and of global matching methods optimized by graph cut, while its processing speed is markedly faster.

Key words: dense matching; fast guided filter; Census cost; aerial image; depth discontinuity

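Dense matching of aerial images aims to determine the corresponding point for every pixel in the overlapping area of an epipolar image pair, which is generated from the original images and the interior and exterior orientation elements obtained through calibration and aerial triangulation. Its results can be used to produce traditional surveying and mapping products such as the digital surface model (DSM) and the digital orthophoto map (DOM). In addition, compared with light detection and ranging (LiDAR), point clouds generated by dense matching carry color and texture and have high point density, making them an important data source for three-dimensional city reconstruction^[1].

Dense matching usually consists of four steps: cost computation, cost aggregation, disparity selection or cost-function optimization, and disparity refinement^[2]. According to how the smoothness assumption is used, dense matching methods are divided into local and global methods. Global methods are generally slower but yield better results than local methods^[3].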

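Local stereo matching methods generally use the smoothness assumption implicitly in the cost aggregation stage. Because the smoothness constraint has already been considered implicitly, their optimization degenerates into winner take all (WTA)^[4]. Starting from the raw matching cost, local methods assume that the disparities within some window around the pixel to be matched satisfy certain conditions, and select a corresponding filter to suppress possible noise. The methods fall into two main categories^[5]: those that select the aggregation window and those that determine the weights within the aggregation window.

For rectangular aggregation windows, window selection can be achieved by varying the window size or offset^{[6, 7]}, by using multiple windows^[8], or by freely changing the window shape^[9-11]. The PatchMatch algorithm^[12] deserves particular mention. It first generates a random initial normal vector for each pixel, then optimizes each pixel's normal vector through several propagation processes such as view, spatial, and temporal propagation, and finally refines the disparity and the normal vector. It can be viewed as an application of least squares matching^[13] to dense matching that solves the initial-value problem of least squares and uses a slightly different method in the subsequent optimization. Many improved methods have been derived from PatchMatch^[14-17].

The other category improves aggregation by changing the weight of each pixel in the aggregation window; in essence, pixels that do not satisfy the assumption underlying the window are given lower weights^[18]. This idea first appeared in Ref. [19], which determines the weights by bilateral filtering, i.e., according to the spatial distance and the color distance to the center pixel. Other approaches include geodesic-distance weighting^[20], guided-filter weighting^[21], and weighting by spatial and channel reliability^[22]. Using multi-scale information^[23-25] or segmentation information^[26] to assist matching has also achieved quite good results.

The ground structures captured in aerial images are relatively complex, and the image format is generally large. To better handle large-format aerial images and obtain good results at depth discontinuities, this paper proposes an improved fast dense matching method based on the guided filter, namely fast matching with Census cost and guided filter (FCG), which aims to improve matching efficiency and matching correctness.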

1 Fast Dense Matching Method Based on Guided Filter

1.1 Census Cost

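This paper computes the matching cost with the Census measure, which is reasonably robust. The procedure is as follows: ① for the pixels in the target window and the search window, centered on the target point and the candidate matching point respectively, a binary code is generated from the gray-level relation between each pixel and the center pixel of its window; ② the Hamming distance between the two code strings is computed by Eq. (1) and used as the cost: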

$ C(i, l)=h\left(C_{r}(I(i)), C_{r}\left(I^{\prime}\left(i_{l}\right)\right)\right) $

(1)

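where C(i, l) is the raw cost of pixel i at disparity l; C_r(·) is the Census feature of an r × r window; I and I′ are the left and right gray-level images; i and i_l are pixel i in the left image and its corresponding pixel in the right image at disparity l; and h(·, ·) denotes the Hamming distance.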
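The Census cost of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `census_transform` and `census_cost`, the edge-replicated border handling, the "neighbour darker than center" bit convention, and the assumption that pixel (x, y) in the left image corresponds to (x − l, y) in the right image are all choices made here for the example.

```python
import numpy as np

def census_transform(img, r=5):
    """Return an (h, w, r*r - 1) boolean array of Census bits for an r x r window."""
    h, w = img.shape
    half = r // 2
    padded = np.pad(img, half, mode="edge")  # border handling is an assumption
    bits = []
    for dy in range(r):
        for dx in range(r):
            if dy == half and dx == half:
                continue  # the centre pixel is not compared with itself
            neighbour = padded[dy:dy + h, dx:dx + w]
            bits.append(neighbour < img)  # True where the neighbour is darker than the centre
    return np.stack(bits, axis=-1)

def census_cost(left, right, disparity, r=5):
    """Hamming distance C(i, l) of Eq. (1) for one non-negative integer disparity l."""
    code_left = census_transform(left, r)
    code_right = census_transform(right, r)
    h, w = left.shape
    # Pixels with no counterpart in the right image receive the maximum cost.
    cost = np.full((h, w), r * r - 1, dtype=np.int32)
    d = int(disparity)
    if d < w:
        # Assumed convention: left column x matches right column x - d.
        differing = code_left[:, d:] != code_right[:, :w - d]
        cost[:, d:] = np.count_nonzero(differing, axis=-1)
    return cost
```

Calling `census_cost(left, right, l)` once per candidate disparity yields one cost image per candidate. Because the code depends only on gray-level order, not on absolute values, it is unaffected by monotonic brightness changes between the two images.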

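1.2 Determination of the Candidate Disparity Sequence

The disparity distribution over an image is not simply uniform^[27]. Evaluating every possible disparity is inefficient, and an overly large solution space causes matching errors. Therefore, the scale-invariant feature transform (SIFT) point pairs^[28] obtained by sparse matching are first transformed into the epipolar image coordinate system and checked with the epipolar constraint, after which their disparity distribution is counted. In addition, the original image is divided into blocks on a regular grid, the grid points are matched with the correlation coefficient, and the disparities of the grid points that pass the left-right consistency check are recorded. The occurrence counts of the disparities from these two sources are summed, the disparities are sorted in descending order of frequency, and the top 30% are selected as the candidate disparity sequence.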
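The frequency-based selection described above can be illustrated as follows. This is a sketch under stated assumptions: disparities are taken to be already rounded to integers, "top 30%" is read as 30% of the distinct disparity values (rounded up), and ties are broken by the smaller disparity. The function name `candidate_disparities` and the `keep_percent` parameter are hypothetical.

```python
from collections import Counter

def candidate_disparities(sift_disparities, grid_disparities, keep_percent=30):
    """Rank disparities by combined frequency and keep the most frequent ones."""
    counts = Counter(sift_disparities)   # votes from checked SIFT point pairs
    counts.update(grid_disparities)      # plus votes from consistent grid points
    # Descending frequency; ties broken by the smaller disparity (an assumption).
    ranked = [d for d, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
    # Integer ceiling of keep_percent % of the distinct values, at least one.
    n_keep = max(1, (len(ranked) * keep_percent + 99) // 100)
    return ranked[:n_keep]

sift = [5, 5, 5, 6, 6, 7]
grid = [5, 6, 8, 9, 10, 11, 12, 13, 14]
print(candidate_disparities(sift, grid))  # [5, 6, 7]
```

Only the returned disparities are evaluated in the cost computation and aggregation steps, which shrinks both the running time and the solution space.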

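1.3 Cost Aggregation Based on the Fast Guided Filter

The cost image for a given disparity is the image obtained by recording, at each pixel position, the cost computed for that pixel at the given disparity as its gray value. Applying the guided filter to cost aggregation means taking the left epipolar image as the guidance image and the cost image as the reference image (the filter input), computing the additive and multiplicative coefficients of the filter from the means and variances within a window, and thereby obtaining the filtered cost. The cost aggregation is computed as: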

$ C_{i l}^{\prime}=\sum\limits_{j} W_{i j}(G) C_{j l} $

(2)

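where C_jl is the raw cost of pixel j at disparity l; C′_il is the aggregated cost of pixel i at disparity l; G is the guidance image; and W_ij is the weight of pixel j with respect to the center pixel i, which can be written as: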

$ W_{i j}=\frac{1}{|\omega|^{2}} \sum\limits_{k:(i, j) \in \omega_{k}}\left(1+\frac{\left(I_{i}-\mu_{k}\right)\left(I_{j}-\mu_{k}\right)}{\sigma_{k}^{2}+\varepsilon}\right) $

(3)

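where |ω| is the normalization coefficient, i.e., the number of pixels in window ω_k; I_i and I_j are the gray values of pixels i and j in the guidance image (G in Eq. (2)); μ_k and σ_k² are the mean and variance of the guidance image I in window ω_k; and ε is the smoothing coefficient.

To speed up computation, the filter is actually computed as: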

$ q_{i}=a_{k} I_{i}+b_{k}, \forall i \in \omega_{k} $

(4)

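where q_i is the value of pixel i in the output image; a_k and b_k are the linear transformation parameters computed from the reference image, which can be expressed as: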

$ \left\{\begin{array}{l} a_{k}=\frac{\frac{1}{|\omega|} \sum\limits_{i \in \omega_{k}} I_{i} p_{i}-\mu_{k} \bar{p}_{k}}{\sigma_{k}^{2}+\varepsilon} \\ b_{k}=\bar{p}_{k}-a_{k} \mu_{k} \end{array}\right. $

(5)

where p_i is the gray value of pixel i in the reference image, and p̄_k is the mean gray value of the reference image p in window ω_k.
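Eqs. (4) and (5) map directly onto box-filter operations, which is what makes the guided filter independent of the window size. The sketch below is an illustration, not the authors' code: `box_mean`, `guided_filter`, and `aggregate_and_select` are hypothetical names, the window is a square of half-width `radius`, borders are edge-replicated, and the averaging of a_k and b_k over all windows covering a pixel follows the standard guided-filter formulation. The last function adds the winner-take-all selection over the candidate disparities.

```python
import numpy as np

def box_mean(x, radius):
    """Mean over a (2*radius+1)^2 window using an integral image (edge-replicated)."""
    k = 2 * radius + 1
    padded = np.pad(np.asarray(x, dtype=np.float64), radius, mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    return total / (k * k)

def guided_filter(guide, src, radius, eps):
    """Filter src (a cost image p) with guide (the left image I), Eqs. (4)-(5)."""
    guide = np.asarray(guide, dtype=np.float64)
    src = np.asarray(src, dtype=np.float64)
    mean_i = box_mean(guide, radius)                       # mu_k
    mean_p = box_mean(src, radius)                         # p-bar_k
    var_i = box_mean(guide * guide, radius) - mean_i ** 2  # sigma_k^2
    cov_ip = box_mean(guide * src, radius) - mean_i * mean_p
    a = cov_ip / (var_i + eps)   # Eq. (5), multiplicative coefficient a_k
    b = mean_p - a * mean_i      # Eq. (5), additive coefficient b_k
    # Each pixel lies in many windows; average their coefficients, then apply Eq. (4).
    return box_mean(a, radius) * guide + box_mean(b, radius)

def aggregate_and_select(guide, cost_volume, candidates, radius=4, eps=1e-2):
    """Aggregate one cost image per candidate disparity, then pick the minimum (WTA)."""
    filtered = np.stack([guided_filter(guide, c, radius, eps) for c in cost_volume])
    return np.asarray(candidates)[filtered.argmin(axis=0)]
```

Here `cost_volume[n]` is assumed to be the cost image of `candidates[n]`. The default `radius` and `eps` are placeholders, not values taken from the paper.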

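Because the parameters vary slowly, they are first computed on a downsampled version of the original image and then upsampled to the original resolution to speed up computation. Parameters that do not change are computed once and stored for reuse.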
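The two speed-ups can be illustrated with a self-contained sketch. Several details are assumptions made for the example, since the text does not specify them: subsampling takes every `s`-th pixel, upsampling is separable linear interpolation, and the "unchanging parameters" are taken to be the mean and variance of the guidance image, which do not depend on the disparity. All function names are hypothetical.

```python
import numpy as np

def box_mean(x, radius):
    """Mean over a (2*radius+1)^2 window using an integral image (edge-replicated)."""
    k = 2 * radius + 1
    padded = np.pad(np.asarray(x, dtype=np.float64), radius, mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    return (ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]) / (k * k)

def upsample(x, shape):
    """Separable linear interpolation of a low-resolution map to the given shape."""
    h, w = shape
    ys = np.linspace(0, x.shape[0] - 1, h)
    xs = np.linspace(0, x.shape[1] - 1, w)
    rows = np.stack([np.interp(xs, np.arange(x.shape[1]), row) for row in x])
    return np.stack([np.interp(ys, np.arange(x.shape[0]), col) for col in rows.T], axis=1)

def make_fast_filter(guide, radius, eps, s):
    """Return a function that aggregates one cost image with a subsampled guided filter."""
    guide = np.asarray(guide, dtype=np.float64)
    low_i = guide[::s, ::s]
    r = max(1, radius // s)
    # Cached once: these depend only on the guidance image, not on the disparity.
    mean_i = box_mean(low_i, r)
    var_i = box_mean(low_i * low_i, r) - mean_i ** 2

    def apply(cost):
        low_p = np.asarray(cost, dtype=np.float64)[::s, ::s]
        mean_p = box_mean(low_p, r)
        a = (box_mean(low_i * low_p, r) - mean_i * mean_p) / (var_i + eps)
        b = mean_p - a * mean_i
        # Coefficients are smooth, so they are upsampled before applying Eq. (4).
        mean_a = upsample(box_mean(a, r), guide.shape)
        mean_b = upsample(box_mean(b, r), guide.shape)
        return mean_a * guide + mean_b

    return apply
```

A filter built once with `make_fast_filter(left_image, radius, eps, s)` can then be applied to the cost image of every candidate disparity, so the guidance statistics are not recomputed per disparity.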

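2 Experiments and Results

The experiments were run on Windows 10 with 8 GB of memory and an i7 central processing unit (CPU). The experimental data are a pair of 2 200 × 1 400 pixel epipolar images generated from aerial images of the Vaihingen area, Germany, and the Cones and Teddy image pairs of the Middlebury^{[2, 29]} dataset.

Tab. 1 shows the processing time of the FCG method under different parameter combinations. Fig. 1 shows the original image of a sample area and the results obtained by the FCG method under different experimental schemes. A larger cost-computation or aggregation window is more robust to noise and gives smoother results but takes more time. Because FCG optimizes cost aggregation, the aggregation window affects the processing time less than the cost-computation window does.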

Tab.1 Processing Time of FCG Method Under Different Parameter Configurations

Fig.1 Disparity Images Under Different Parameters

FCG is compared with the following methods: matching with gradient and color difference cost and box filter (GB), matching with gradient and color difference cost and guided filter (GG), matching with Census cost and box filter (CB), matching with Census cost and guided filter (CG), Local Expansion (LE)^[30], and slanted plane smoothing stereo (SPSS)^[31]. The parameter settings are listed in Tab. 2.

Tab.2 Parameter Settings of the Comparison Experiment/pixel

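The original image and the results obtained by the different methods are shown in Fig. 2. Enlarged views of the blue boxes, which cover depth discontinuities and occlusions at building edges, are shown in Fig. 3; enlarged views of the red boxes, which cover slanted disparity variation and repetitive texture, are shown in Fig. 4.

Fig.2 Original Image and Results Obtained by Different Methods

Fig.3 Enlarged Views of the Blue Boxes in Fig. 2

Fig.4 Enlarged Views of the Red Boxes in Fig. 2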

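Fig. 2 shows that the disparity map obtained by FCG is smooth and consistent, without obvious noise or large areas of mismatches, and that it preserves depth-discontinuity structures well. In the red-box area, FCG performs well in regions of repetitive texture and slanted disparity variation. In the area indicated by the light green arrow, GB and GG show disordered disparity variation and CB shows obvious noise points, whereas FCG, CG, SPSS, and LE all perform well. In the blue-box area, GB and CB perform poorly at depth discontinuities: their disparity changes do not follow the edges. In the area indicated by the yellow arrow, their results are somewhat distorted, which would deform the reconstructed building. GB and GG also show obvious bright spots in this area, which indicate large outliers and would appear as flying points in the subsequently generated 3D point cloud. GB and CB additionally produce many mismatches over the whole image, whereas FCG, GG, and CG perform well. FCG, CB, and CG outperform GB and GG in the repetitive-texture area in the upper left corner. SPSS and LE perform well at the positions above, but LE produces an abnormal disparity line in the area indicated by the pink arrow, and SPSS assigns the building edge the same disparity as the ground shadow in the area indicated by the black arrow. FCG gives better results than the other local algorithms, and its quality is not much degraded compared with LE and SPSS.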

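The processing time of each method is listed in Tab. 3. FCG is faster than GB and CB and about 97% faster than CG. It is also markedly faster than LE and SPSS, taking about 0.4% of the time of LE and 1% of that of SPSS.

Tab.3 Processing Time of Different Methods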

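Because aerial images lack ground truth for evaluating the results, the Teddy and Cones image pairs of the Middlebury test dataset are used to evaluate the correctness of the algorithm. Although these scenes differ from aerial imagery, the results still partly reflect matching quality. A match is counted as correct when the difference between the computed disparity and the ground-truth disparity is less than one pixel. The results are listed in Tab. 4. FCG performs close to the original method and is roughly on par with the semi-global and global methods.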
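The correctness criterion above amounts to a thresholded comparison against the ground truth. The sketch below illustrates it; the function name `correct_rate` and the optional `valid` mask are assumptions, since the text does not say which pixels (for example, non-occluded ones only) enter the statistic.

```python
import numpy as np

def correct_rate(disparity, ground_truth, threshold=1.0, valid=None):
    """Percentage of evaluated pixels whose disparity error is below the threshold."""
    disparity = np.asarray(disparity, dtype=np.float64)
    ground_truth = np.asarray(ground_truth, dtype=np.float64)
    if valid is None:
        # Assumption: every pixel with a finite ground-truth value is evaluated.
        valid = np.isfinite(ground_truth)
    correct = np.abs(disparity - ground_truth) < threshold
    return 100.0 * np.count_nonzero(correct & valid) / np.count_nonzero(valid)

estimated = [[10.0, 12.4, 30.0], [7.0, 7.0, 7.0]]
truth = [[10.0, 12.0, 33.0], [7.0, 8.0, 7.9]]
print(correct_rate(estimated, truth))  # 4 of 6 pixels are within one pixel
```

An error of exactly one pixel counts as incorrect here, matching the "less than one pixel" wording.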

Tab.4 Correct-Match Rates of Different Methods on the Cones and Teddy Image Pairs of the Middlebury Dataset/%

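3 Conclusions

Aerial images feature complex scenes and large image formats. For such images, this paper proposes a dense matching algorithm based on the fast guided filter that is fast and performs well in areas of repetitive texture and depth discontinuity. Compared with other local matching methods, it is faster and gives better matching quality at depth discontinuities and in repetitive textures; compared with semi-global and global methods, it offers a clear speed advantage while maintaining comparable matching quality.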

References

[1]	HE Yuhang, YUE Jun. Research and Implementation of a Multi-view Dense Matching Method Based on CMVS/PMVS[J]. Journal of Geomatics, 2013, 38(3): 20-23.
[2]	Scharstein D, Szeliski R, Zabih R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms[C]. Proceedings IEEE Workshop on Stereo and Multi-baseline Vision, Kauai, HI, USA, 2001
[3]	Dall'Asta E, Roncella R. A Comparison of Semiglobal and Local Dense Matching Algorithms for Surface Reconstruction[J]. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2014, 40-5: 187-194.
[4]	Besse F, Rother C, Fitzgibbon A, et al. PMBP: PatchMatch Belief Propagation for Correspondence Field Estimation[J]. International Journal of Computer Vision, 2014, 110(1): 2-13. DOI:10.1007/s11263-013-0653-9
[5]	Tombari F, Mattoccia S, di Stefano L, et al. Classification and Evaluation of Cost Aggregation Methods for Stereo Correspondence[C]. 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 2008
[6]	Fusiello A, Roberto V, Trucco E. Symmetric Stereo with Multiple Windowing[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2000, 14(8): 1053-1066. DOI:10.1142/S0218001400000696
[7]	Kang S B, Szeliski R, Chai J X. Handling Occlusions in Dense Multi-view Stereo[C]. 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 2001
[8]	Okutomi M, Katayama Y. A Simple Stereo Algorithm to Recover Precise Object Boundaries and Smooth Surfaces[C]. Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision, Kauai, HI, USA, 2001
[9]	Boykov Y, Veksler O, Zabih R. A Variable Window Approach to Early Vision[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(12): 1283-1294. DOI:10.1109/34.735802
[10]	Gerrits M, Bekaert P. Local Stereo Matching with Segmentation-Based Outlier Rejection[C]. The 3rd Canadian Conference on Computer and Robot Vision, Quebec, Canada, 2006
[11]	Veksler O. Stereo Matching by Compact Windows via Minimum Ratio Cycle[C]. Proceedings 8th IEEE International Conference on Computer Vision, Vancouver, BC, Canada, 2001
[12]	Barnes C, Shechtman E, Finkelstein A, et al. PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing[J]. ACM Transactions on Graphics, 2009, 28(3): 24.
[13]	Gruen A. Adaptive Least Squares Correlation: A Powerful Image Matching Technique[J]. South African Journal of Photogrammetry, Remote Sensing and Cartography, 1985, 14(3): 175-187.
[14]	Lu J B, Li Y, Yang H S, et al. PatchMatch Filter: Edge-Aware Filtering Meets Randomized Search for Visual Correspondence[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(9): 1866-1879. DOI:10.1109/TPAMI.2016.2616391
[15]	Heise P, Klose S, Jensen B, et al. PM-Huber: PatchMatch with Huber Regularization for Stereo Matching[C]. 2013 IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 2013
[16]	Li L C, Zhang S L, Yu X, et al. PMSC: PatchMatch-Based Superpixel Cut for Accurate Stereo Matching[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(3): 679-692. DOI:10.1109/TCSVT.2016.2628782
[17]	Xu S B, Zhang F H, He X F, et al. PM-PM: PatchMatch with Potts Model for Object Segmentation and Stereo Matching[J]. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society, 2015, 24(7): 2182-2196. DOI:10.1109/TIP.2015.2416654
[18]	Hosni A, Bleyer M, Gelautz M. Secrets of Adaptive Support Weight Techniques for Local Stereo Matching[J]. Computer Vision and Image Understanding, 2013, 117(6): 620-632. DOI:10.1016/j.cviu.2013.01.007
[19]	Yoon K J, Kweon I S. Adaptive Support-Weight Approach for Correspondence Search[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 650-656. DOI:10.1109/TPAMI.2006.70
[20]	Hosni A, Bleyer M, Gelautz M, et al. Local Stereo Matching Using Geodesic Support Weights[C]. 16th IEEE International Conference on Image Processing, Cairo, Egypt, 2009
[21]	Rhemann C, Hosni A, Bleyer M, et al. Fast Cost-Volume Filtering for Visual Correspondence and Beyond[C]. 2011 IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA, 2011
[22]	Lukežic A, Vojír T, Zajc L C, et al. Discriminative Correlation Filter with Channel and Spatial Reliability[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017
[23]	Zhang K, Fang Y Q, Min D B, et al. Cross-Scale Cost Aggregation for Stereo Matching[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(5): 965-976. DOI:10.1109/TCSVT.2015.2513663
[24]	Ma H, Zheng S Y, Li C, et al. Cross-Scale Cost Aggregation Integrating Intrascale Smoothness Constraint with Weighted Least Squares in Stereo Matching[J]. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 2017, 34(4): 648-656. DOI:10.1364/JOSAA.34.000648
[25]	Kitagawa M, Shimizu I, Sara R. High Accuracy Local Stereo Matching Using DoG Scale Map[C]. 2017 Fifteenth IAPR International Conference on Machine Vision Applications(MVA), Nagoya, Japan, 2017
[26]	Tombari F, Mattoccia S, Stefano L. Segmentation-Based Adaptive Support for Accurate Stereo Correspondence[C]. Pacific-Rim Symposium on Image and Video Technology, Santiago, Chile, 2007
[27]	Min D B, Lu J B, Do M N. Joint Histogram-Based Cost Aggregation for Stereo Matching[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(10): 2539-2545. DOI:10.1109/TPAMI.2013.15
[28]	Lowe D G. Distinctive Image Features from Scale-Invariant Keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110. DOI:10.1023/B:VISI.0000029664.99615.94
[29]	Scharstein D, Szeliski R, Zabih R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms[J]. Proceedings IEEE Workshop on Stereo and Multi-baseline Vision(SMBV 2001), 2001, 131-140.
[30]	Taniai T, Matsushita Y, Sato Y, et al. Continuous 3D Label Stereo Matching Using Local Expansion Moves[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(11): 2725-2739. DOI:10.1109/TPAMI.2017.2766072
[31]	Yamaguchi K, McAllester D, Urtasun R. Efficient Joint Segmentation, Occlusion Labeling, Stereo and Flow Estimation[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014

