Precision Work-piece Detection and Measurement Combining Top-down and Bottom-up Saliency
 International Journal of Automation and Computing  2018, Vol. 15 Issue (4): 417-430
Jia Sun1,2, Peng Wang1, Yong-Kang Luo1, Gao-Ming Hao1, Hong Qiao1
1 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;
2 University of Chinese Academy of Sciences, Beijing 100190, China
Abstract: In this paper, a fast and accurate work-piece detection and measurement algorithm is proposed based on top-down feature extraction and bottom-up saliency estimation. Firstly, a top-down feature extraction method based on prior knowledge of the work-pieces is presented, in which the contour of a work-piece is chosen as the major feature and the corresponding edge template is created. Secondly, a bottom-up salient region estimation algorithm is proposed, in which the image boundaries are labelled as background queries and the salient region is detected by computing the contrast against the image boundary. Finally, the calibration method for a vision system with a telecentric lens is discussed, and the dimensions of the work-pieces are measured. In addition, strategies such as image pyramids and a stopping criterion are adopted to speed up the algorithm. An automatic system embedding the proposed detection and measurement algorithm combining top-down and bottom-up saliency (DM-TBS) is designed to pick out defective work-pieces without any manual assistance. Experimental results demonstrate the effectiveness of the proposed method.
Key words: Work-pieces detection     salient region estimation     top-down and bottom-up saliency (TBS)     calibration     visual measurement
1 Introduction

Precision work-pieces are widely used in many industrial areas, such as consumer electronic products, medical instruments, and aeronautic vehicles. The quality of the work-pieces directly affects the functionality and reliability of the products. In advanced manufacturing, the production of precision work-pieces is characterized by great variety, large quantity, short cycle time, and high quality[1, 2]. As quality demands rise, 100% inspection is required for some work-pieces. Meanwhile, the demand for efficient detection and measurement of such a large quantity of items is increasing constantly. However, most quality control is still conducted manually, which suffers from drawbacks such as low consistency, low efficiency, and growing labor shortages. Therefore, automatic detection and measurement techniques for precision work-pieces are attracting more and more attention and are worth in-depth investigation[3].

The reported work-piece detection and measurement approaches can be classified into two categories, i.e., contact methods based on a coordinate measuring machine (CMM)[4, 5] and non-contact methods based on laser sensors[6, 7]. Contact-based systems normally consist of a CMM and a touch-trigger probe, which acquires point features on the work-piece surface iteratively. The dimensional information of the work-piece is then obtained from the corresponding displacement differences. Although high precision can be achieved, these methods require auxiliary manual operations, which greatly decrease efficiency, and thus they can hardly handle batch detection tasks. Besides, contact may cause unexpected damage to the work-piece surface. To overcome these problems, non-contact methods serve as alternative solutions. For instance, in [6, 7], laser-based inspection systems were presented, which rely on laser triangulation by means of a laser stripe in metrological applications. These methods acquire object surface points rapidly, but the inspection precision is seriously limited by the resolution of the laser sensors.

With the rapid development of digital cameras and machine vision, another non-contact method, the vision-based measurement technique, has been widely researched. Digital cameras have the advantages of high resolution, compact dimensions and low cost. Therefore, vision-based measurement has become an important approach in metrological and inspection applications[8-12]. Sun et al.[9] presented a vision-based inspection system for the top, side and bottom surfaces of electric contacts. Different image pre-processing and inspection methods were proposed to detect surface defects, but a quantitative measuring method for electric contacts was not mentioned. Maddala et al.[10] reported a vision-based measurement approach, derived from a superimposed adaptable ring, for pill shape detection. An automatic surface defect detection system for mobile phone screen glass was introduced by Jian et al.[11], in which a contour-based registration method and an improved fuzzy c-means cluster (IFCM) algorithm were developed. Ghidoni et al.[12] proposed an automatic color inspection system for colored wires in electric cables. By means of a self-learning subsystem, the system carries out the inspection process automatically, requiring neither manual input from the operator nor loading of new data to the machine. This method works well only when the objects have obvious color features. The vision-based detection systems described above are non-invasive and can achieve high accuracy. However, no research has been reported in the literature on automatic detection of large numbers of precision work-pieces along a production batch. Moreover, a strong constraint that an automatic work-piece detection algorithm has to satisfy is real-time capability.

The automatic precision work-piece measurement task can generally be divided into two main sub-procedures: work-piece detection and dimensional measurement. Detection is the primary and essential step, particularly when every work-piece in the production batch must be detected. Missing one work-piece in the detection sequence leads to serious errors in the subsequent pick-up decisions. If the mistake is not discovered soon enough, all the work-pieces in the batch need to be retested. Therefore, high accuracy of the detection algorithm is very important in this application. Meanwhile, real-time detection requires the algorithm to be highly efficient. Many object detection algorithms have been studied for various applications[13-16]. Among these algorithms, template matching is particularly applicable to targets with prior knowledge, such as shape, size and texture. The object is found by computing the similarity between the model and the image at all relevant poses and accepting the poses whose similarity is higher than a given threshold. In [17], the similarity is defined based on the intensities of the template and the search image. This method can capture differences on the object surface, which can be used for defect detection. However, comparing all pixels is computationally expensive. Feature-based methods recognize the object in a more compact and efficient way than gray-value-based ones[11, 18]. Points, contours, polygons, or regions can be used as features in different applications. For example, in [18], edges and their orientations are used as features. The model, generated from one reference image of the object, consists of the shape and the corresponding gradient directions along this shape. This category of algorithms allows rapid computation and is therefore more efficient. However, for on-line detection of a considerable number of work-pieces, even higher detection efficiency is needed.

In order to further improve the speed of work-piece detection, visual saliency theories are introduced in this paper. Salient object detection, which is regarded as a pre-processing procedure in computer vision, is widely used in object detection, image retrieval and object recognition[19-21]. Introducing visual saliency theories is promising for real-time detection of enormous numbers of work-pieces. For decades, many research efforts have been devoted to establishing computational saliency models, of which there are mainly two kinds: top-down saliency models and bottom-up saliency models. Top-down salient object detection methods are task-driven and rely on supervised learning with prior knowledge of the object[19]. Bottom-up salient object detection methods are data-driven and rely on image features such as contour, contrast and texture[20, 21]. A detection method combining top-down and bottom-up saliency is presented in this paper, which takes advantage of both the high detection accuracy of top-down feature extraction and the high detection efficiency of bottom-up salient region detection.

The motivation of this paper is to design a detection and measurement algorithm for an on-line automatic detection and measurement system for precision work-pieces. A real-time work-piece detection and measurement algorithm combining top-down and bottom-up saliency (DM-TBS) is presented. In the proposed DM-TBS method, prior knowledge of the work-pieces is obtained by top-down salient feature extraction to improve the detection accuracy, and a target template is established based on these features for the subsequent detection. Then, bottom-up salient region detection based on background contrast is proposed for on-line images, which reduces the detection region and increases the efficiency of the algorithm. Finally, the camera with a telecentric lens is calibrated and the crucial dimensions of the work-pieces are measured. In summary, the main contributions of this paper include:

1) A real-time automatic detection and measurement system for precision work-pieces is designed.

2) A fast and accurate detection and measurement algorithm for work-pieces, named DM-TBS, is presented based on top-down and bottom-up saliency.

3) Practical and comparative experiments are conducted, and the results demonstrate that the accuracy and efficiency of the proposed algorithm and system can meet the requirements of manufacturing.

The rest of this paper is organized as follows. The detection task specification and the designed system are introduced in Section 2, where the whole detection and measurement process is also described. In Section 3, the techniques of precision work-piece detection are presented, and top-down feature extraction and bottom-up saliency detection are described in detail. Section 4 gives the method of work-piece measurement and presents the calibration of the visual system. The experiments and results are given in Section 5. Finally, this paper is concluded in Section 6.

High-precision three-dimensional metal work-pieces are extremely common in manufacturing. The size of these components is usually smaller than $10 {\rm mm}\times10 {\rm mm}\times5 {\rm mm}$ $(W\times{D}\times{H})$, with high accuracy requirements. Samples of precision work-pieces are shown in Fig. 1. However, for various reasons in the manufacturing process, the size error of a component may exceed the tolerance requirements. A defect in a crucial dimension, such as the size of the assembly holes on the bottom of the work-pieces, directly influences the performance and reliability of the final products. Detecting dimensional defects helps to discover systematic process problems, which is of great significance to economic efficiency and product quality. Therefore, an automatic and effective detection system is very important for quality control in industry. Metal work-pieces are strongly reflective, easily damaged, and difficult to grip. Contact measurement methods are not suitable for metal work-pieces with a surface coating. Methods based on laser sensors may lead to large errors because of the bright reflection from the surface. Therefore, vision-based measurement is adopted in this paper.

2.2 System design

The tasks of precision work-piece inspection are summarized as follows:

1) Locate the detection target in the work-pieces sequence.

2) Measure the crucial dimension of the current work-piece.

3) Eliminate the defective ones automatically.

The automatic precision work-piece measurement system is designed as shown in Fig. 2. It consists of a rotation disc conveyor, an automatic feeding machine, a locating device, a visual measurement unit (including cameras and the corresponding lighting systems), a pick-up mechanism, and a host computer. In the visual measurement unit, the lens system and the illumination play very important roles in high-quality image acquisition. Normal lens systems perform a perspective projection of the world, so an object produces a larger image when it is closer to the lens. In order to eliminate such perspective distortions, bilateral telecentric lenses are adopted in this measurement system. In addition, because of the strong reflection of metal objects, coaxial illumination matched to the telecentric lens system is used to reduce or prevent specular reflections and shadows.

Work-pieces are conveyed to the rotation disc conveyor by the automatic feeding machine. The locating device adjusts all work-pieces to almost the same position and pose on the disc. Carried by the rotation disc conveyor, the work-pieces arrive at the visual detection unit sequentially. The host computer controls the whole detection and measurement procedure, including component feeding, image capturing and image processing in the visual detection unit, and control of the pick-up mechanism that eliminates defective work-pieces. The relationship diagram of the host computer and external devices is shown in Fig. 3.

Fig. 3. Relationship diagram of host computer and external devices

2.3 Framework of proposed detection and measurement algorithm

The vision-based work-piece detection and measurement algorithm is composed of two main parts: work-piece detection and work-piece measurement. The workflow of the proposed algorithm is shown in Fig. 4. The work-piece detection part combines top-down feature extraction and bottom-up saliency detection. The work-piece measurement part consists of camera calibration and dimension measurement. In the next sections, the details of work-piece detection and measurement are discussed, respectively.

Fig. 4. Schematic block diagram of work-pieces detection and measurement

3 Work-pieces detection combining top-down and bottom-up saliency

Both accuracy and efficiency are required for on-line detection of enormous numbers of precision work-pieces. Detecting the target work-piece accurately is a prerequisite for the subsequent judgment of defective work-pieces. Meanwhile, the efficiency of the detection algorithm is the determining factor of system speed. In order to satisfy these two requirements at once, visual attention is introduced.

Humans face a tremendous amount of visual information in the surrounding world at every moment, which cannot be completely processed by the visual system. However, humans can scan the environment rapidly and find a target accurately because of the mechanism of visual attention. By identifying salient regions in the visual field, people allocate their attention to these regions when viewing complex natural scenes. For many decades, psychologists and computer scientists have done a great deal of research and presented a number of computational models of salient object detection, which are categorized as either top-down or bottom-up models. Top-down models apply prior knowledge of the object to achieve a highly accurate search. Bottom-up models analyze the basic features of the image to find the potential location of the object. In the task of work-piece detection, locating the object and its potential area is very important both for the subsequent precise position detection and for the speed-up needed to realize real-time detection. Therefore, the proposed DM-TBS method combines top-down and bottom-up saliency.

In this section, we first perform top-down feature extraction from prior knowledge of standard work-piece images, and then present the on-line bottom-up saliency detection method. Finally, the strategies for fast and accurate work-piece detection based on top-down feature extraction and bottom-up saliency detection are described in detail. The flow chart of the detection process is shown in Fig. 5.

3.1 Top-down feature extraction

Precision work-pieces carry rich prior knowledge, such as contours, shapes and sizes. Therefore, these features can be extracted from standard work-pieces off-line, and the template can be established from one or more images or from the computer aided design (CAD) model of a qualified object.

In images of precision work-pieces acquired on-line, contour characteristics are the most robust. Therefore, contours are adopted as the feature used to establish the template. Contour extraction is very important, especially for work-pieces with fillets. The fillet of a work-piece usually leads to fuzzy edges, which are difficult to extract[22, 23]. Fuzzy edges increase the computational complexity of the subsequent template creation step. In addition, they reduce the similarity measure between the template contour and the target contour. Therefore, an improved contour extraction method is discussed to obtain an accurate single-pixel contour. In order to improve the efficiency of contour extraction, top-down saliency detection is applied. Top-down saliency models are task-driven and rely on prior knowledge of the test image.

Firstly, in order to remove noise and enhance the contrast of the object in the image, image filtering and image enhancement are adopted in the pre-processing step. Median filtering is a nonlinear digital filtering technique. A median filter preserves important region contours and details while removing image noise. Then, histogram equalization[24] is applied to enhance the details of the image.

After the above pre-processing, top-down saliency detection is applied to reduce the region to be processed. Afterwards, the Canny operator is used to detect the original contour of the object[25]. Because of the fillet of the component, the contours extracted from the image are usually redundant and multiple, and contain false edges. In the next step of target searching, more contour pixels lead to higher computational complexity. Meanwhile, false contours reduce the accuracy of the search results. In order to obtain a single-pixel contour, morphological operations are employed to optimize the original contour extracted by the Canny operator. Firstly, dilation is used to merge the fuzzy edges. Then, by skeletonizing these edges and combining collinear ones, a one-pixel-wide contour is obtained.
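
The two pre-processing steps can be illustrated with a minimal pure-Python sketch. The function names `median_filter3` and `equalize_histogram`, the 3x3 window size, and the toy image are assumptions made for illustration; the paper does not state the window size or implementation it uses.

```python
from statistics import median

def median_filter3(img):
    """3x3 median filter on a list-of-rows gray image; border pixels are copied."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = int(median(window))
    return out

def equalize_histogram(img, levels=256):
    """Map gray values through the normalized cumulative histogram."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:          # constant image: nothing to stretch
        return [row[:] for row in img]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in img]

# Toy 5x5 image: a dark region (100), a brighter region (120), one noise pixel (255).
image = [[100, 100, 100, 120, 120] for _ in range(5)]
image[2][1] = 255
filtered = median_filter3(image)
enhanced = equalize_histogram(filtered)
```

In this toy case the isolated noise pixel is removed by the median filter while the step between the two regions is kept, and equalization stretches the two remaining gray levels to the full range.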

The optimized contour is consistent, integrative, and minimal, which is appropriate for template creating.

3.2 Bottom-up saliency detection based on background contrast

After an accurate template of the work-pieces has been established off-line, an efficient method of searching for this template in on-line images is needed. Searching for an object in the whole image is computationally expensive. Consequently, reducing the search region is a feasible way to decrease the computational complexity. Salient regions, which are distinct from the image background, can cover the potential area where the objects are located. Therefore, a saliency detection algorithm applied before object searching is discussed, which helps to improve the detection efficiency.

Bottom-up saliency detection is data-driven and relies on low-level image characteristics without prior knowledge about the object. Because the background is difficult to determine, existing methods calculate saliency from the contrast of a region to its local neighborhood or to the whole image. In our application, work-pieces are normally in the middle of the image. Therefore, the regions near the image boundary are considered as background.

First, in order to extract the image boundary, we use the simple linear iterative clustering (SLIC) superpixel segmentation method to over-segment the image[26]. Compared with the traditional grid-based image segmentation method, superpixel segmentation preserves the integrity of a salient object and reduces the number of image blocks. Then, the superpixels that touch the boundary of the image are considered as background elements, and we denote this set of elements as $B.$

After extracting image boundary $B$, the contrast against the set of image boundary superpixels is computed as the measurement of region saliency, which is defined as

 $$contrast({p_i}) = \sum\limits_{j \in B} {d({c_i}, {c_j})} \exp\left ( - \frac{{d({l_i}, {l_j})}}{{2\sigma _l^2}}\right)$$ (1)

where $d({c_i}, {c_j})$ is the Euclidean distance between the average colors of superpixels ${p_i}$ and ${p_j}$ in the color space, which describes the contrast based on gray-level characteristics, and $d({l_i}, {l_j})$ is the Euclidean spatial distance between superpixels ${p_i}$ and ${p_j}$. The parameter ${\sigma_l}$ controls the sensitivity of the contrast to the spatial distance.
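
Equation (1) can be sketched in a few lines of Python. The sketch assumes that the superpixels have already been computed and are summarized by a mean color and a normalized centroid each; the function name `boundary_contrast` and the toy values are hypothetical and only illustrate the formula.

```python
import math

def boundary_contrast(colors, centers, boundary, sigma_l):
    """Contrast of every superpixel against the boundary set B, following Eq. (1).

    colors[i]  : mean color (tuple) of superpixel i
    centers[i] : centroid (row, col) of superpixel i, normalized to [0, 1]
    boundary   : indices of the superpixels that touch the image border (set B)
    sigma_l    : controls how quickly the spatial weight decays
    """
    saliency = []
    for c_i, l_i in zip(colors, centers):
        total = 0.0
        for j in boundary:
            color_dist = math.dist(c_i, colors[j])       # d(c_i, c_j)
            spatial_dist = math.dist(l_i, centers[j])    # d(l_i, l_j)
            total += color_dist * math.exp(-spatial_dist / (2.0 * sigma_l ** 2))
        saliency.append(total)
    return saliency

# Toy example: three dark boundary superpixels and one bright central superpixel.
colors = [(10.0,), (12.0,), (11.0,), (200.0,)]
centers = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.5), (0.5, 0.5)]
boundary = [0, 1, 2]
scores = boundary_contrast(colors, centers, boundary, sigma_l=0.5)
most_salient = max(range(len(scores)), key=scores.__getitem__)
```

Here the central superpixel receives by far the largest contrast, so thresholding `scores` would keep it as the salient region and discard the boundary superpixels.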

3.3 Work-pieces detection combining top-down and bottom-up saliency

As described in the previous subsections, work-pieces carry rich prior knowledge. We apply the top-down feature extraction method to obtain the crucial features of the work-pieces, and a standard template is established based on these features. By calculating the similarity between the work-piece template and the images acquired on-line, the target work-piece can be detected by setting an appropriate threshold. In order to realize real-time detection, bottom-up salient region detection based on background contrast is applied to the on-line images. By reducing the detection region of the on-line images, the detection efficiency is further improved.

In this subsection, the calculation of the similarity between the template established by top-down feature extraction and the salient region of the search image is discussed. Then, a hierarchical search strategy based on image pyramids is presented to increase the speed of the algorithm[15].

The optimized contour of the template, which is generated from an image of the object, consists of a set of points ${p_i} = {({r_i}, {c_i})^{\rm T}}$ and corresponding direction vectors ${d_i} = {({t_i}, {u_i})^{\rm T}}$, $i = 1, 2, \cdots, n$. The direction vectors can be obtained by the Canny operator. In the search image, a direction vector ${e_{r, c}} = {({v_{r, c}}, {w_{r, c}})^{\rm T}}$ is calculated for each image point $(r, c)$. In order to find the template under rotation, a linearly transformed model is represented by the points $p'_i = A{p_i}$ and the direction vectors $d'_i = A{d_i}$, where

 \begin{align*}A = \left( {\begin{array}{*{20}{c}} {\cos \theta }&{ - \sin \theta }\\ {\sin \theta }&{\cos \theta } \end{array}} \right). \end{align*}

The similarity measure compares the transformed template with the image. It should be robust to occlusion, clutter, and nonlinear illumination changes. The similarity measure $s$, which meets these requirements, is computed as follows:

 \begin{align} s =& \frac{1}{n}\sum\limits_{i = 1}^n \frac{d_i'^{\rm T}e_{q+p'_i}}{\|d'_i\|\|e_{q+p'_i}\|}=\nonumber\\ & \frac{1}{n}\sum\limits_{i = 1}^n\frac{t'_iv_{r+r'_i, c+c'_i}+u'_iw_{r+r'_i, c+c'_i}}{\sqrt{t'^2_i+u'^2_i}\sqrt{v^2_{r+r'_i, c+c'_i}+w^2_{r+r'_i, c+c'_i}}} \end{align} (2)

where $d_i'^{\rm T}{e_{q + p'_i}}$ is the dot product of the direction vector of the transformed template and the direction vector of the image at the point $q + p'_i$, with $q = {(r, c)^{\rm T}}$. The similarity measure $s$ returns a number no greater than 1 as the matching result. When the template matches the image exactly, the score $s$ is equal to 1. By setting a threshold $s_{\min}$ on the similarity measure, it can be determined whether the template is found in the image. In practice, the similarity measure does not need to be evaluated completely; the evaluation can be terminated early. Let $s_j$ denote the partial sum of the dot products up to the $j$-th element of the template:

 $${s_j} = \frac{1}{n}\sum\limits_{i = 1}^j {\frac{{d_i'^{\rm T}{e_{q + p'_i}}}}{{\| {d'_i} \|\| {{e_{q + p'_i}}} \|}}}.$$ (3)

Because each normalized dot product is no more than 1, the remaining $n - j$ terms can add at most $\frac{n-j}{n} = 1 - \frac{j}{n}$ to the score. Hence, the final score can never reach the threshold $s_{\min}$ when $s_j < s_{\min} - 1 +\frac{ j }{ n}$, and the evaluation can be stopped at the $j$-th element. This stopping criterion speeds up the matching process considerably.
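
The similarity measure and its stopping criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the image direction vectors are stored in a hypothetical dictionary `image_dirs`, rotated template coordinates are rounded to the nearest pixel, and the partial sum is normalized by $n$ so that the stopping test $s_j < s_{\min} - 1 + j/n$ is valid.

```python
import math

def similarity(template_pts, template_dirs, image_dirs, q, theta, s_min):
    """Score a rotated contour template at image position q = (r, c).

    template_pts  : list of (r_i, c_i) offsets relative to the template origin
    template_dirs : list of (t_i, u_i) direction vectors at those points
    image_dirs    : dict mapping (row, col) -> (v, w) direction vector
    Returns (score, completed); completed is False if evaluation stopped early.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    n = len(template_pts)
    partial = 0.0
    for j, ((r_i, c_i), (t_i, u_i)) in enumerate(zip(template_pts, template_dirs), 1):
        # Rotate point and direction with A = [[cos, -sin], [sin, cos]].
        r_p = cos_t * r_i - sin_t * c_i
        c_p = sin_t * r_i + cos_t * c_i
        t_p = cos_t * t_i - sin_t * u_i
        u_p = sin_t * t_i + cos_t * u_i
        v, w = image_dirs.get((q[0] + round(r_p), q[1] + round(c_p)), (0.0, 0.0))
        norm = math.hypot(t_p, u_p) * math.hypot(v, w)
        if norm > 0.0:
            partial += (t_p * v + u_p * w) / norm / n
        # The remaining n - j terms can add at most (n - j) / n to the score.
        if partial < s_min - 1.0 + j / n:
            return partial, False
    return partial, True

# Toy template: four contour points on one row, all with direction (1, 0).
pts = [(0, 0), (0, 1), (0, 2), (0, 3)]
dirs = [(1.0, 0.0)] * 4
# The image contains the same contour at (5, 5), with a different gradient magnitude.
img = {(5, 5 + k): (2.0, 0.0) for k in range(4)}
score_hit, done_hit = similarity(pts, dirs, img, q=(5, 5), theta=0.0, s_min=0.8)
score_miss, done_miss = similarity(pts, dirs, {}, q=(5, 5), theta=0.0, s_min=0.8)
```

Because the dot products are normalized, the different gradient magnitude in the image does not change the score; on an empty image the evaluation is abandoned after the first template point.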

Evaluating the similarity measure over the entire image is very time-consuming, even if the stopping criterion discussed above is used. In order to gain a further speed-up, we can reduce the number of poses that need to be checked as well as the number of template points. An image pyramid scales down the image and the template repeatedly by a factor of 2 to create a multi-level data structure, as shown in Fig. 6.

In this paper, the mean filter is adopted to construct image pyramids. The hierarchical search strategy is explained as follows:

Algorithm 1. The hierarchical search strategy based on image pyramid.

Input: Template image ${\boldsymbol T}$, search image ${\boldsymbol M}$

Output: flag (1: object is found; 0: no object is found), position ${\boldsymbol P}$

1) Calculate an appropriate number of pyramid levels $N$

2) Generate $N$ template pyramid images $T_i (r_i, c_i, \theta_i)$, $i=1, 2, \cdots, N$

3) Generate $N$ search pyramid images ${\boldsymbol M}_i$, $i=1, 2, \cdots, N$

4) Initialize $i = N$, search region $R_i (r_i, c_i, \theta_i) = M_i$

5) for $i=N$; $i>0$; $i--$ do

6)     Calculate the similarity measure $s$ between $T_i (r_i, c_i, \theta_i)$ and $M_i$ in region $R_i$ as (2)

7)     if $s > s_{\min}$ then

8)         The object is found at the $i$-th level: $flag_i = 1$

9)         If $i > 1$, construct $R_{i-1} (r_{i-1}, c_{i-1}, \theta_{i-1})$ around the match in the next lower pyramid level to allow for the uncertainty in the location of the match

10)    else

11)        $flag_i = 0$; break

12)    end if

13) end for

14) if the object was found at every level down to $i=1$ then

15)     $\textbf{flag} = 1$, $P = (r_1, c_1, \theta_1)$

16) else

17)     $\textbf{flag} = 0$

18) end if

19) return flag, position ${\boldsymbol P}.$

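Using image pyramids can greatly improve the computational efficiency. For example, if the complete matching is performed on level 4 instead of on the original image (level 1), the amount of computation is reduced by a factor of 4 096: each level reduces both the number of image points and the number of template points by a factor of about 4, i.e., by a factor of 16 per level, and $16^3 = 4 096$.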
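
A minimal sketch of pyramid construction and of the speed-up arithmetic is given below. It assumes that the mean filter averages non-overlapping $2\times2$ blocks and that level 1 is the original image; the function names `mean_pyramid` and `speedup_factor` are hypothetical.

```python
def mean_pyramid(image, levels):
    """Build an image pyramid by averaging non-overlapping 2x2 blocks.

    image is a list of rows; pyramid[0] is level 1 (the original image).
    """
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        rows, cols = len(prev) // 2, len(prev[0]) // 2
        pyramid.append([
            [(prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
              + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)
        ])
    return pyramid

def speedup_factor(level):
    """Speed-up of exhaustive matching on `level` relative to level 1,
    assuming image points and template points each shrink 4x per level."""
    return 16 ** (level - 1)

image = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
pyr = mean_pyramid(image, levels=4)
sizes = [(len(p), len(p[0])) for p in pyr]   # 16x16, 8x8, 4x4, 2x2
factor = speedup_factor(4)                   # 16**3
```

The exhaustive search is run only on the coarsest level; the finer levels are searched in a small region around the projected match, which is why the overall cost stays close to that of the coarse-level search.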

4 Measurement of dimension of work-pieces

4.1 Calibration of camera system with telecentric lens

The measurement of precision work-pieces demands high accuracy. Because of its purely orthographic projection, constant magnification and very small distortion, a telecentric lens system is chosen for the proposed measurement system. In order to obtain the actual dimensions of the work-pieces, the camera system with the telecentric lens needs to be calibrated first. In this section, we present a convenient and practical method for calibrating the telecentric lens system.

In a bilateral telecentric lens system, two lenses are located on either side of a small aperture stop. The distance between the two lenses is the sum of their focal lengths, and the aperture stop is placed exactly in the common focal plane between the two lenses. Therefore, both the incident rays and the emergent rays that pass through the system are approximately parallel to the optical axis of the lens. The magnification of a bilateral telecentric lens system determines the scale factor $k$ between image and world coordinates; therefore, it must be calibrated precisely.

In visual measurement, the relative distance between two points is more practical than the absolute positions of the points. Therefore, we can use points with known distances to calibrate the visual system. Suppose there are $n+1$ points in the work plane. With one point set as the reference point, the image differences and the space differences between the remaining $n$ points and the reference point have the following relationship:

 $$\left[ {\begin{array}{*{20}{c}} {\Delta {x_{wi}}}\\ {\Delta {y_{wi}}}\\ 1 \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} k&0&0\\ 0&k&0\\ 0&0&1 \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {\Delta {u_i}}\\ {\Delta {v_i}}\\ 1 \end{array}} \right], \quad i=1, 2, \cdots, n.$$ (4)

In order to facilitate the practical application, we use the distance between two points to calculate the parameter k:

 $$k = \frac{{\sqrt {\Delta {x_w}^2 + \Delta {y_w}^2} }}{{\sqrt {\Delta {u^2} + \Delta {v^2}} }}.$$ (5)
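
The calibration in (5) can be sketched as follows. Averaging the ratio over several point pairs is an assumption of this sketch (the text only gives the two-point formula), and the function name `calibrate_scale` and the grid coordinates are hypothetical.

```python
import math

def calibrate_scale(world_pts, image_pts):
    """Estimate k (world units per pixel) from corresponding points, per Eq. (5).

    The first entry of each list is the reference point; k is averaged over
    the remaining n points.
    """
    (xw0, yw0), (u0, v0) = world_pts[0], image_pts[0]
    ratios = []
    for (xw, yw), (u, v) in zip(world_pts[1:], image_pts[1:]):
        world_dist = math.hypot(xw - xw0, yw - yw0)   # sqrt(dx_w^2 + dy_w^2)
        pixel_dist = math.hypot(u - u0, v - v0)       # sqrt(du^2 + dv^2)
        ratios.append(world_dist / pixel_dist)
    return sum(ratios) / len(ratios)

# Hypothetical calibration target: positions in mm, imaged at 100 pixels per mm.
world = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
image = [(50.0, 60.0), (150.0, 60.0), (50.0, 160.0), (350.0, 460.0)]
k = calibrate_scale(world, image)   # mm per pixel
```

With this convention, multiplying any pixel distance by `k` gives the corresponding distance in the work plane.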
4.2 Measurement for dimension of work-pieces

In industrial manufacturing, precision work-pieces need to be assembled with other components. The error in a crucial dimension, such as the size of the assembly holes on the bottom of the work-pieces, directly influences the performance and reliability of the products. Therefore, it is of great significance to measure these crucial dimensions precisely.

Taking the assembly holes on the work-pieces as an example, the measurement steps are as follows. First, segment the object from the image. Second, extract the contour of the hole and remove obvious outliers. Finally, fit a circle with center $(\alpha, \beta)$ and diameter $d$ to the contour points $(r_i, c_i)$ by minimizing the sum of the squared distances of the contour points to the circle:

 $${\varepsilon ^2} = \sum\limits_{i = 1}^n {{{\left( {\sqrt {{{({r_i} - \alpha )}^2} + {{({c_i} - \beta )}^2}} - \frac{d}{2}} \right)}^2}}.$$ (6)

Since the visual system has been calibrated in advance, the actual size of the hole can be obtained as

 $$D = k\times{d}$$ (7)

where D is the actual diameter of the hole and d is the pixel diameter of the hole.
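
The fitting and conversion steps can be sketched as follows. Note the swapped-in technique: instead of minimizing the geometric error (6) directly, which needs an iterative solver, this sketch uses the algebraic (Kasa) least-squares fit, which has a closed-form solution and is commonly used as a starting value for the geometric fit. The function names and the synthetic contour are hypothetical.

```python
import math

def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [b] for row, b in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for j in range(col, 4):
                a[r][j] -= f * a[col][j]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][j] * x[j] for j in range(r + 1, 3))) / a[r][r]
    return x

def fit_circle(points):
    """Algebraic circle fit: least squares on x^2 + y^2 + a*x + b*y + c = 0.

    Returns (alpha, beta, d): center coordinates and diameter in pixels.
    """
    rows = [(x, y, 1.0) for x, y in points]
    z = [-(x * x + y * y) for x, y in points]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(3)]
    a, b, c = solve3(m, rhs)
    alpha, beta = -a / 2.0, -b / 2.0
    d = 2.0 * math.sqrt(alpha ** 2 + beta ** 2 - c)
    return alpha, beta, d

# Synthetic hole contour: center (120, 80), radius 50 pixels, 12 samples.
contour = [(120.0 + 50.0 * math.cos(2 * math.pi * i / 12),
            80.0 + 50.0 * math.sin(2 * math.pi * i / 12)) for i in range(12)]
alpha, beta, d = fit_circle(contour)
k_scale = 0.01            # assumed calibration result, mm per pixel
D = k_scale * d           # actual diameter in mm, per Eq. (7)
```

On noisy contours the algebraic fit is slightly biased toward smaller radii, so a few Gauss-Newton iterations on (6) started from this result would be a natural refinement.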

4.3 Measurement procedure of work-pieces dimension

To measure the dimension of a work-piece, the object detection process is performed first. Then, by extracting and fitting the contours of the work-piece, its dimension in pixels is obtained. Finally, the pixel result is transformed to the actual dimension by the calibration parameter. The complete procedure of the precision work-piece detection and measurement algorithm combining top-down and bottom-up saliency is described as follows.

Algorithm 2. Precision work-piece detection and measurement combining top-down and bottom-up saliency.

Input: Template image $f_t (r , c)$, search image $M(r , c)$

Output: flag (1: object is found; 0: no object is found), diameter $D$

1) a) Top-down feature extraction

2) Pre-process $f_t (r , c)$

3) Canny edge detection $\rightarrow$ $f^{(c)}(r , c)$

4) Optimize edges $\rightarrow$ $f^{(p)}(r , c)$

5) Establish template edges $\rightarrow$ $T(r, c)$

6) b) Bottom-up saliency detection

7) Superpixel segmentation of ${\boldsymbol M}(r , c)$ $\rightarrow$ $p_i$, $i=1, 2, \cdots, N$

8) Image boundary set ${\boldsymbol B}=\{b_i\}$

9) for each superpixel $p_i$

10)    Calculate the region saliency as (1)

11) end for

12) Get salient region $R_c$

13) c) Work-pieces searching

14) Initialize pyramid layer $i = N$

15) Search region $R_i (r_i, c_i, \theta_i) = R_c$

16) for $i=N$; $i>0$; $i--$

17)     Calculate $s$ between $T_i (r_i, c_i, \theta_i)$ and $M_i$ in region $R_i$ as (2)

18)     if $s > s_{\min}$

19)         The object is found at the $i$-th layer: $flag_i = 1$

20)     else

21)         $flag_i = 0$; break

22)     end if

23) end for

24) if the object was found at every layer down to $i=1$ then

25)     $\textbf{flag} = 1$, $P = (r_1, c_1, \theta_1)$

26) else

27)     $\textbf{flag} = 0$

28) end if

29) d) Crucial dimension measurement

30) Camera calibration $\rightarrow$ $k$

31) Circle fitting as (6) $\rightarrow$ pixel diameter $d$

32) Calculate the actual diameter $D=k\times d$

33) return flag, diameter $D$.

5 Experiments and results

5.1 Experiment system

According to the system designed in Section 2, an experimental system was established, as shown in Fig. 7. The vision unit consisted of a charge coupled device (CCD) camera and a telecentric lens system with a magnification of 0.057x to 0.5x. The CCD camera was an MER-200-2-GM/GC, which captured images at a rate of 15 frames per second with an image size of $1628\times1236$ pixels. The camera was placed perpendicular to the detection platform. The coaxial collimated lighting source was placed between the camera and the component to be detected. The CPU of the host computer was an Intel Core i7-6700 with a frequency of 3.4 GHz.

5.2 Feature extraction experiments

Samples of the precision work-pieces to be detected are shown in Fig. 1. In the following experiments, we chose the first type of sample, which has a hole in the bottom of the work-piece.

In order to establish the template contour, we applied the Canny detector to extract the original contour from the template image. The parameters of the Canny operator were $\alpha = 1$, $threshold_{low} = 20$, $threshold_{high} = 40$. As shown in Fig. 8 (b), the original contour was redundant, consisted of multiple lines, and contained false edges. Dilation and skeletonization were then applied to the original contour as described in Section 3.1. The optimized contour was consistent, complete, and minimal, as shown in Fig. 8 (c).

5.3 Saliency detection experiments

The number of pixels in the search image directly influences the computational complexity of work-piece detection, so reducing the search region is a practical way to increase matching efficiency. Therefore, the salient region detection method described in Section 3.2 was implemented. First, the input image was over-segmented with the superpixel segmentation method. The superpixel size was set to 600 pixels, and the resulting image elements are shown in Fig. 9 (b). Then, we extracted the boundary of the image (as shown in Fig. 9 (c)) and computed the contrast saliency against these boundary elements. Finally, the salient region was segmented as shown in Fig. 9 (d).
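The idea of scoring each image element by its contrast against the image boundary can be illustrated with a small Python sketch. It rests on simplifying assumptions that are not from the paper: a regular grid of elements stands in for real superpixels, each element is described only by its mean gray value, and saliency is taken as the mean absolute difference to all boundary elements. The function names and the threshold are hypothetical.

```python
from typing import List


def boundary_contrast_saliency(features: List[List[float]]) -> List[List[float]]:
    """Saliency of each element = mean absolute contrast to boundary elements.

    features[r][c] is the mean gray value of one image element. A regular
    grid is used here in place of real superpixels (assumption).
    """
    rows, cols = len(features), len(features[0])
    # Elements touching the image border act as background queries.
    boundary = [
        features[r][c]
        for r in range(rows)
        for c in range(cols)
        if r in (0, rows - 1) or c in (0, cols - 1)
    ]
    return [
        [sum(abs(value - b) for b in boundary) / len(boundary) for value in row]
        for row in features
    ]


def salient_mask(saliency: List[List[float]], threshold: float) -> List[List[bool]]:
    """Keep the elements whose saliency exceeds the threshold."""
    return [[s > threshold for s in row] for row in saliency]


# Toy 4 x 4 grid: dark background (30) with a bright object (200) in the centre.
grid = [
    [30, 30, 30, 30],
    [30, 200, 200, 30],
    [30, 200, 200, 30],
    [30, 30, 30, 30],
]
mask = salient_mask(boundary_contrast_saliency(grid), threshold=100)
```

In this toy grid only the four central elements are kept, so a subsequent template search would be confined to a quarter of the image.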

5.4 Work-pieces detection experiments

Practical experiments on the work-piece detection algorithm presented in Section 3 are reported in this section. Two measures were adopted to evaluate the performance of our method: algorithm runtime and similarity score. Runtime is the computation time of the algorithm, and smaller values mean higher efficiency. The similarity score described in Section 3.3 is a percentage, and higher values indicate better matches. Two main parameters influence the performance of the DM-TBS algorithm. One is the number of pyramid layers used in the work-piece search. The other is the window size of the dilation operation used in template establishment.

As introduced in the previous section, an image pyramid can reduce the computational complexity. Ten work-pieces were chosen as samples for detection experiments with the template established previously. For each sample, seven independent trials were performed with different numbers of pyramid layers, from 1 layer to 7 layers. The runtimes of the algorithm with different numbers of pyramid layers were recorded, as shown in Table 1.

Table 1
Runtimes of algorithm with different pyramid layers (ms)

To display the results visually, the algorithm runtimes with different numbers of pyramid layers are plotted in Fig. 10 (a), and the average runtimes of the ten samples are shown in Fig. 10 (b). As can be seen from Figs. 10 (a) and (b), when 1 pyramid layer was chosen, the runtimes were up to 2 000 ms. When 2 pyramid layers were employed, the runtimes dropped significantly, to approximately 100 ms. From 3 to 5 layers, the runtimes still declined slightly, and beyond 5 layers they remained nearly constant. This trend indicates that the image pyramid improves the efficiency of the algorithm effectively, but that excessive layering cannot reduce the runtime further. Instead, it may cause the shape features of the object to be lost in the top layer, so that the matching fails. Therefore, it is important to choose an appropriate number of image pyramid layers. In this experiment, 4 layers were considered suitable.
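A short calculation shows why the top layers are cheap to search and why too many layers erase shape detail. The sketch below assumes that each pyramid layer halves both image dimensions, which is a common convention but is not stated in the text; `pyramid_shapes` is an illustrative helper, not part of DM-TBS.

```python
def pyramid_shapes(width: int, height: int, num_layers: int):
    """Return (width, height) of every layer, layer 1 being the full image.

    Assumes each layer halves both dimensions (the downsampling factor is
    an assumption, not taken from the paper).
    """
    shapes = [(width, height)]
    for _ in range(num_layers - 1):
        width, height = max(1, width // 2), max(1, height // 2)
        shapes.append((width, height))
    return shapes


# Image size used in the experiments: 1628 x 1236 pixels.
for layer, (w, h) in enumerate(pyramid_shapes(1628, 1236, 7), start=1):
    print(f"layer {layer}: {w} x {h} = {w * h} pixels")
```

Under this assumption, layer 4 contains roughly 64 times fewer pixels than the full image, while layer 7 is only about 25 pixels wide, which leaves little of a work-piece contour to match.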

Fig. 10. Results of algorithm runtime with different pyramid layers

The relationship between the similarity score and the number of pyramid layers was also examined in 7 groups of trials. The results are shown in Table 2. The similarity scores with different numbers of pyramid layers are plotted in Fig. 11 (a), and the average similarity score for each number of layers is shown in Fig. 11 (b).

Table 2
Similarity scores with different pyramid layers (%)

Fig. 11. Results of similarity scores with different pyramid layers

As can be seen in Table 2, the maximum similarity score is 88.63 %, and the minimum is 62.40 %. Fig. 11 (a) shows that each sample had almost the same score for every number of pyramid layers. The differences among the results are therefore caused by the shape of each sample, not by the number of pyramid layers, which has only a limited influence on the similarity score.

As introduced in the previous section, the other factor that influences the performance of the algorithm is the size of the dilation window. In the template establishment stage, the size of the dilation window directly determines the contour of the template. Therefore, ten work-pieces were selected as samples to be matched against the template. Five templates were made from the same image with different dilation window sizes: $4\times4$, $6\times6$, $8\times8$, $10\times10$ and $12\times12$. The runtimes of the algorithm with the different dilation window sizes were recorded, as shown in Table 3. In these experiments, the number of pyramid layers was set to 4.

Table 3
Runtimes with different size of dilation window (ms)

To display the results visually, the algorithm runtimes with different dilation window sizes are plotted in Fig. 12 (a), and the average runtimes of the ten samples are shown in Fig. 12 (b). As can be seen in Table 3, the maximum runtime is 4.75 milliseconds, and the minimum is 3.11 milliseconds, so the maximum difference among all these experiments is 1.64 milliseconds. Figs. 12 (a) and (b) show that the runtime does not simply scale with the size of the dilation window. Therefore, the size of the dilation window has little effect on the efficiency of the algorithm.

Fig. 12. Results of algorithm runtime with different size of dilation window

The relationship between the similarity score and the size of the dilation window was also examined in 5 groups of trials. For each group, ten work-pieces were matched against the templates established from the same image with different dilation window sizes. The sizes of the dilation windows were $4\times4$, $6\times6$, $8\times8$, $10\times10$ and $12\times12$. The results are listed in Table 4.

Table 4
Similarity scores of template matching with different sizes of dilation window (%)

The similarity scores for different dilation window sizes are plotted in Fig. 13 (a), and the average similarity scores of the ten samples are shown in Fig. 13 (b). The maximum value for each sample is marked in Table 4. For 7 of the 10 samples, the maximum appears at the dilation window size of $6\times6$. The same conclusion can be drawn from Fig. 13 (b), where the average similarity score peaks at a dilation window size of $6\times6$. The samples used in the experiments were work-pieces with a fillet whose width was 5 to 6 pixels, which explains why the $6\times6$ dilation window optimizes the fuzzy edges best. A smaller dilation window cannot merge all the small edges, while a bigger dilation window might merge false edges (such as the edges of surface defects) into the real one. Therefore, the size of the dilation window should be chosen close to the width of the work-piece fillet.
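The link between fillet width and dilation window size can be reproduced on a one-dimensional toy profile. In the sketch below, the two edge responses of a fillet are placed 6 pixels apart, an assumption chosen to be consistent with the reported fillet width of 5 to 6 pixels. `dilate_row` and `count_segments` are illustrative helpers written for this example, not the implementation used in the experiments.

```python
def dilate_row(row, window):
    """Binary dilation of a 1-D edge profile with a flat window of given width."""
    left = window // 2
    right = window - 1 - left
    n = len(row)
    return [
        1 if any(row[j] for j in range(max(0, i - left), min(n, i + right + 1))) else 0
        for i in range(n)
    ]


def count_segments(row):
    """Number of separate runs of edge pixels in the profile."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))


# Cross-section through a fillet: two edge responses, 6 pixels apart (assumed).
profile = [0] * 30
profile[10] = 1
profile[16] = 1

segments_by_window = {
    window: count_segments(dilate_row(profile, window)) for window in (4, 6, 8)
}
```

With a window of 4 the two responses stay separate, whereas windows of 6 and 8 merge them into a single band that skeletonization can then thin to one contour line. The larger the window, however, the farther away a spurious edge can be and still be absorbed into that band.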

Fig. 13. Results of similarity scores with different size of dilation window

5.5 Work-pieces detection comparative experiments

In order to analyze the efficiency of the proposed DM-TBS algorithm, comparative experiments were performed with two other detection algorithms: the gray-value based method in [17] and the contour-based method in [11]. Ten work-pieces were selected and detected with each method. The algorithm runtimes are listed in Table 5, and the corresponding curves are shown in Fig. 14.

Table 5
Comparison results of different methods runtimes (ms)

Fig. 14. Curves of comparison results of different methods runtimes

As can be seen in Table 5 and Fig. 14, the average runtimes of the method in [17], the method in [11] and ours are 9.07 ms, 4.215 ms and 4.094 ms, respectively. The contour-based method in [11] is faster than the gray-value based method in [17], because the contour of an object contains far fewer pixels than its surface. In addition, our method reduces the search region by applying top-down and bottom-up saliency detection, so the computation efficiency is further improved.

The corresponding matching similarity scores are listed in Table 6, and the curves of the comparison results are shown in Fig. 15.

Table 6
Comparison results of different similarity scores (%)

Fig. 15. Curves of comparison results of different methods similarity scores

As shown in Table 6 and Fig. 15, the average similarity scores of the method in [17], the method in [11] and ours are 61.684 %, 66.028 % and 76.471 %, respectively. The results demonstrate that our method has the highest similarity score. If 60 % is set as the threshold for deciding whether an object has been found, the success rates of work-piece detection are 60 %, 70 % and 100 %, respectively. The experimental results therefore illustrate that the method proposed in this paper combines high computation speed with high matching quality.
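The success rate used above is simply the fraction of samples whose similarity score exceeds the threshold. A minimal Python sketch follows; the function name is illustrative, and the scores in the usage line are made-up placeholders, not the values of Table 6.

```python
def success_rate(scores, threshold=60.0):
    """Fraction of samples whose similarity score (in %) exceeds the threshold."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(score > threshold for score in scores) / len(scores)


# Hypothetical scores for four samples (placeholders, not data from the paper).
rate = success_rate([55.0, 61.2, 70.0, 59.9], threshold=60.0)
```

Because the rate depends on where the threshold is placed, it is best read together with the average similarity scores rather than on its own.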

5.6 Measurement experiments for dimension of work-pieces

According to the measurement method presented in Section 4, the camera system with the telecentric lens should be calibrated first. The pixel equivalent of the camera was obtained with a microcalliper as follows:

 \begin{align} k = &\frac{1}{n}\sum\limits_{i = 1}^n \frac{\sqrt{\Delta x_{Wi}^2 + \Delta y_{Wi}^2}}{\sqrt{\Delta u_i^2 + \Delta v_i^2}} =\nonumber\\ & \frac{1}{n}\sum\limits_{i = 1}^n \frac{1}{\Delta d_i} = 0.008\,977 \end{align} (8)

where $\Delta d_i$ is the pixel distance of the $i$-th scale span. Each scale span has unit physical length, so the numerator reduces to 1.
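The averaging in (8) can be written as a few lines of Python. This is a sketch under stated assumptions: the function name and the argument layout (one physical displacement and one pixel displacement per scale span) are chosen here for illustration, and the numbers in the usage example are synthetic, not calibration data from the experiment.

```python
import math


def pixel_equivalent(world_spans, pixel_spans):
    """Mean ratio of physical length to pixel length over all scale spans.

    world_spans: list of (dx_w, dy_w) physical displacements of each span
    pixel_spans: list of (du, dv) pixel displacements of the same spans
    """
    if not world_spans or len(world_spans) != len(pixel_spans):
        raise ValueError("need the same non-zero number of world and pixel spans")
    ratios = [
        math.hypot(dx, dy) / math.hypot(du, dv)
        for (dx, dy), (du, dv) in zip(world_spans, pixel_spans)
    ]
    return sum(ratios) / len(ratios)


# Synthetic example: two unit-length spans imaged as 100 and 125 pixels.
k_example = pixel_equivalent([(1.0, 0.0), (0.0, 1.0)], [(100.0, 0.0), (0.0, 125.0)])
```

Averaging over several spans reduces the influence of the localization error of any single scale mark on the resulting pixel equivalent.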

In this experiment, the diameter of the assembly hole on the bottom of the work-piece was selected as the crucial dimension and was measured using the method described in Section 4. Ten work-pieces were chosen as samples, and the ground truth was obtained with a micrometer. The measurement results are listed in Table 7.

Table 7
Results of hole diameter on work pieces

Figs. 16 (a) and (b) show the absolute errors and the relative errors of the measurement results, respectively. From Table 7, the maximum absolute error of the diameter is 0.035, and the maximum relative error is 0.51 %. The results demonstrate that the measurement precision for the crucial dimension of the work-pieces can meet the requirements of manufacturing.
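The measurement chain (circle fitting, conversion $D=k\times d$, and error computation) can be sketched in Python as follows. The circle fit shown is a generic algebraic least-squares fit, used here as an assumption because the text does not specify which fitting method was applied. All function names are illustrative, and the edge points in the usage example are synthetic.

```python
import math


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (center_x, center_y, radius)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    suu = suv = svv = suz = svz = sz = 0.0
    for x, y in points:
        # Centre the coordinates to keep the normal equations well conditioned.
        u, v = x - mean_x, y - mean_y
        z = u * u + v * v
        suu += u * u
        suv += u * v
        svv += v * v
        suz += u * z
        svz += v * z
        sz += z
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few to define a circle")
    # Circle model in centred coordinates: u^2 + v^2 + a*u + b*v + c = 0
    a = (-suz * svv + svz * suv) / det
    b = (-svz * suu + suz * suv) / det
    c = -sz / n
    radius = math.sqrt(a * a / 4.0 + b * b / 4.0 - c)
    return mean_x - a / 2.0, mean_y - b / 2.0, radius


def actual_diameter(pixel_diameter, k):
    """Convert a pixel diameter d to a physical diameter D = k * d."""
    return k * pixel_diameter


def measurement_errors(measured, reference):
    """Return (absolute error, relative error) against a reference value."""
    absolute = abs(measured - reference)
    return absolute, absolute / reference


# Synthetic edge points on a circle with centre (50, 40) and radius 20 pixels.
angles = [0.0, 0.5, 1.3, 2.0, 3.1, 4.0, 5.2]
edge_points = [(50 + 20 * math.cos(t), 40 + 20 * math.sin(t)) for t in angles]
cx, cy, r = fit_circle(edge_points)
D = actual_diameter(2 * r, k=0.008977)  # k taken from (8)
```

The fit uses all detected edge points at once, so noise on individual edge pixels is averaged out before the diameter is scaled by the pixel equivalent.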

6 Discussions

A novel precision work-piece detection and measurement method combining top-down and bottom-up saliency is presented in this paper. Based on this algorithm, a real-time automatic detection and measurement instrument is designed. Template creation by top-down feature extraction ensures the accuracy of detection, and reduction of the search region by bottom-up saliency detection improves the efficiency of the algorithm. Practical and comparative experiments are conducted, and the results illustrate that the proposed work-piece detection method achieves high efficiency and high matching quality. In addition, calibration of the vision system with a telecentric lens is discussed. The crucial dimensions of the work-pieces are measured with a maximum relative error of 0.51 %, so the measurement precision can meet the requirements of manufacturing. Furthermore, the proposed detection algorithm DM-TBS can serve as a reference for similar object detection tasks, such as pill detection and sugar detection.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61379097, 91748131, 61771471, U1613213 and 61627808), the National Key Research and Development Plan of China (No. 2017YFB1300202), and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (CAS) (No. 2015112).

References