In order to test a software system completely, we should make the functional detection for all kinds of system element combinations. If a software system under test (SUT) has
![]() |
Table Ⅰ A SUT WITH THREE TEST PARAMETERS |
Fortunately, some researches show about 70
In this study, we propose a novel test suite generation and global optimization approach. In the first step, the test suite generation problem has been transformed into a binary sequence optimization problem by a binary encoding mechanism. Based on it, a novel evolutionary algorithm, cluster searching algorithm (CSA), is presented to optimize the binary sequence in solution space so as to construct the optimal test suite. In Section 2, we analyze the related research works. Section 3 introduces the problem transformation. Section 4 shows the CSA. Section 5 provides the simulations. Section 6 gives the conclusion.
2 Related WorkGenerally, a test case subset, which can satisfy all covering requirements, is known as a representative set. Assuming that the cost of generating and managing each test case is the same, a representative set with a minimum number of test cases is desirable and is called the optimal testing suite. The optimal combinatorial test suite generation problem is defined as, given a SUT and a set of combinatorial covering criterion, find a minimal representative set from the complete test case set. For example, Table Ⅱ is an optimal pairwise combinations representative set, which can cover all combinations between any two system elements of the SUT in Table Ⅰ. In general, this problem can be expressed as a set-covering problem [4], and it is a NP-complete problem also [5].
![]() |
Table Ⅱ AN OPTIMAL TEST SUITE COVERING ALL PAIRWISE COMBINATIONS |
In traditional studies, people make use of some mathematical methods to construct the representative set, such as orthogonal array [6]. But, there are some unsolved problems existing in the orthogonal array generation method, such as we cannot generate a necessary orthogonal array for any kind of SUT. Another mathematical method is based on the matrix recursive construction process [7]. For some special instances, this method can give a wonderful result. But, it cannot be applied to all kinds of SUT either. Moreover, the theoretical researches of combinatorial testing are still unable to give an explicit optimal covering number for most problems. The major conclusions of such studies just can give some logical relations for the optimal covering number between different problems [8].
In recent years, the evolutionary algorithm (EA) has developed into a powerful global optimization algorithm to solve many complex engineering optimization problems [9]
Recently, efforts have been focused on the use of meta-heuristic algorithms as part of the computational approach for test suite generation. Generally, meta-heuristic algorithms start with a random set of test cases, such as using a simple one-test-at-a-time greedy procedure to construct it [16]. Then, the initial test case set undergoes a series of transformations in an attempt to improve itself. Each of them could be an independent meta-heuristic algorithm, such as teaching learning based optimization, global neighborhood algorithm, particle swarm optimization, and cuckoo search algorithm [17]. In this process, one best candidate is selected at each iteration until all the required interactions are covered. Such researches show a good performance for large-size constrained CA problems as well. But, these algorithms always contain several searching algorithms and try to balance them in the iteration computation process. So, these approaches often have a relatively heavy and complicated structure and are more suitable to solve large-size CA problems.
By using one-test-at-a-time mechanism or meta-heuristic algorithms, it is possible to perform large combinatorial testing that was not possible before. But, for small-size CA problem, such approaches still imply a high cost and often get an approximate result. In the practical applications, a portion of software testing works belongs to the small-size problem. So, finding a simple method to improve the probability of generating the optimal test suite for small-size combinatorial testing is a worthy goal. In order to optimize the test suite more effectively, some researchers try to translate small-size CA problem into another kind of problem to solve it, such as satisfiability problem (SAT) [18] and integer program problem [19] etc. However, these researches also meet some difficult challenges. For example, even translating a small-size SUT into a SAT problem, we will get a very large clause set. Besides, the normal planning algorithm cannot deal with a big optimization problem efficiently. Above researches enlighten us that the combination test suite generation problem should be translated into a simple and concise data structure for global optimization, meanwhile, it is necessary to equip an effective global optimization algorithm to improve the solution quality.
3 Proposed MethodFig. 1 is the flowchart of combinatorial test suite global optimization mechanism. As it shows, a one-to-one correspondence is created between a test case in its complete set and a gene of a binary code string. Based on this, we can create a mapping relation between a test case subset of complete set and a vertex in the binary code space. This means we can translate a combination test suite generation problem into a binary code based global optimization problem to solve. The binary string is a simpler and more compact data structure. Moreover, it is more suitable for global optimization computation. In this section, we will introduce the encoding and decoding procedure firstly.
![]() |
Figure 1 Combinatorial test data global optimization mechanism. |
Generally, we can use a covering array (CA) or a mixed level covering array (MCA) to describe a SUT [4]. The difference between CA and MCA is that each test parameter of CA has the same value range, but in MCA it can be different. Actually the CA can be looked upon as a special case of MCA, and the processing methods for them have no difference. To facilitate the computation process, we make this study for CA problem only. The following is the definition of covering array given by Cohen [20].
Definition 1 (Covering array): Let
By this definition, we can get that the number of test cases in the complete set of
Definition 2: Let
By this definition, the complete test case set
![]() |
Table Ⅲ AN ORDERLY COMPLETE TEST CASES SET |
Definition 3: The binary code sequence
As Fig. 1 shows, the binary code space is an
Definition 4: Let
Based on above definitions we can get 2 characteristic theorems of set
Theorem 1: Let
Proof: By Definition 3, we can get there are
Theorem 2: Make test case
Proof: Firstly, we can prove both
The bijection relationships between set
The decoding mechanism is used to parse the binary code sequence so as to construct the corresponding combinatorial test case subset. To facilitate this process, we use a positive integer, whose value is set from 0 to
Definition 5: For a covering array
Let
$ \begin{align} j=x_{k-1}\ast v^{k-1}+ \cdots + x_1\ast v^1+x_0\ast v^0 \end{align} $ | (1) |
For example, to the CA in Table Ⅰ, if the gene value of
Algorithm 1 Decoding |
Input: positive integer |
Output:test suite TS |
1: for |
2: if |
3: |
4: for |
5: |
6: |
7: end for |
8: put |
9: end if |
10:end for |
In above procedure, TS =
In the proposed approach, the CA problem can be regarded as a sort of optimization problem. So, we can formulate
$ \begin{align} \max f(\lambda_i), \quad \lambda_i\in\Lambda. \end{align} $ | (2) |
By Theorem 2, we know
Definition 6: the fitness of
$ \begin{align} f(\lambda_i)=f(\gamma_i)=f_i. \end{align} $ | (3) |
Since we expect to use as few test cases as possible to cover whole combinations with strength-
$ \begin{align} f_i =10\ast\frac{\omega}{|CS|}+1-\frac{|TS|}{|\Phi|} \end{align} $ | (4) |
where
Besides, for handing constraints, a specific calculation mechanism is used to filter out all invalid genes from
Definition 7: The constraint set
$ \begin{align} I_{co} = I_1\wedge I_2\wedge \cdots \end{align} $ | (5) |
where
Using
$ \begin{align} \lambda_i =\lambda_i \wedge I_{co}. \end{align} $ | (6) |
After this calculation process, the fitness of
Based on fitness function, the value of
Theorem 3: The
This theorem gives a sufficient condition for the problem transformation. Above all, we can get a conclusion that a combination test suite generation problem can be translated into a binary code based global optimization problem to solve.
4 Cluster Searching AlgorithmAs we know, the solution of EA is much likely to be affected by the phenomena of premature convergence and searching stagnation. The adaptive mechanism is the most commonly used self-adjusting method in EA. Nevertheless, in a practical application, it is difficult to make a reliable and accurate self-adjusting strategy for EA. So, the adaptive mechanism just can make a limited impact on the performance improvement of EA. Recently, a new research trend is aimed at creating a hybrid algorithm model by merging various searching mechanisms and operators in order to improve the performance of optimization system [21], [22]. Meanwhile, a new problem has emerged. That is how to balance the diversity and the complexity in a hybrid evolution system and harmonize the specificity and the coordination among multiple operators. We find that an organization with sort of cluster form structure in a complicated and huge evolutionary group has shown a particularly important role in the occurrence and development process of such group. Inspired by this, we propose to create a certain cluster organization in the population, and use it to control and adjust the searching process of population. Fig. 2 is the model of cluster searching algorithm (CSA).
![]() |
Figure 2 Model of cluster searching algorithm. |
In CSA, the cluster
Definition 8: A population Pop of CSA consists of an
Definition 9: The cluster
$ C=\{C_1, C_2, \ldots \}, \ \ C_i=\{A_i^1, A_i^2, \ldots \} \notag\\ \qquad\qquad\qquad\qquad\quad A_i^j=k, \ \ i, j, k=1, 2, \ldots $ | (7) |
$ \emptyset=\{C_i\cap C_j|i\neq j\}, \ \ \emptyset=\{A_i^r\cap A_j^s|i\neq j\}\notag\\ Pop=\cup C_i, \ \ i, j, r, s=1, 2, \ldots. $ | (8) |
Definition 10: In cluster
$ \begin{align} & Distance(A_i^r, A_j^q) > Distance(A_i^r, A_i^s), \notag\\ & \qquad\qquad\qquad\qquad\qquad i\neq j, \ \ i, j, r, s, q=1, 2, \ldots. \end{align} $ | (9) |
Definition 11: The core member
$ \begin{align} f_{A_i^r}\leq f_{A_i^c}\quad \forall\, A_i^r \in C_i, \ \, i, r=1, 2, \ldots. \end{align} $ | (10) |
In the iteration process of CSA, the cluster, created in
1.
2. While (the termination criteria are not reached) do
3.
4.
5.
6.
7. End while.
In the first step, CSA initializes the population
In the searching among group process, we choose an individual from different groups and use multi-point crossover operation
Algorithm 2 Searching among group |
Input: population |
Output: |
1: Randomly select two different groups |
2: Perform |
3: The number of offspring individuals |
After the searching among group process, 2
Based on the bound researches of optimal test case number, such as symbol-fusing and the lower and upper bound [23], [8], we can see that the valuable genes, whose value is 1, just have occupied a small proportion in binary code. In order to emphasize the importance of these valuable genes in clustering process, we propose to use positive attribute distance (PAD) [24] instead of traditional hamming distance to calculate the individual distance.
The PAD of two binary sequences is as follows
$ \begin{align} PAD(I_i, I_j) = 0\leq \frac{2\Psi_{ij}}{\Psi_i+\Psi_j} \end{align} $ | (12) |
where
Furthermore, we use the average PAD of population as the indicator of population diversity
$ \begin{align} \gamma=\frac{2}{|n|(|n|-1)}\sum\limits_{i=2}^{|n|}\sum\limits_{j=1}^{i-1}PAD(I_i, I_j). \end{align} $ | (13) |
Algorithm 3 is the main steps of a hierarchical clustering process. In this process, we set two matrixes, IDM and GDM, to store the individual distance
$ idm_{ij}=\begin{cases} 0,&{\rm if} ~i\leq j, \ \, i, j=1, 2, \ldots \\ PAD(I_i, I_j),&{\rm else} \end{cases} $ | (14) |
$ gdm_{ij}= \begin{cases} 0,&{\rm if} \ i\leq j, \\ &\quad i, j, r, s=1, 2, \ldots \\ \frac{\sum\limits_{A_i^r\in C_i} \sum\limits_{A_j^s\in C_j}PAD(A_i^r, A_j^s)} {|C_i||C_j|},&{\rm else}. \end{cases} $ | (15) |
Meanwhile, the current population
Algorithm 3 Searching among group |
Input: population |
Output: new cluster |
1: Set each |
2: Find the minimum group distance |
3: Calculate new average group distance |
where |
4: Restore the group |
The searching inside group process consists of two operations, SIG1 and SIG2, which are used to reduce the valuable genes in binary string while maintaining it can satisfy all covering requirements. In this process, a parameter
$ \begin{align} m = {\rm round}\left(\frac{m_2-m_1}{G-1}g+\frac{m_1G-m_2}{G-1}\right) \end{align} $ | (17) |
where
Algorithm 4 Searching inside group |
Input: |
Output: |
1: Run this operation for |
2: |
2.1: SIG1: select two members |
After that, put the generating individuals |
2.2: SIG2: Randomly select three members |
If |
2.2.1: Randomly select a valuable gene in |
2.2.2: Randomly select a gene in |
2.2.3: Make the generating individuals |
3: If |
Fig. 3 shows the practical calculation process of searching inside group. In Fig. 3(a), two binary strings,
![]() |
Figure 3 Calculation cases of searching inside group process. |
After the cluster searching process, the scale of current population increases to
Algorithm 5 Cluster selection |
Input: |
Output: |
1: For each |
2: Take member |
3: If |
In this section, we implemented 3 algorithms, a real code genetic algorithm with one-test-at-a-time mechanism GA/OTAT, a binary code GA with global optimization approach GA/BCGO, and CSA, in C++. Meanwhile, 32 CA problems are chosen to test above 3 algorithms and TCA [16], whose source code is implemented in C++, and available online1, on a 2.1 GHz AMD Phenom PC with 2 GB memory.
1https://github.com/leiatpku/TCA
5.1 Experimental SettingsAn individual
1. Initialize test suite TS
2. Initialize combination set CS;
3. While(CS
4. Initialize population of GA/OTAT randomly;
5. While(the termination criteria are not reached)do
6. search a best test case
7. End while;
8. Make
9. Delete the covered combinations in CS;
10. End while;
11. Output TS.
Both GA/BCGO and CSA use (4) to verify the quality of test suite. Besides, based on our experience, we set population size
1. Encode the binary code space;
2. Initialize population of CSA or GA/BCGO;
3. While(the termination criteria are not reached)do
4. search the best genetic string
5. End while;
6. Decode
In the beginning of TCA, the initialization step is called to construct a CA set to cover all valid
We ran GA/OTAT, GA/BCGO, CSA and TCA for 32 CA problems over 30 independent trials. The experiment results of above 4 algorithms are shown in Table Ⅳ and Table Ⅴ with the experiment results of HHH [17] and integer program method [19] for part of testing problems.
![]() |
Table Ⅳ EXPERIMENT STATISTICAL RESULTS OF 3 ALGORITHMS FOR 16 CA PROBLEMS WITH STRENGTH-2 |
![]() |
Table Ⅴ EXPERIMENT STATISTICAL RESULTS OF TEST CASE NUMBER FOR 16 CA PROBLEMS WITH STRENGTH-3 |
In first round of experiments, 16 CA problems with strength-2 have been tested. In the first 8 tests, we set
2http://www.public.asu.edu/~ccolbou/src/tabby/catable.html
In Table Ⅳ, we can see the CSA has gotten 11 best solutions of 16 CA problems and GA/BCGO has gotten 6 best solutions, but GA/OTAT just gets an approximate solution for each 16 problems. In Table Ⅴ, the experiment results show a similar formation as Table Ⅳ. With a holistic view, GA/BCGO can get the optimal solution with a high probability when the problems scale
Comparing the experiment results between GA/OTAT, TCA and HHH, we can see the meta-heuristic algorithms improve the average quality of
On the other hand, comparing the experiment data between CSA and GA/BCGO in Table Ⅳ and Table Ⅴ, we also find the best solution of GA/BCGO is very close to CSA when the
Above all, we believe the proposed global mechanism is a more efficient calculation approach for small-size SUT with multiple covering strength.
5.3 Parameter AnalysisIn this section, we will discuss how to adjust and control the coordination process between the searching among group process and the searching inside group process.
In CSA, the parameter
Fig. 4 is the statistical data of this experiment after 30 independent trials. The boxplot shows the dispersion of 30 experiment results of
![]() |
Figure 4
Experiment statistical data for |
The parameter
Fig. 5 is the statistical data of this experiment after 30 independent trials. Obviously, this parameter has a great influence on the gene sequence reducing process. If
![]() |
Figure 5
Experiment statistical data for |
In this paper, we propose a cluster searching driven combinatorial test data global optimization and generation method. A program based on the proposed method that can be executed on compatible PC has been implemented. The experimental results show the proposed method can get a good performance for small-size CA problems. Within a reasonable time, the optimal test suite can be obtained with higher probability when the scale of complete test case set is less than 2000. But, the quality of its solution declines clearly when the count of test case is more than 4000. So, the proposed method is not ideal for solving the large-size CA problems. Furthermore, we have discussed 2 main control parameters of CSA and given a feasible adjusting approach for them. In future work, we will try to make this approach applicable to MCA problems.
1 |
D. R. Kuhn and M. J. Reilly, "An investigation of the applicability of design of experiments to software testing, "in Proc. 27th Annual NASA Goddard/IEEE Software Engineering Workshop, Greenbelt, MD, USA, 2002, pp. 91-95. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=1199454
|
2 |
D. R. Kuhn, D. R. Wallace, and A. M. Gallo, "Software fault interactions and implications for software testing, "IEEE Trans. Softw. Eng. , vol. 30, no. 6, pp. 418-421, Jun. 2004. http://dl.acm.org/citation.cfm?id=998624
|
3 |
J. F. Wang, C. A. Wei, and Y. L. Sheng, "Locating errors in combinatorial testing using set of possible faulty interactions, "Acta Electron. Sinica, vol. 42, no. 6, pp. 1173-1178, Jun. 2014. http://en.cnki.com.cn/Article_en/CJFDTotal-DZXU201406021.htm
|
4 |
J. Yan and J. Zhang, "Combinatorial testing: Principles and methods, "Chin. J. Softw. , vol. 20, no. 6, pp. 1393-1405, Jun. 2009.
|
5 |
G. Seroussi and N. H. Bshouty, "Vector sets for exhaustive testing of logic circuits, "IEEE Trans. Inform. Theory, vol. 34, no. 3, pp. 513-522, May1988. http://www.ams.org/mathscinet-getitem?mr=959633
|
6 |
J. Yan and J. Zhang, "A backtracking search tool for constructing combinatorial test suites, "J. Syst. Softw. , vol. 81, no. 10, pp. 1681-1693, Oct. 2008. http://dl.acm.org/citation.cfm?id=1401379
|
7 |
C. J. Colbourn, S. S. Martirosyan, G. L. Mullen, D. Shasha, G. B. Sherwood, and J. L. Yucas, "Products of mixed covering arrays of strength two, "J. Combinat. Des. , vol. 14, no. 2, pp. 124-138, Mar. 2006. http://onlinelibrary.wiley.com/doi/10.1002/jcd.20065/pdf
|
8 |
M. Sosina and T. van Tran, "On t-covering arrays, "Des. Codes Cryptogr. , vol. 32, no. 1-3, pp. 323-339, May2004. http://dl.acm.org/citation.cfm?id=993165
|
9 |
F. Kang, S. X. Han, S. Rodrigo, and J. J. Li, "System probabilistic stability analysis of soil slopes using Gaussian process regression with Latin hypercube sampling, "Comput. Geotechn. , vol. 63, pp. 13-25, Jan. 2015. http://www.sciencedirect.com/science/article/pii/S0266352X1400161X
|
10 |
F. Kang, J. S. Li, and J. J. Li, "System reliability analysis of slopes using least squares support vector machines with particle swarm optimization, "Neurocomputing, vol. 209, pp. 46-56, Oct. 2016. http://www.sciencedirect.com/science/article/pii/S0925231216305859
|
11 |
F. Kang, Q. Xu, and J. J. Li, "Slope reliability analysis using surrogate models via new support vector machines with swarm intelligence, "Appl. Math. Model. , vol. 40, no. 11-12, pp. 6105-6120, Jun. 2016. http://www.sciencedirect.com/science/article/pii/S0307904X16300464
|
12 |
F. Kang and J. J. Li, "Artificial bee colony algorithm optimized support vector regression for system reliability analysis of slopes, "J. Comput. Civil Eng., 2016, 30(3): 04015040. DOI:10.1061/(ASCE)CP.1943-5487.0000514 |
13 |
R. J. Zha, D. P. Zhang, C. H. Nie, and B. W. Xu, "Test data generation algorithms of combinatorial testing and comparison based on cross-entropy and particle swarm optimization method, "Chin. J. Comput. , vol. 33, no. 10, pp. 1896-1908, Oct. 2010. http://en.cnki.com.cn/Article_en/CJFDTotal-JSJX201010013.htm
|
14 |
Y. L. Liang and C. H. Nie, "The optimization of configurable genetic algorithm for covering arrays generation, "Chin. J. Comput. , vol. 35, no. 7, pp. 1522-1538, Jul. 2012. http://en.cnki.com.cn/Article_en/CJFDTotal-JSJX201207016.htm
|
15 |
C. H. Nie and J. Jiang, "Optimization of configurable greedy algorithm for covering arrays generation, "Chin. J. Softw. , vol. 24, no. 7, pp. 1469-1483, Jul. 2013. http://en.cnki.com.cn/Article_en/CJFDTotal-RJXB201307006.htm
|
16 |
J. K. Lin, C. Luo, S. W. Cai, K. L. Su, D. Hao, and L. Zhang, "TCA: An efficient two-mode meta-heuristic algorithm for combinatorial test generation (T), "in Proc. 30th IEEE/ACM Int. Conf. Automated Software Engineering (ASE), Lincoln, NE, USA, 2015, pp. 494-505. http://dl.acm.org/citation.cfm?id=2916166
|
17 |
K. Z. Zamli, B. Y. Alkazemi, and G. Kendall, "A tabu search hyper-heuristic strategy for t-way test suite generation, "Appl. Soft Comput. , vol. 44, pp. 57-74, Jul. 2016. http://dl.acm.org/citation.cfm?id=2936563
|
18 |
J. Yan and J. Zhang, "Backtracking algorithms and search heuristics to generate test suites for combinatorial testing, "in Proc. 30th IEEE Annual Int. Computer Software and Applications Conf. , Chicago, IL, USA, 2006, pp. 385-394. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=4020100
|
19 |
A. W. Williams and R. L. Probert, "Formulation of the interaction test coverage problem as an integer program, "in Testing of Communicating Systems XIV, US: Springer, 2002, pp. 283-298. http://link.springer.com/10.1007/978-0-387-35497-2_21
|
20 |
M. B. Cohen, P. B. Gibbons, W. B. Mugridge, and C. J. Colbourn, "Constructing test suites for interaction testing, "in Proc. 25th Int. Conf. Software Engineering, Portland, OR, USA, 2003, pp. 38-48. http://dl.acm.org/citation.cfm?id=776822&CFID=431488427&CFTOKEN=46085464
|
21 |
L. Wang and L. P. Li, "A coevolutionary differential evolution with harmony search for reliability redundancy optimization, "Expert Syst. Appl. , vol. 39, no. 5, pp. 5271-5278, Apr. 2012. http://www.sciencedirect.com/science/article/pii/S0957417411015429
|
22 |
I. Ciornei and E. Kyriakides, "Hybrid ant colony-genetic algorithm (GAAPI) for global continuous optimization, "IEEE Trans. Syst. Man Cybern. -Part B, vol. 42, no. 1, pp. 234-245, Feb. 2012.
|
23 |
M. Chateauneuf and D. L. Kreher, "On the state of Strength-Three Covering Arrays, "J. Combin. Des., 2002, 10(4): 217-238. DOI:10.1002/(ISSN)1520-6610 |
24 |
R. Gelbard, O. Goldman, and I. Spiegler, "Investigating diversity of clustering methods: An empirical comparison, "Data Knowl. Eng. , vol. 63, no. 1, pp. 155-166, Oct. 2007. http://www.sciencedirect.com/science/article/pii/S0169023X07000031
|