A New Algorithm for Post-Processing Covering Arrays

Software testing is a critical component of modern software development. For this reason, it has been one of the most active research topics for several years, resulting in many different algorithms, methodologies and tools. Combinatorial testing is one of the most important testing strategies. The test generation problem for combinatorial testing can be modeled as constructing a matrix with certain properties; typically this matrix is a covering array. The construction of covering arrays with the fewest rows remains a challenging problem. This paper proposes a post-processing technique that repeatedly adjusts the covering array in an attempt to reduce its number of rows. In the experiment, 85 covering arrays, created by a state-of-the-art algorithm, were subjected to the reduction process. The results show a reduction in the size of 28 covering arrays (∼33%).
Keywords: Software testing; Combinatorial testing; Covering arrays; Post-Processing


I. INTRODUCTION
The ever-increasing complexity, ubiquity, and dynamism of modern software systems demand new approaches to quality assurance. Extensive testing is required to ensure that software works correctly; however, in many practical applications the number of configurable parameters is large, and testing all possible configurations is infeasible with limited testing resources. Combinatorial testing enables the tester to execute a small set of test cases on the system while achieving very high fault coverage. Pairwise testing is one of the main approaches in black-box testing, and several studies have demonstrated its effectiveness [1], [2]. An examination of fault reports for several systems [3] showed that ∼100% of faults can be discovered with 4-wise to 6-wise interactions.
The first step in applying combinatorial testing is to construct a parametrized model of the System Under Test (SUT). The tester should first identify the input parameters related to the test goal, i.e., the parameters affecting the system behavior; they may include, but are not limited to, the following: (a) parameters of method calls; (b) parameters in system settings; and (c) a selection of replaceable system components installed in a test environment, such as hardware devices, system libraries and applications [4].
The key idea of combinatorial testing is that most SUT faults can be detected by testing combinations of a small number of factors. In combinatorial testing, a covering array (CA) is usually used as the test suite, which covers the parameter combinations involving t factors.
Covering arrays (CAs) are one of the most popular methods for representing pseudo-exhaustive test suites: they are small in comparison with an exhaustive approach but guarantee a level of interaction coverage among the parameters involved. They aim at minimum cardinality (i.e., minimizing the number of test cases) and maximum coverage (i.e., they guarantee to cover all combinations of a certain size among the input parameters). Several methods have been proposed to address this problem (e.g., algebraic, exact, greedy and metaheuristic); however, they usually produce quasi-optimal covering arrays that contain combinations of symbols that are covered more than once (redundant). Redundancy opens the possibility of designing post-processing algorithms that eliminate the redundant information in existing covering arrays with the aim of improving them. This paper presents a new algorithm called Post-Processing Covering Arrays (PPCA) for eliminating redundant tests; it receives a covering array as input and tries to reduce its number of tests (rows).
The remainder of this paper is organized in four more sections. Section II presents a brief overview of the principal techniques and tools for constructing covering arrays. Section III presents the new algorithm for post-processing covering arrays by deleting unnecessary tests. Section IV shows the complete results of post-processing a benchmark composed of 85 covering arrays. Final remarks are presented in Section V.

II. RELATED WORK
There are several methods for constructing covering arrays; according to the generation strategy, they can be classified into algebraic, exact, greedy and metaheuristic approaches. Additionally, there are some useful operations that can be applied to a previously constructed covering array.
Algebraic approaches use formulas or operations with mathematical objects such as cyclic vectors [5], permutation vectors (Zero-sum method [6]), groups [7], cover starters [8] or covering arrays with small values of t, k, v (doubling [9] and v-plication [10]). Algebraic constructions often provide a better bound in less computational time, but impose serious restrictions on the system configurations to which they can be applied. For example, many approaches for constructing covering arrays require that the domain size be a prime number or a power of a prime number; this significantly limits the applicability of algebraic approaches for testing.
Greedy approaches are more flexible than algebraic constructions. These methods can generate any covering array using t, k, and v as input. The majority of commercial and open source test data generation tools use greedy approaches for covering array construction (TVG [11], ACTS [12], Jenny [13] and T tuples tool [14]). The drawback of these approaches is the quality of the results: greedy methods rarely obtain optimal covering arrays.
Exact approaches are exhaustive methods for constructing optimal covering arrays. Although some approaches include techniques for accelerating the search, in general they require exponential time to complete the task, making them practical only for small covering arrays. Some examples of this type of construction are reported in [15], [16], [17].
Metaheuristic approaches do not guarantee the construction of the optimal covering array, but in practice they give good results in a reasonable amount of time. Among the most widely used metaheuristics are simulated annealing [18], tabu search [19], [20] and genetic algorithms [21].

III. METHODOLOGY
This section presents an algorithm for post-processing covering arrays; it starts with some basic definitions that introduce the problem, and then the proposed algorithm is described.

A. Definitions and Preliminaries
Definition 1: Let N, t, k, and v be positive integers where t ≤ k. A covering array CA(N; t, k, v) is a matrix of size N × k and strength t in which each column has entries from an alphabet Σ of size v, and in every N × t subarray all $v^t$ possible t-tuples of symbols occur at least once. Here N is the number of rows, t is the strength of the coverage of interactions, k is the number of factors (also called the degree), and v is the number of symbols for each factor (also called the order).
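Definition 1 can be checked mechanically. The following is a minimal Python sketch, not the authors' code: the function name `is_covering_array`, the encoding of symbols as the integers 0..v-1, and the small CA(4; 2, 3, 2) used as input are all assumptions made for illustration.

```python
from itertools import combinations, product

def is_covering_array(rows, t, v):
    """Return True if every N x t subarray contains all v**t t-tuples."""
    k = len(rows[0])
    required = set(product(range(v), repeat=t))  # all v**t possible t-tuples
    for cols in combinations(range(k), t):       # every choice of t columns
        seen = {tuple(row[c] for c in cols) for row in rows}
        if not required <= seen:                 # some t-tuple is missing
            return False
    return True

# Hypothetical binary array with N = 4, k = 3, t = 2 (illustration only).
ca = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(is_covering_array(ca, t=2, v=2))
print(is_covering_array(ca[:3], t=2, v=2))
```

The check enumerates all column subsets of size t, so its cost grows with the binomial coefficient of k and t; it is meant for small instances.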
Definition 2: A t-way interaction is the assignment of specific values to each factor in a set of t factors. The array is 'covering' in the sense that every t-way interaction is represented by at least one experimental run. In any covering array, the number of N × t subarrays is $M = \binom{k}{t}$, and the number of t-way interactions to be covered is $\binom{k}{t} v^t$.
Definition 3: The covering array number CAN(t, k, v) is the smallest N for which a CA(N; t, k, v) exists. The CAN is defined according to
$$\mathrm{CAN}(t, k, v) = \min \{ N : \exists\, \mathrm{CA}(N; t, k, v) \}.$$
The fundamental problem is to determine CAN(t, k, v).
When a covering array is used as a test suite:
• Each column represents a parameter of the software under test (SUT).
• The symbols in the column specify the values for such parameter.
• Each row represents a test case to be performed.
When a covering array is constructed (see Section II), it can contain t-way interactions that are covered more than once (in the definition of a covering array, "at least once" means that a combination of symbols can be covered more than once). This opens the possibility that some symbols in certain positions can be changed to any value without affecting the coverage of the CA; these symbols are referred to as redundant. To illustrate the existence of redundant rows, consider the example provided in Fig. 1. If the last row is deleted from the CA(10; 2, 4, 3) shown in Fig. 1(a), the matrix shown in Fig. 1(b) is obtained, which is still a covering array because all 2-combinations of symbols are present. Hence, the last row is redundant and can be deleted from the original matrix; the resulting CA(9; 2, 4, 3) is smaller than the original and therefore preferable.
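As a worked check of the counts in Definition 2 for the example CA(10; 2, 4, 3), the following sketch uses only the formulas given above; the symbol $\sigma(N)$ for the number of interaction slots offered by N rows is introduced here purely for illustration.

```latex
\begin{align*}
M &= \binom{4}{2} = 6, \\
\binom{k}{t}\, v^{t} &= 6 \cdot 3^{2} = 54, \\
\sigma(N) &= N \cdot M
  \quad\Rightarrow\quad \sigma(10) = 60, \qquad \sigma(9) = 54 .
\end{align*}
```

Each row realizes exactly one t-way interaction per column subset, so a 10-row array offers 60 slots for 54 required interactions, and at least 6 slots must repeat an interaction that is already covered. With 9 rows the number of slots equals the number of required interactions, so in a CA(9; 2, 4, 3) every interaction is covered exactly once and no further row can be deleted.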

B. Proposed approach
Let R be the set of possible realizations (t-tuples over Σ), and let $I = (I_j)_{j=1}^{M}$ be the vector of interactions (t-tuples of columns). The i-th row test $r_i$ can be represented by a vector $S_i$ of the form
$$S_i = (s_{i1}, s_{i2}, \ldots, s_{iM}),$$
where the t-way interaction $s_{ij} = (I_j, v_{ij})$ associates the interaction $I_j$ with its realization $v_{ij} \in R$ in the i-th test.
The elements of $S_i$ can be used to build an index M that maps t-way interactions to the lists of row tests that cover each interaction. That is,
$$M = \{\, e_o \mapsto L(e_o) \,\},$$
where $e_o \in I \times R$, and $L(e_o)$ is the list of rows that test $e_o$.
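The index can be sketched in Python as a dictionary from interactions to row lists. This is an illustrative sketch under stated assumptions: the name `build_index`, the encoding of an interaction as a `(columns, values)` pair, and the five-row binary input are hypothetical and not taken from the paper.

```python
from itertools import combinations

def build_index(rows, t):
    """Map each t-way interaction (columns, values) to the rows that cover it."""
    index = {}
    k = len(rows[0])
    for i, row in enumerate(rows):
        for cols in combinations(range(k), t):
            key = (cols, tuple(row[c] for c in cols))  # one element of I x R
            index.setdefault(key, []).append(i)        # the list L(e) for key e
    return index

# Hypothetical array: the first four rows already cover every pair,
# the fifth row only repeats interactions.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1)]
index = build_index(rows, t=2)
print(len(index))                # number of distinct interactions covered
print(index[((0, 1), (0, 0))])   # rows that cover c0 = 0, c1 = 0
```

A dictionary keeps lookups and removals of interactions cheap, which is what the iterative reduction relies on.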
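To obtain the reduced covering array, the lists of rows in M are iteratively modified by removing elements. Given a map M, the vector of cardinalities $S^{\#}_i$ of row test $r_i$ is defined as
$$S^{\#}_i = (c_{i1}, c_{i2}, \ldots, c_{iM}),$$
where
$$c_{ij} = \begin{cases} \#L(s_{ij}) & \text{if } s_{ij} \in \mathrm{keys}(M), \\ N + 1 & \text{otherwise,} \end{cases}$$
and $\#L(s_{ij})$ is the size of the list $L(s_{ij})$.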
To decide which rows are included in the reduced covering array, the vector $S^{\#}_i$ is sorted in ascending order:
$$S^{s}_i = \mathrm{sort}(S^{\#}_i).$$
This vector indicates how strongly a row test is required: when the first element of $S^{s}_i$ is one, the row is strictly required; if all elements are set to N + 1, the i-th row test is unnecessary. Hence, the order of the elements in $S^{s}_i$ is important. To obtain a reduced covering array, the vectors $S^{s}_i$ of all the rows are compared; the first row in the lexicographic order (i.e., the first unequal elements determine the order) is selected as the best row test in each iteration, and the selected row test is included in the reduced covering array.
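The sorted cardinality vector and the lexicographic selection can be sketched as follows. This is a hedged illustration: the helper names `interactions` and `sorted_cardinalities`, the `(columns, values)` key encoding, and the five-row input are assumptions, and Python's built-in list comparison is used as the lexicographic order.

```python
from itertools import combinations

def interactions(row, t):
    """Yield the t-way interactions (columns, values) realized by one row."""
    for cols in combinations(range(len(row)), t):
        yield cols, tuple(row[c] for c in cols)

def sorted_cardinalities(row, index, t, n_rows):
    """List sizes of the row's interactions, ascending; N + 1 if already removed."""
    counts = [len(index[e]) if e in index else n_rows + 1
              for e in interactions(row, t)]
    return sorted(counts)

# Hypothetical array; the last row only repeats covered interactions.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1)]
index = {}
for i, row in enumerate(rows):
    for e in interactions(row, 2):
        index.setdefault(e, []).append(i)

vectors = [sorted_cardinalities(r, index, 2, len(rows)) for r in rows]
# Python compares lists lexicographically, so min() picks the best row test.
best = min(range(len(rows)), key=lambda i: vectors[i])
print(vectors, best)
```

In this input, row 3 is the only row whose three interactions are each covered by a single row, so its vector comes first; row 4 has no entry equal to one and is therefore never strictly required.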

C. Algorithm
Procedure PPCA(C), shown in Algorithm 1, illustrates the proposed approach. The input C is a covering array of size N × k, and the algorithm produces a reduced covering array C'. The set L stores the row tests of C that are included in C'; initially L is empty. At line 3, the algorithm creates the map M by analyzing each row test as stated in (2). After that, the algorithm iterates the following steps while the map M has entries, i.e., keys(M) ≠ ∅: (a) select the index $i_m$ such that $S^{s}_{i_m}$ is the smallest according to the lexicographic order (step 5); (b) remove the entries of M that include $i_m$ (step 6); and (c) add $i_m$ to the set L (step 7). Finally, C' is obtained by selecting the rows L from C (step 9). Hence, the number of rows of the resulting covering array, N' ≤ N, is equal to the size of L.
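The procedure described above can be sketched end to end in Python. This is one possible reading of the description, not the authors' implementation: the function names, the dictionary-based map, the tie-breaking rule (the lowest row index wins) and the five-row input are assumptions made for illustration.

```python
from itertools import combinations

def interactions(row, t):
    """Yield the t-way interactions (columns, values) realized by one row."""
    for cols in combinations(range(len(row)), t):
        yield cols, tuple(row[c] for c in cols)

def ppca(rows, t):
    """Greedy row selection following the steps described for PPCA."""
    n = len(rows)
    index = {}                                  # the map: interaction -> rows
    for i, row in enumerate(rows):
        for e in interactions(row, t):
            index.setdefault(e, []).append(i)

    def sorted_vector(i):
        # Removed interactions count as N + 1, so covered rows sort last.
        return sorted(len(index[e]) if e in index else n + 1
                      for e in interactions(rows[i], t))

    selected = []                               # the set L, in selection order
    while index:                                # while keys of the map remain
        best = min(range(n), key=sorted_vector) # lexicographically first row
        for e in interactions(rows[best], t):   # drop interactions it covers
            index.pop(e, None)
        selected.append(best)
    return [rows[i] for i in selected], selected

# Hypothetical input: row 4 is redundant with respect to rows 0..3.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1)]
reduced, selected = ppca(rows, t=2)
print(selected, len(reduced))
```

Every iteration removes at least one interaction from the map, so the loop terminates after at most N selections. Recomputing all sorted vectors in each iteration keeps the sketch short; an implementation aimed at large arrays would update only the rows that share interactions with the selected one.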

D. Example
The toy example shown in Fig. 2 is used to clarify Algorithm 1. The matrix C of size 9 × 3 contains the test cases, some of which are redundant. To obtain the reduced covering array C', the PPCA algorithm proceeds as illustrated in Fig. 3 and described in the following.

INITIALIZATION: After creating the initial map M from C and obtaining $S^{s}_i$ for i = 0, ..., 8, row i = 2 is selected for the first iteration because it is the smallest according to the lexicographic order; i.e., $S^{s}_2 = [1, 2, 2]$ is selected because row 2 is strictly required for covering $(c_0 c_2, 01)$.
ITERATION 1: After inserting row 2 into the reduced covering array and updating the vectors $S^{s}_i$ for i ∈ {1, 6}, row i = 4 is selected for the next iteration.
ITERATIONS 2, 3: Rows 3 and 0 are included in C'. Note that if one of the rows 1, 6 or 8 were selected in iteration 3 (by a selection function other than the proposed one), the covering array C' would need one or more additional rows to complete the covering. By using (5), however, the optimal solution can be found because row 0 completes the covering array.
ITERATION 4: The algorithm finishes because keys(M) = ∅, and the resulting selection L = {2, 4, 3, 0} contains the row tests included in the reduced covering array.
It is easy to verify that all combinations for t = 2 are covered by the matrix C', which includes only rows {2, 4, 3, 0} of C.

IV. RESULTS AND DISCUSSION
This section presents the experimental design and the results obtained with the methodology described in the previous section. An experiment consisting of 85 covering arrays was designed; each covering array was built using IPOG, one of the most popular state-of-the-art tools for covering array construction.
The results of the experiment are shown in Table I. In this analysis, binary covering arrays are grouped by their number of columns and their strength. Every group of t contains the different values of the alphabet for each covering array. Every cell of the table shows the number of rows removed from the corresponding binary covering array. As seen in the last row, the size was reduced for 28 of the 85 covering arrays (∼33%).
Section II summarizes the techniques for constructing covering arrays, which can be grouped into algebraic, greedy, exact and metaheuristic techniques. The best known solutions for CAs with t = 2, 3, ..., 6 are publicly available [8]. Analyzing those results shows that metaheuristic techniques produce better bounds but are computationally expensive. For this reason, these techniques have concentrated on the construction of CAs with k < 100. Algebraic and greedy techniques are better suited for large covering arrays, i.e., v > 3, k > 100 and t > 3; therefore, the PPCA algorithm can be used for post-processing solutions constructed by these techniques.

V. CONCLUDING REMARKS AND FUTURE WORK
This paper presents a post-processing strategy, called PPCA, for reducing the size of a covering array. The post-processing reduces the number of rows of a covering array by iteratively including the best row (the row that is most important for guaranteeing coverage) in the reduced covering array. In some cases the reduced covering array could be optimized further, but here we are interested only in reducing the size of a previously constructed CA, not in building a new one.
A dataset of 85 covering arrays constructed by the state-of-the-art algorithm IPOG was used to test the PPCA algorithm. The results show a reduction in ∼33% of the instances. In conclusion, PPCA has proved effective for reducing a variety of covering arrays.
We are designing a parallel version of the PPCA algorithm, in order to address problems with high strength, many factors or rows.

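Algorithm 1: Post-processing covering array algorithm (PPCA).
Require: A covering array, C, of size N × k
Ensure: A reduced covering array, C', of size N' × k with N' ≤ N
1: procedure PPCA(C)
2:   L ← ∅
3:   M ← map of the t-way interactions of C to the row tests that cover them
4:   while keys(M) ≠ ∅ do
5:     i_m ← index i whose sorted vector S^s_i is first in the lexicographic order
6:     keys(M) ← keys(M) \ S_{i_m}
7:     L ← L ∪ {i_m}
8:   end while
9:   C' ← Select rows L from C
10:  return C'
11: end procedure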

Fig. 2: Left: A covering array with k = 3, t = 2, and Σ = {0, 1}. Right: an illustration of the t-way interactions of row tests used for generating the map M that relates realizations R to interactions I.

Fig. 3: PPCA for reducing the covering array instance shown in Fig. 2; the reduced covering array C' is obtained by selecting the rows {2, 4, 3, 0} from C.

TABLE I: Results of post-processing binary covering arrays, with 2 ≤ t ≤ 6 and k ≤ 50. The number in each entry is the value N − N' for the instance with values k, t.