Detection of Edges using Two-Way Nested Design

This paper implements a novel approach to identifying edges in images using a two-way nested design. The test comprises two steps. The first step is based on an F-test. The sums of squares (SS) of the various effects are used to extract the mean square (MS) of the respective effects and of the unknown effect considered as noise. Each mean square, suitably scaled, has a chi-square distribution. The ratio of two independent chi-square variables, each divided by its degrees of freedom, has an F-distribution. The final decision is based on testing a hypothesis for the presence or absence of an effect. The second step is based on a contrast function (CF). This test identifies the presence or absence of an edge in four directions: horizontal, vertical, and the two diagonal directions. The test is based on Tukey's T-test. The performance of the nested design is compared with edge detection using the Sobel filter. Rigorous testing reveals that the nested design yields comparable results for images that are either free of noise or corrupted with light noise. The nested design, however, outperforms the Sobel filter in situations where the images are corrupted with heavy noise.
Keywords: Analysis of variance (ANOVA); Edge detection; F-test; nested design; T-test


I. INTRODUCTION
The detection of edges in a digital image has several industrial, biological, medical, scientific, and other real-life applications. In a recent paper, wildlife tracking has been performed by detecting the edges of animals and then keeping their record in a database [1]. FPGAs have made it possible to implement advanced algorithms that were previously considered impractical due to their long processing time. Several fast real-time edge detection schemes have been demonstrated in [2]-[3]. An image segmentation algorithm using a genetic algorithm (GA) has been proposed in [4]. A wavelet transform based technique for SAR (synthetic aperture radar) images is given in [5]. Several other wavelet transform based solutions are given in [6]-[8]. Nonlinear techniques generally outperform linear filters for edge detection. A comparison of several nonlinear techniques, such as order statistics filters, hybrid filters, neural filters, and bilateral filters, is made in [9]. Various statistical approaches for edge detection are demonstrated in [10]-[12]. A Kalman-based edge detection scheme is demonstrated in [13]. A few advanced gradient-based edge detection techniques are the Marr-Hildreth and Canny edge detectors [14]. Edge detection using a cellular neural network (CNN) is given in [15]. A combination of ant colony optimization (ACO) and wavelet transform based edge detection is given in [16]. Linear vector quantization for edge detection has been demonstrated in [17]-[18].
Two distinct approaches are generally followed for edge detection in digital image processing. The first approach uses gradient analysis, and the second approach uses some kind of transform. Gradient analysis identifies an edge by a significant change in pixel value. Some of the earlier gradient operators are the Roberts, Prewitt, and Sobel filters [19]-[21]. The transform based approach uses the discrete cosine transform (DCT) or the wavelet transform [21]. A significant advantage of the gradient type approach is that its results are based on local pixel analysis. The wavelet transform considers local effects to some extent, but the fine details are still lost. Other transform techniques such as the DCT ignore the local details completely. All the above approaches fail in case the given image is corrupted with heavy noise.
The mathematical detail of analysis of variance (ANOVA) is available in standard textbooks of statistics [22]-[23]. The detection of edges by using a Graeco-Latin square (GLS) design involves a template of 5x5 pixels, such that Greek and Latin letters are assigned to each pixel. The presence or absence of an effect in four directions is tested statistically by testing a hypothesis for each of these letters [24]. The contrast function (CF) is also a well-tested statistical approach, where the mean of a set of pixels within the template is statistically compared with the remaining pixels. That approach used Tukey's T-test for testing the hypothesis that an edge is present at a particular location [25]. The classification of multispectral imaging data is given in [26]. The statistical analysis of moving object detection in data previously corrupted with noise is given in [27]. In this paper, we have used two distinct techniques: a two-way nested design (TND) and a contrast function (CF). Both approaches help in identifying edges in an image that has been corrupted with a significantly high degree of Gaussian noise. After this brief introduction, the next section discusses the two-way nested design. The mathematical details of analysis of variance (ANOVA) are given in section III. This is followed by the mathematical background of the contrast function (CF) in section IV. The significant results and their critical analysis are given in section V. Section VI concludes this paper.

II. TWO-WAY NESTED DESIGN
A two-way nested design comprises two levels, A and B, such that level B is nested within level A. In the literature, this is written as B(A). Graphically, this is represented as in Fig. 1. The two-way nested design is well suited to spatial image analysis, which identifies small homogeneous regions with sufficient regional detail. Level A comprises $I$ levels $i = 1, \dots, I$, where $I = 4$. Level B comprises $J_i$ levels, such that $j = 1, \dots, J_i$. In principle, the number of elements $J_i$ can vary for each $i$, though in this particular situation the range of $j$ is the same for each $i$. Further, the levels of $j$ under different levels of $i$ have nothing in common. Strictly, the subscript should be written as $j(i)$, and the nested factor B should be written as $\beta_{j(i)}$. Instead of this complicated notation, the friendlier notation $\beta_{ij}$ is used. The complete analysis is performed on a square mask of 8x8 pixels. The subsequent mask is taken by scanning the raster from left to right and from top to bottom. The algorithm is extremely fast when the mask positions are non-overlapping. However, this results in missing several edges. The mask locations can be overlapped, which identifies more edges but also results in a longer processing time. The analysis of variance (ANOVA) is applied statistically for identifying and marking regions having considerable gray level changes within a mask. The final decision is made by testing a hypothesis. In case there is significant confidence for rejecting the null hypothesis of either effect A or effect B (alternately, accepting the presence of an edge), a second test comprising contrast functions (CF) further identifies edges in four directions: vertical, horizontal, 45° diagonal, and 135° diagonal. Only one edge in any one direction is allowed. However, edges in multiple directions within a particular mask are possible.
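The raster scan of mask positions described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name `mask_positions` and the parameters `size` and `step` are assumptions introduced here. A step equal to the mask size gives non-overlapping masks, while a smaller step gives overlapping masks.

```python
def mask_positions(height, width, size=8, step=8):
    """Yield the (row, col) of the top-left corner of each mask.

    Masks are visited left to right, then top to bottom.
    `size` is the mask side in pixels; `step` is the shift between masks
    (both names are illustrative, not taken from the paper).
    """
    for row in range(0, height - size + 1, step):
        for col in range(0, width - size + 1, step):
            yield row, col


# Non-overlapping masks on a 16x16 image: four positions.
print(list(mask_positions(16, 16)))
# Overlapping masks (step of 4 pixels) on the same image: nine positions.
print(len(list(mask_positions(16, 16, step=4))))
```

The overlapping scan visits more positions for the same image, which is the trade-off between detected edges and processing time noted in the text.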

A. Mask Partition
The partition of the mask is given in Fig. 2. The mask of 8x8 pixels is partitioned into four segments, each comprising 4x4 pixels. The subscripts used in the model equation do not represent rows and columns as in standard images. Instead, they represent various regions of a mask. Each of these regions has four rows and four columns. Different regions are represented by $i = 1, \dots, 4$. The top-left region is considered as the first in effect A. The segments are marked in the clockwise direction, starting from the top-left as the first region.

Fig. 2. Mask comprising 8x8 pixels
Effect A (subscript $i$) compares the effect of the four regions of a mask, each comprising 4x4 pixels. Each of the four regions in level A is further divided into four equal-size sub-regions, each comprising 2x2 pixels. The sub-regions are represented by $j$ such that $j = 1, \dots, 4$. The value of $j$ is repeated for each $i$. It is clear that for different values of $i$, there is nothing in common for the same $j$. Each individual pixel in the sub-region is represented by $k$, again in the clockwise direction starting from the top-left pixel.
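The indexing of a mask by region, sub-region, and pixel can be sketched as below. This is only an illustration under the assumption that the clockwise order is top-left, top-right, bottom-right, bottom-left at every level; the names `CLOCKWISE` and `partition_mask` are hypothetical and do not come from the paper.

```python
# Clockwise order of the four quadrants of a square block, given as
# (row offset, column offset) in units of half the block side.
CLOCKWISE = [(0, 0), (0, 1), (1, 1), (1, 0)]


def partition_mask(mask):
    """Rearrange an 8x8 mask into y[i][j][k].

    i: 4x4 region (level A), j: 2x2 sub-region inside region i (level B),
    k: pixel inside the sub-region. All three run clockwise from top-left.
    """
    y = []
    for region_row, region_col in CLOCKWISE:
        region = []
        for sub_row, sub_col in CLOCKWISE:
            top = 4 * region_row + 2 * sub_row
            left = 4 * region_col + 2 * sub_col
            region.append([mask[top + dr][left + dc] for dr, dc in CLOCKWISE])
        y.append(region)
    return y


# Example: label every pixel with 8 * row + column to see where it lands.
mask = [[8 * r + c for c in range(8)] for r in range(8)]
y = partition_mask(mask)
print(y[0][0])  # pixels of the first sub-region of the top-left region
```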

III. THE ANALYSIS OF VARIANCE (ANOVA)
The gray level change in a large image results in building chipsets that together form interesting features for human and computer analysis. The micro information in the form of pixels is combined to form the macro information in a mask comprising a small set of pixels. The most critical information is, effectively, contained within each pixel. A pixel is designated by $y_{ijk}$, where the subscripts correspond to effect A, effect B, and an unknown effect considered as noise. All parameters are assumed to have unknown but fixed values with no randomness. All randomness is present in the third term, considered to be random noise that has a Gaussian distribution with zero mean and constant variance. The model is represented by
$$\Omega: \begin{cases} y_{ijk} = \mu + \alpha_i + \beta_{ij} + \varepsilon_{ijk}, \\ \varepsilon_{ijk} \sim N(0, \sigma^2), \ \text{independent,} \end{cases} \qquad (1)$$
where $\mu$ is the general mean, $\alpha_i$ and $\beta_{ij}$ are two specific fixed effects with no randomness, and $\varepsilon_{ijk}$ represents Gaussian noise of zero mean and constant variance that is independent across pixels. The assumption of independent Gaussian errors with zero mean and constant variance results in a simple mathematical model. Fortunately, this assumption holds true in most real images. If the zero mean condition is violated, the pixel values can be recalculated by subtracting the mean value from each pixel, generating a new image that has zero mean. The mathematical analysis can then be performed on the new image. In some applications, like texture analysis, a nonzero mean and dependence across various observations may in fact help in the image analysis. The zero mean assumption is considered to hold in all the subsequent analysis.

B. The Least Square Estimate
A nested design identifies the least squares estimate (LSE) of the various parameters. Matrices are used for simplification. A set of $n$ observations is equal to
$$y = X^{T}\beta + \varepsilon, \qquad (2)$$
where the observation matrix $y$ is a column matrix of size $n \times 1$. A 2-D image is easily converted into a 1-D column matrix by scanning rows horizontally from top-left to top-right, and then from top to bottom of a mask.
$X^{T}$ is the transpose of $X$, which has $p$ rows and $n$ columns. $X^{T}$ is the transformation matrix comprising $n$ equations, each having $p$ parameters. $\beta$ is an unknown parameter matrix of size $p \times 1$, and $\varepsilon$ is the error of size $n \times 1$. The objective is to find the LS estimate $\hat{\beta}$ of the parameter $\beta$. The analysis of variance (ANOVA) is very similar to regression analysis. The only difference is that in regression analysis there is no restriction on the elements of $X^{T}$, as these can be integers or real numbers, whereas ANOVA requires the elements of $X^{T}$ to be strictly zero or one. The model essentially assumes that a particular effect is either present or absent. This assumption simplifies the mathematical derivation and results in efficient and fast processing. The sum of squares of the error $\varepsilon$ is
$$S = \varepsilon^{T}\varepsilon = (y - X^{T}\beta)^{T}(y - X^{T}\beta). \qquad (3)$$
By setting $\partial S / \partial \beta$ to zero and then solving for $\beta$, the LS estimate $\hat{\beta}$ of the parameter matrix is found. In case $X^{T}$ is a square matrix with full rank, the estimated value $\hat{\beta}$ is given by
$$\hat{\beta} = (X^{T})^{-1} y. \qquad (4)$$
If, however, $X^{T}$ is not a square matrix, the system is converted into a square one by multiplying both sides by $X$ and then solving for the estimator matrix $\hat{\beta}$,
$$\hat{\beta} = (X X^{T})^{-1} X y. \qquad (5)$$
If $X X^{T}$ does not have full rank, a set of side conditions $H^{T}\beta = 0$ is added to make it a full rank matrix. The estimates are then found by
$$\hat{\beta} = (X X^{T} + H H^{T})^{-1} X y. \qquad (6)$$
The set of side conditions must satisfy
$$H^{T}\hat{\beta} = 0. \qquad (7)$$
The above general mathematical analysis is applied to the nested design comprising effect A and effect B.
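As a small illustration of the normal-equation step, the sketch below forms $X X^{T}$ and $X y$ from a zero/one design and solves for the estimate by Gauss-Jordan elimination. It is a plain-Python sketch under stated assumptions, not the authors' code: the function name `solve_normal_equations` is hypothetical, the argument `design` holds the rows of $X^{T}$ (one row per observation), and a rank-deficient design simply raises an error instead of applying side conditions.

```python
import math


def solve_normal_equations(design, y):
    """Solve (X X^T) beta = X y, where `design` is X^T (n rows, p columns)."""
    n, p = len(design), len(design[0])
    # Augmented matrix [X X^T | X y] of the normal equations.
    a = [
        [sum(design[r][i] * design[r][j] for r in range(n)) for j in range(p)]
        + [sum(design[r][i] * y[r] for r in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        pivot_value = a[pivot][col]
        if math.isclose(pivot_value, 0.0, abs_tol=1e-12):
            raise ValueError("X X^T is singular; side conditions are needed")
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / pivot_value for v in a[col]]
        for r in range(p):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][p] for i in range(p)]


# Two groups of two pixels each, one parameter per group (full rank design).
design = [[1, 0], [1, 0], [0, 1], [0, 1]]
print(solve_normal_equations(design, [1.0, 3.0, 5.0, 7.0]))
```

With one zero/one column per group, the estimates reduce to the group sample means, which is why the later sums of squares can be written directly in terms of means.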

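C. Sum of Squares (SS) of Various Effects
Under the assumption Ω, an observation $y_{ijk}$ is approximated by $\eta_{ijk}$, where $\eta_{ijk} = \mu + \alpha_i + \beta_{ij}$ is the sum of the general mean $\mu$ and the various effects $\alpha_i$ (level A) and $\beta_{ij}$ (level B). The error is equal to $\varepsilon_{ijk} = y_{ijk} - \eta_{ijk}$. The subscript $k$ accounts for the unaccounted effects and includes all pixels in a mask. The sum of squares of error (SSE) is equal to
$$\mathrm{SSE} = \sum_{i}\sum_{j}\sum_{k} \left(y_{ijk} - \mu - \alpha_i - \beta_{ij}\right)^2.$$
The least squares estimate of the mean $\mu$ is found by differentiating SSE with respect to $\mu$ and then setting it equal to zero,
$$\frac{\partial\,\mathrm{SSE}}{\partial \mu} = -2\sum_{i}\sum_{j}\sum_{k}\left(y_{ijk} - \mu - \alpha_i - \beta_{ij}\right) = 0.$$
The summation is taken over all possible values of the subscripts $i$, $j$, and $k$. Applying the side conditions $\sum_i \hat{\alpha}_i = 0$ and $\sum_j \hat{\beta}_{ij} = 0$ for each $i$ and then solving gives the estimate of $\mu$. The estimated value of the mean $\hat{\mu}$ is
$$\hat{\mu} = \frac{1}{n}\sum_{i}\sum_{j}\sum_{k} y_{ijk} = \bar{y}_{\cdot\cdot\cdot},$$
where $n$ is the total number of observations. The dot notation helps in simplifying an otherwise complex equation. Throughout the paper, the summation is taken across all parts, that is, $i = 1, \dots, I$, $j = 1, \dots, J$, and $k = 1, \dots, K$. The sum of squares (SS) of level A ($\alpha_i$), of level B ($\beta_{ij}$), the sum of squares of error (SSE), and the total SS (SST) are taken from [22]. For the balanced mask used here, the various parameters are
$$\mathrm{SSA} = JK\sum_{i}\left(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2, \qquad \mathrm{SSB} = K\sum_{i}\sum_{j}\left(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot}\right)^2,$$
$$\mathrm{SSE} = \sum_{i}\sum_{j}\sum_{k}\left(y_{ijk} - \bar{y}_{ij\cdot}\right)^2, \qquad \mathrm{SST} = \sum_{i}\sum_{j}\sum_{k}\left(y_{ijk} - \bar{y}_{\cdot\cdot\cdot}\right)^2.$$
The $I$, $J$, and $K$ correspond to the number of regions, the number of sub-regions per region, and the number of pixels per sub-region of the mask, respectively.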
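The sums of squares of a balanced two-way nested layout can be computed directly from the cell, region, and grand means. The sketch below is illustrative only; the function name `nested_sums_of_squares` and the nested-list layout `y[i][j][k]` are assumptions made for this example. It also checks the identity SSA + SSB + SSE = SST on a small data set.

```python
def nested_sums_of_squares(y):
    """Return (SSA, SSB, SSE, SST) for balanced nested data y[i][j][k]."""
    levels_a = len(y)
    levels_b = len(y[0])
    pixels = len(y[0][0])
    n = levels_a * levels_b * pixels
    grand_mean = sum(v for region in y for sub in region for v in sub) / n

    ssa = ssb = sse = sst = 0.0
    for region in y:
        region_mean = sum(v for sub in region for v in sub) / (levels_b * pixels)
        ssa += levels_b * pixels * (region_mean - grand_mean) ** 2
        for sub in region:
            sub_mean = sum(sub) / pixels
            ssb += pixels * (sub_mean - region_mean) ** 2
            for v in sub:
                sse += (v - sub_mean) ** 2
                sst += (v - grand_mean) ** 2
    return ssa, ssb, sse, sst


# Small example with 2 regions, 2 sub-regions each, 2 pixels each.
y = [[[1, 3], [5, 7]], [[2, 2], [4, 8]]]
ssa, ssb, sse, sst = nested_sums_of_squares(y)
print(ssa, ssb, sse, sst)  # the first three add up to the last
```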
The degrees of freedom (df) are given in Table II.
The mean square (MS) of each effect is found by dividing the sum of squares (SS) of the effect by the respective degrees of freedom. Up to a scale factor, an MS value with $\nu$ degrees of freedom has a chi-square distribution with $\nu$ degrees of freedom. This is represented by $\chi^2_{\nu}$. The ratio of two chi-square distributions with the respective degrees of freedom $\nu_1$ and $\nu_2$ gives the F-distribution, that is, F-test = $\dfrac{\chi^2_{\nu_1}/\nu_1}{\chi^2_{\nu_2}/\nu_2} \sim F_{\nu_1,\nu_2}$. The tables of the F-test for various degrees of freedom are given in standard textbooks of statistics [22]-[23]. The MS values of the various effects are given in Table III. The respective F-tests are given in Table IV. Under the Ω-assumption, the presence of a significant effect of a factor is confirmed by testing the null hypothesis $H_A: \alpha_i = 0$ for all $i$ against the alternative that at least one $\alpha_i \neq 0$. A similar hypothesis is tested for effect B, namely $H_B: \beta_{ij} = 0$ for all $i$ and $j$. In case either effect A or effect B is found to be present, the next step is to find the exact location of the effect using the contrast function discussed in the next section.
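The step from sums of squares to the two F ratios can be sketched as follows. This is a hedged illustration, not the authors' implementation: the names `f_statistics` and `mask_may_contain_edge` are hypothetical, and the critical values are left as arguments because they must be read from an F table for the chosen significance level.

```python
import math


def f_statistics(ssa, ssb, sse, levels_a=4, levels_b=4, pixels=4):
    """Return (f_a, f_b, (df_a, df_b, df_e)) for a balanced nested design."""
    df_a = levels_a - 1
    df_b = levels_a * (levels_b - 1)
    df_e = levels_a * levels_b * (pixels - 1)
    msa, msb, mse = ssa / df_a, ssb / df_b, sse / df_e

    def ratio(ms):
        # A perfectly flat mask has MSE = 0; avoid dividing by zero.
        if mse > 0:
            return ms / mse
        return math.inf if ms > 0 else 0.0

    return ratio(msa), ratio(msb), (df_a, df_b, df_e)


def mask_may_contain_edge(f_a, f_b, critical_a, critical_b):
    """Reject the null hypothesis if either ratio exceeds its table value."""
    return f_a > critical_a or f_b > critical_b


f_a, f_b, dfs = f_statistics(30.0, 24.0, 48.0)
print(f_a, f_b, dfs)
# The critical values below are placeholders for illustration only.
print(mask_may_contain_edge(f_a, f_b, critical_a=2.8, critical_b=2.0))
```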

IV. THE CONTRAST FUNCTION (CF)
A contrast function is applied in case the null hypothesis of either effect A or effect B is rejected against the alternative hypothesis. The primary objective is to identify whether there is a significant variation in the horizontal, the vertical, or the two diagonal (45° and 135°) directions.
Definition: A contrast among a set of parameters $\beta_1, \dots, \beta_p$ is a linear function of the $\beta_i$, $\psi = \sum_i c_i \beta_i$, with known constant coefficients $c_i$ such that the condition $\sum_i c_i = 0$ holds.
As per the above definition, the difference of two rows forms a valid contrast function. Similarly, a combination of rows with appropriately selected coefficients forms a valid contrast function. Other useful contrast functions can be formed in the vertical and in the diagonal directions. The Gauss-Markov theorem helps in finding the least squares (LS) estimates.
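A short sketch of the definition: the coefficients of a contrast must sum to zero, and the contrast value is the coefficient-weighted sum of the quantities being compared. The helper names `is_contrast` and `contrast_value` and the numeric row means are illustrative assumptions, not values from the paper.

```python
import math


def is_contrast(coefficients, tol=1e-12):
    """A set of coefficients defines a contrast if it sums to zero."""
    return math.isclose(sum(coefficients), 0.0, abs_tol=tol)


def contrast_value(coefficients, means):
    """Evaluate the linear function sum(c_i * mean_i)."""
    return sum(c * m for c, m in zip(coefficients, means))


row_means = [120.0, 100.0, 90.0, 80.0]       # hypothetical row means

difference = [1, -1, 0, 0]                   # difference of two rows
combination = [0.5, 0.5, -0.5, -0.5]         # top pair against bottom pair
not_a_contrast = [1, 1, -1, 0]               # coefficients sum to 1

print(is_contrast(difference), contrast_value(difference, row_means))
print(is_contrast(combination), contrast_value(combination, row_means))
print(is_contrast(not_a_contrast))
```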

Gauss-Markov Theorem: Under the assumption Ω: if $E(y) = X^{T}\beta$ and $\Sigma_y = \sigma^2 I$, then every estimable function $\psi = c^{T}\beta$ has a unique unbiased linear estimate $\hat{\psi}$ which has minimum variance in the class of all unbiased linear estimates.
The estimate $\hat{\psi}$ may be found from $\psi = \sum_i c_i \beta_i$ by replacing the $\{\beta_i\}$ with any set of LS estimates $\{\hat{\beta}_1, \hat{\beta}_2, \dots, \hat{\beta}_p\}$.
$X^{T}$ is the transpose of the coefficient matrix $X$, consisting of zeros and ones. The matrix is considered to have full rank. The column matrix $\beta$ represents the parameter matrix. The matrix $\Sigma_y$ represents the covariance of the observation matrix $y$, whose elements are assumed to be independent. All elements of this covariance matrix are zero, except the diagonal elements, which are constant with a value equal to the variance $\sigma^2$. The matrix $c^{T}$ is the transpose of a column matrix $c$, which is a coefficient matrix fulfilling the requirement $\sum_i c_i = 0$.
A least squares (LS) estimate of the observations is found by taking the sample mean of observations in four directions. These are horizontal, vertical, diagonal 45°, and diagonal 135°. The sample mean in the horizontal direction is found by summing the pixels of a row across all columns. This is represented by $\bar{y}_{r\cdot} = \frac{1}{n}\sum_{s} y_{rs}$, where $y_{rs}$ denotes the pixel in row $r$ and column $s$ of the mask. Similarly, the sample mean in the vertical direction is found by $\bar{y}_{\cdot s} = \frac{1}{n}\sum_{r} y_{rs}$. The LS estimates in the diagonal 45° and diagonal 135° directions are found by summing the appropriate pixels. Using the Gauss-Markov theorem, an unbiased estimate $\hat{\psi}$ of the contrast function in the horizontal direction is
$$\hat{\psi} = \sum_{r} c_r\, \bar{y}_{r\cdot}.$$
The variance of $\hat{\psi}$ is found by
$$\sigma^2_{\hat{\psi}} = \frac{\sigma^2}{n}\sum_{r} c_r^2,$$
where $n$ is the number of observations across the columns used to form each LS estimate.
$\sigma^2$ is the constant variance of all observations. It has an unbiased estimate that is equal to the mean square error (MSE), such that $\hat{\sigma}^2 = s^2 = \mathrm{MSE}$. The MSE is found by dividing the sum of squares of error (SSE) by the respective degrees of freedom. The estimated variance of the contrast function is found by
$$\hat{\sigma}^2_{\hat{\psi}} = \frac{s^2}{n}\sum_{r} c_r^2.$$
The objective is to test the hypothesis $H_0: \psi = 0$, which tests for significant variation across the means being compared. There are generally two methods for multiple comparisons of estimated values. These are Scheffé's S-method and Tukey's T-method. The T-method is preferred for pair-wise comparison, and its confidence interval is narrower than that of the S-method. The S-method is applicable to all other types of comparisons. Here, the T-method is used, as only pair-wise comparison is required. Given the mean gray levels of two sets of pixels as $\theta_1$ and $\theta_2$, the confidence interval of the parameter $\psi = (\theta_1 - \theta_2)$ is found by using
$$(\hat{\theta}_1 - \hat{\theta}_2) - T s \;\le\; \theta_1 - \theta_2 \;\le\; (\hat{\theta}_1 - \hat{\theta}_2) + T s, \qquad T = \frac{q_{\alpha;k,\nu}}{\sqrt{n}}.$$
The $\hat{\psi} = (\hat{\theta}_1 - \hat{\theta}_2)$ represents the estimate of $\psi$. The unbiased estimate $s^2$ of the variance $\sigma^2$ has $\nu$ degrees of freedom, and it is independent of the sample means. The ratio $q$ is the Studentized range, given by the range of the $k$ sample means divided by their estimated standard deviation $s/\sqrt{n}$. The distribution of $q$ has been tabulated for various values of $k$ and $\nu$ in several standard textbooks of statistics. For reference, see Table A-9 in [22]. An upper $\alpha$-level of confidence interval corresponds to the $100(1-\alpha)$ percentile level. As an example, an upper 0.05 level of confidence interval corresponds to a percentile of 95%. The test confirms the presence of an edge if the above confidence interval does not include the zero value; that is, either the entire range is positive or the entire range is negative.
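The decision rule at the end of this section (an edge is declared when the confidence interval excludes zero) can be sketched as below. The sketch assumes two sample means, each formed from `n` pixels, a noise standard deviation estimate `s` taken as the square root of the MSE, and a Studentized range value `q` read from a table; the function names and the numbers in the example are illustrative assumptions.

```python
import math


def tukey_interval(mean_1, mean_2, s, n, q):
    """Confidence interval for the difference of two means.

    s: estimate of the noise standard deviation (square root of the MSE)
    n: number of pixels averaged in each mean
    q: Studentized range value from a table (supplied by the caller)
    """
    half_width = q * s / math.sqrt(n)
    difference = mean_1 - mean_2
    return difference - half_width, difference + half_width


def edge_present(interval):
    """An edge is declared when the interval does not contain zero."""
    low, high = interval
    return low > 0.0 or 0.0 > high


# Hypothetical values: s = 10 gray levels, 4 pixels per mean, q = 4.0.
strong = tukey_interval(130.0, 100.0, s=10.0, n=4, q=4.0)
weak = tukey_interval(110.0, 100.0, s=10.0, n=4, q=4.0)
print(strong, edge_present(strong))
print(weak, edge_present(weak))
```

The same difference in means is more likely to be declared an edge when the noise estimate is small or when more pixels contribute to each mean, since both shrink the half width of the interval.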

V. SIMULATION RESULTS
An overview of the various steps is presented in Fig. 3. The algorithm considers a mask of 8x8 pixels. This mask size is selected to have four equal partitions, each of 4x4 pixels. Each of these 4x4 partitions is further divided into four equal partitions, each of 2x2 pixels. The processing is initiated from the top-left corner of an image, which is scanned from left to right and from top to bottom. A two-way nested design is applied on this mask. This generates two test statistics, f_a and f_b. The statistic f_a signifies whether there is enough variability among the four quarters of a mask, each comprising 4x4 pixels. The statistic f_b signifies whether there is enough variability within each quarter of a mask. These statistics are compared with the threshold values from tables given in standard textbooks [22] using $f_a > F_{\alpha;\,\nu_A,\,\nu_E}$ and $f_b > F_{\alpha;\,\nu_B,\,\nu_E}$, where $\nu_A$, $\nu_B$, and $\nu_E$ are the degrees of freedom of effect A, effect B, and the error. If neither of these inequalities holds, the variability across the four quarters of a mask and the variability within each quarter are considered not significant. This leads to accepting the null hypothesis of no significant variation at the two granular levels. The mask is then moved to the next adjacent location. In case f_a or f_b is greater than its threshold, the null hypothesis is rejected against the alternative. This demonstrates that there is enough variability within the mask and that it may contain an edge. The mask needs to be subjected to further analysis. The next step involves testing the mask for the presence of an edge in four directions using contrast functions. This step uses Tukey's T-test to mark edges in any of the four directions: horizontal, vertical, 45 degree diagonal, and 135 degree diagonal. An edge in the horizontal direction can be present anywhere between the 1st and 8th rows of a mask. Several contrast functions are therefore generated, and the highest of them is compared with the threshold for testing the hypothesis for the presence of an edge. The location of an edge is marked at the specific location with the largest disparity level. Similarly, the location of an edge in the vertical direction is marked at the highest contrast location. This results in identifying the most appropriate location of an edge. The edges in the diagonal directions are simply marked on the respective diagonals of the mask. In case the test fails to identify an edge in any of the four directions, the next mask is selected. Only one edge is marked at a particular mask location in a particular direction. The mask is moved from left to right and from top to bottom to scan the whole image.
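The search for the strongest horizontal edge location inside a mask can be sketched as below. The text does not spell out exactly how the candidate contrasts are formed, so this sketch assumes the simplest choice: the difference between the means of adjacent rows. The function name `strongest_horizontal_boundary` is hypothetical, and the resulting contrast would still have to pass the Tukey test before an edge is marked.

```python
def strongest_horizontal_boundary(mask):
    """Return (row, contrast) for the largest adjacent-row mean difference.

    `row` is the zero-based index r of the boundary between rows r and r + 1.
    This adjacent-row contrast is an assumption made for illustration.
    """
    row_means = [sum(row) / len(row) for row in mask]
    candidates = [
        (r, abs(row_means[r] - row_means[r + 1]))
        for r in range(len(mask) - 1)
    ]
    return max(candidates, key=lambda item: item[1])


# Synthetic 8x8 mask: three dark rows followed by five bright rows.
mask = [[10] * 8 for _ in range(3)] + [[50] * 8 for _ in range(5)]
print(strongest_horizontal_boundary(mask))
```

The vertical case follows by applying the same function to the transposed mask, for example `list(zip(*mask))`.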

A. Nested Design
The formulae for the sum of squares of effect A (SSA), the sum of squares of effect B (SSB), and the sum of squares of error (SSE) are given in Table I. The sum of these three sums of squares is always equal to the total sum of squares (SST). The respective mean square of effect A (MSA), mean square of effect B (MSB), and mean square of error (MSE) are found by dividing the respective sums of squares by their corresponding degrees of freedom (d.f.), as given in Table II and Table III. Up to a scale factor, the mean square (MS) of an effect with $\nu$ degrees of freedom has a chi-square distribution with $\nu$ degrees of freedom. This is represented by $\chi^2_{\nu}$. The ratio of two chi-square distributions is represented by an F-test. The F-test for effect A is measured by $F_A = \mathrm{MSA}/\mathrm{MSE}$, which is distributed as the ratio $(\chi^2_{\nu_A}/\nu_A)/(\chi^2_{\nu_E}/\nu_E)$, where $\nu_A$ and $\nu_E$ are the degrees of freedom of MSA and MSE, respectively. These degrees of freedom are respectively equal to $I - 1$ and $\sum_i \sum_j (K_{ij} - 1)$, which are equal to 3 and 4(4)(3) = 48, respectively. Here $J_i$ is the number of sub-regions in region $i$ and $K_{ij}$ is the number of pixels in sub-region $j$ of region $i$. The F-test for effect B is measured by $F_B = \mathrm{MSB}/\mathrm{MSE}$, which is again distributed as a ratio of chi-square distributions, $(\chi^2_{\nu_B}/\nu_B)/(\chi^2_{\nu_E}/\nu_E)$, with degrees of freedom $\nu_B$ and $\nu_E$. The corresponding values are $\sum_i (J_i - 1)$ and $\sum_i \sum_j (K_{ij} - 1)$, respectively. These are correspondingly equal to 4(3) = 12 and 4(4)(3) = 48. The details are given in Table II.
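As a quick check of the arithmetic above, the degrees of freedom for the 8x8 mask can be written out for the balanced case. The symbols $I$, $J$, and $K$ are used here, as an assumption of this worked example, for the number of regions, sub-regions per region, and pixels per sub-region, all equal to 4. The three values add up to the total degrees of freedom of the 64 pixels.

```latex
\begin{align*}
\nu_A &= I - 1 = 4 - 1 = 3, \\
\nu_B &= I\,(J - 1) = 4 \cdot 3 = 12, \\
\nu_E &= I\,J\,(K - 1) = 4 \cdot 4 \cdot 3 = 48, \\
\nu_A + \nu_B + \nu_E &= 63 = IJK - 1 .
\end{align*}
```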

B. The Contrast Function
The contrast function is applied in four directions as in Fig. 4. The marked and unmarked pixels are represented by $y_m$ and $y_{um}$, respectively. The four contrast functions are formed as
$$\psi_d = \theta_m - \theta_{um}, \qquad \hat{\psi}_d = \hat{\theta}_m - \hat{\theta}_{um} = \bar{y}_m - \bar{y}_{um}, \qquad d = 1, \dots, 4.$$
The $\theta_m$ and $\theta_{um}$ correspond to the marked set of pixels and the unmarked set of pixels. The index $d$ corresponds to the four directions given in Fig. 4. The $\hat{\psi}_d = \hat{\theta}_m - \hat{\theta}_{um}$ is the corresponding estimated value, and $\bar{y}_m - \bar{y}_{um}$ represents the difference of the pixel sample means in the marked and unmarked mask areas. The total number of pixels in an 8x8 pixel mask is 64. The total mean square of the contrast function is partitioned into the mean square of treatment and the mean square of error. The corresponding degrees of freedom are respectively equal to 3, 12, and 63. Using Table A-9 in [22], the threshold is taken as 4.31 for an upper 0.05 level of confidence interval, corresponding to a percentile of 95%. The degrees of freedom value is taken as the closest value to the required value available in Table A-9.

C. Discussion
The simulation results are given in Fig. 5 and Fig. 6. Fig. 5(a) gives a set of five test images consisting of Lena, house, chilies, cameraman, and baboon. Fig. 5(b) reproduces the images of Fig. 5(a) with additive Gaussian noise of N(0, 400). The edges of the original images are detected using the Sobel filter in Fig. 5(c). The mask size is a standard 3x3 pixels. Fig. 5(d) gives the edges detected by the nested design using an 8x8 pixel mask. The pixels of the mask are tested for the presence of level-A and level-B effects. In case the hypothesis is affirmative, a follow-up contrast test is performed in the above four directions. In order to be consistent with the Sobel filter, the mask is shifted every 3 pixels. This results in considerable overlap, but gives much improved results that are compared with those of the Sobel filter. The comparison of Fig. 5(c) and Fig. 5(d) reveals that both approaches exhibit comparable results in identifying the edges. The nested design has performed slightly better than the Sobel filter in terms of a few additional edges marked. The edge detection with a moderate Gaussian noise of N(0, 25) for the Sobel filter and the nested design is given in Fig. 6(a) and Fig. 6(b). A comparison clearly explains that the Sobel filter is able
A comparison of numerical results is performed by comparing the peak signal-to-noise ratio (PSNR),
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left(y_{mn} - \hat{y}_{mn}\right)^2},$$
where $M$ and $N$ are the number of pixels in the horizontal and the vertical directions, and $y$ and $\hat{y}$ are respectively the original and the estimated image. A second criterion is the percentage of pixels marked as edges. The following simple formula is used for this measure,
$$\text{Edge pixels (\%)} = \frac{\text{number of pixels marked as edges}}{MN} \times 100. \qquad (21)$$

VI. CONCLUSIONS
This paper presents a novel approach based on a nested design that marks the edges in an image. The complete design comprises two steps. The initial test is based on a two-way nested design, which tests the variability of pixels in a mask. The decision is made by testing a hypothesis using an F-test. The variability of pixels is statistically tested to find whether there is sufficient variability at two granular levels. The mask is subjected to a second test if there is sufficient confidence of enough variability.
The second test is based on a contrast function (CF) using Tukey's T-test. The test identifies edges in four directions: horizontal, vertical, and the two diagonals. The contrast function tests for the maximum contrast value among the several possible locations. This selection is based on identifying the best location for marking an edge. In the diagonal directions, however, only the location at the middle of the mask is marked as an edge. The results are compared with edge detection using the Sobel filter. Rigorous testing reveals that the nested design and the Sobel filter yield comparable results for noise-free images. The nested design, however, outperforms the Sobel filter in situations where the image is corrupted with heavy Gaussian noise. It is clear that the nested design requires more processing than the Sobel filter. The processing time can be significantly reduced by parallel processing using advanced hardware like FPGA and VLSI.

Fig. 5. (a) Original images of Lena, House, Peppers, Cameraman, Baboon. (b) Images with additive noise of N(0, 400). Edge detection of noise-free images using (c) Sobel filter (d) nested design.

TABLE V. PEAK SIGNAL-TO-NOISE RATIO AND NUMBER OF EDGES IN PERCENTAGE OF PIXELS
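The two measures named in the table title can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `psnr` and `edge_percentage` are hypothetical, images are plain lists of rows, and a peak gray level of 255 (8-bit images) is assumed.

```python
import math


def psnr(original, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    rows, cols = len(original), len(original[0])
    mse = sum(
        (original[r][c] - estimate[r][c]) ** 2
        for r in range(rows)
        for c in range(cols)
    ) / (rows * cols)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def edge_percentage(edge_map):
    """Percentage of pixels marked as edges (non-zero entries)."""
    total = sum(len(row) for row in edge_map)
    marked = sum(1 for row in edge_map for v in row if v)
    return 100.0 * marked / total


original = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
print(psnr(original, noisy))
print(edge_percentage([[1, 0], [0, 0]]))
```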