FRoTeMa: Fast and Robust Template Matching

Abstract: Template matching is one of the most basic techniques in computer vision, in which an algorithm searches for a template image T in an image to analyze I. This paper considers the rotation, scale, brightness and contrast invariant grayscale template matching problem. The proposed algorithm uses a sufficient condition to distinguish candidate matching positions from positions that cannot provide a better degree of match than the current best candidate. This condition is used to significantly accelerate the search by skipping unsuitable search locations without sacrificing exhaustive accuracy. The proposed algorithm is compared with eight existing state-of-the-art techniques. Theoretical analysis and experiments on eight image datasets show that the proposed simple algorithm can maintain exhaustive accuracy while providing a significant speedup.


I. INTRODUCTION
Template matching is the task of seeking a given template in a given image. It is also known as pattern matching [1] and can be considered one of the most basic operations in computer vision. Template matching is heavily used in signal, image and video processing. Many applications are based on template matching, such as image denoising [2][3][4], motion estimation [5], and emotion recognition [6], to name a few. The literature on template matching contains a variety of algorithms. Based on search accuracy, these algorithms can be broadly divided into two categories: approximate accuracy and exhaustive accuracy algorithms. The first category achieves a speedup at the cost of some loss of accuracy and often depends on one or more approximations, while exhaustive accuracy algorithms obtain a speedup without losing accuracy. The second category includes domain transformation techniques using the FFT and bound-based computation elimination algorithms, in which inappropriate search locations are skipped from computations.
In general, regardless of the accuracy required, there are several ways to tackle the problem. Some techniques use the normalized cross-correlation (NCC) to handle brightness/contrast-invariant template matching. This can be made faster using bounded partial correlation [7,8] or integral images [9]. Other techniques depend on scale and rotation invariant key points [10], while some rely on a previous segmentation/binarization of the image to analyze [11,12], the FFT [13], partial elimination [14], correlation transitivity [15], histograms [16,17], circular and radial projections [18,19], or auto-correlation [20]. However, in many cases these algorithms cannot be used, either because of image characteristics such as small grayscale variations and the difficulty of binarizing the image efficiently, or because they cannot handle the rotation, scale, brightness and contrast (RSBC) invariant template matching problem, which is critical to several applications, or because they are unacceptably slow for many real-world scenarios.
The large number of time-sensitive applications of template matching continually drives the need for faster techniques. In this paper, we therefore present a fast algorithm to solve the RSBC invariant grayscale template matching problem. Our algorithm, named FRoTeMa (Fast and Robust Template Matching), is also rotation/scale-discriminating within a predefined level of accuracy, i.e., the method determines the scale and rotation angle of the template within the matching process. Although we claim exhaustive accuracy only when the image and the template are equivalent, experiments on eight image datasets show that the proposed simple algorithm can provide exhaustive-like search when deteriorating conditions such as blurring or JPEG compression are applied to query images. We have compared our algorithm with eight state-of-the-art techniques. Experimental results reveal that, despite its simplicity, the proposed algorithm also shows a significant speedup.
The rest of the paper is organized as follows: Section II introduces the proposed template matching algorithm. Section III presents the complexity analysis of the new algorithm. In Section IV, the experimental results are presented. Finally, conclusions are summarized in Section V.

II. THE PROPOSED FROTEMA ALGORITHM
FRoTeMa depends on bound comparisons to achieve fast performance. Instead of checking the similarity between each pixel in the template and the corresponding pixel in each candidate block in the image to analyze, FRoTeMa uses a sufficient condition to distinguish candidate matching positions from positions that cannot provide a better degree of match than the current best-matching one.
To describe the condition used, let I be the image to analyze and T the template. Consider a block in I with the same size as T, a sub-template block together with the sum of pixel intensities within it, and the corresponding sub-block in the image block together with the sum of pixel intensities within it. We consider the image block and T to be equivalent under brightness/contrast variation if there are a contrast correction factor α > 0 and a brightness correction factor β such that one is obtained from the other by multiplying by α and adding β times the matrix of ones (1) [18]. Similarly, the same relation holds between the sub-template block and its corresponding sub-block (2). Using (2), we conclude that the two sums of pixel intensities are related by the same factor α and by an offset equal to β times the number of pixels in a sub-block (3). If we segment the template into smaller blocks similar to the sub-template block and form a new vector that stores the sum of pixel intensities of each of these blocks, and then do the same for the image block to form a similar vector, we can use (3) to relate the two vectors (4). Eq. (4) expresses a relation between the two vectors that is similar to the relation between the image block and T in Eq. (1). For a correct match block in I, and because α, β and the sub-block size are constants, the versions of the two vectors normalized to zero mean and unit magnitude will be equivalent. Using a threshold to establish the minimum degree of similarity required to accept a match, we obtain inequality (5), which provides the condition used in the algorithm. That is, for each block in I, if the condition is satisfied, the block is a candidate for a correct match, while if the condition is not satisfied, the block is skipped without further consideration. Using this condition not only accelerates the search for the correct match; the vector normalization also makes the condition able to handle possible variation in brightness/contrast between the template and candidate blocks.
www.ijacsa.thesai.org
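Since the displayed relations are easier to follow with explicit symbols, the following is a hedged restatement of relations (1) to (4) and of the normalization argument. The symbols B (image block), t and b (sub-template block and corresponding sub-block), S_t and S_b (their intensity sums), V_T and V_B (the vectors of block sums) and n (the number of pixels in a sub-block) are notation assumed here for illustration, as is the direction in which α and β are applied. The exact form of inequality (5) is not reproduced.

```latex
\begin{align}
B   &= \alpha T + \beta \mathbf{1}            \tag{1} \\
b   &= \alpha t + \beta \mathbf{1}            \tag{2} \\
S_b &= \alpha S_t + \beta n                   \tag{3} \\
V_B &= \alpha V_T + \beta n \mathbf{1}        \tag{4}
\end{align}
% Subtracting the mean removes the constant offset \beta n:
%   V_B - \overline{V_B}\,\mathbf{1} = \alpha \,(V_T - \overline{V_T}\,\mathbf{1}),
% and dividing by the magnitude removes \alpha > 0:
%   \frac{V_B - \overline{V_B}\,\mathbf{1}}{\lVert V_B - \overline{V_B}\,\mathbf{1} \rVert}
%   = \frac{V_T - \overline{V_T}\,\mathbf{1}}{\lVert V_T - \overline{V_T}\,\mathbf{1} \rVert}.
```

Under these assumptions, the normalized vectors of a correct match coincide exactly, which is why a threshold on their difference can serve as a pruning condition.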

A. Template Manipulation
Many template matching algorithms have to scan the image to analyze several times to support rotation-scale invariance, and this procedure can take considerable time. Instead of doing so, FRoTeMa relies on the template to handle the task. First, a prespecified range of scales and orientations is set. The orientation and scale of the original template are then changed according to each combination of angles and scales within the predefined ranges.
We have used the OpenCV [21] library for these rotation and scale operations. Then, a square area in the middle of the resulting template is chosen and split into a grid of equally spaced, non-overlapping square blocks that have a predefined side length. The parameter η_b specifies this side length. The grid layout makes it possible to use (5) while keeping the implementation simple. The grid can be split into different numbers of blocks. If the template is exactly the same as a part of the input image, one block is sufficient to use (5), whereas under brightness/contrast variation more blocks are needed so that vector normalization can remove the brightness/contrast difference. In our experiments, we have used a grid of nine blocks. The central block of this grid is called the head block (Fig. 1). The sum of pixel intensities within each block of the grid is calculated, and these nine values are then normalized to zero mean and unit magnitude. When this procedure ends, a vector of nine normalized values has been calculated for each angle-scale combination. Notice that the η_b value must be the same for all calculated vectors. These vectors are stored together in a new vector. At the end of this computation, the resulting vector is sorted with respect to the calculated values of the head blocks.
Fig. 1. A sample template of 21×21 pixels. FRoTeMa uses nine blocks like the large ones that appear in the middle. In this example, η_b is 5 pixels and the head block is the one in the middle with a thicker edge.
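As an illustration of the nine-block vector described above, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the use of nested lists for images, and the row-major block order (head block at index 4) are assumptions made for this example, and the rotation and scaling step is omitted.

```python
import math


def block_sum(img, top, left, side):
    """Sum of intensities in the side x side block with top-left corner (top, left)."""
    return sum(sum(row[left:left + side]) for row in img[top:top + side])


def normalize(values):
    """Shift the values to zero mean and scale them to unit magnitude."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:  # flat region: nothing to normalize
        return [0.0] * len(values)
    return [c / norm for c in centered]


def grid_descriptor(img, center_row, center_col, side):
    """Normalized sums of the 3x3 block grid centered at (center_row, center_col).

    Assumes an odd block side length. Index 4 of the result is the head block.
    """
    top0 = center_row - side // 2 - side
    left0 = center_col - side // 2 - side
    if (top0 < 0 or left0 < 0
            or top0 + 3 * side > len(img) or left0 + 3 * side > len(img[0])):
        raise ValueError("grid does not fit inside the image")
    sums = [block_sum(img, top0 + r * side, left0 + c * side, side)
            for r in range(3) for c in range(3)]
    return normalize(sums)
```

For a 21×21 template and a block side of 5 pixels, `grid_descriptor(template, 10, 10, 5)` covers the central 15×15 area, as in the Fig. 1 example. Applying a contrast and brightness change to the template leaves the returned vector unchanged up to rounding, which is the property the normalization is meant to provide.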

B. Image Scan
After computing the sorted vector of template vectors, the image to analyze I is scanned sequentially in search of the template. Every pixel in I is checked only once by specifying a grid of nine square blocks around it, calculating the sum of pixel intensities in these blocks and normalizing the resulting values to zero mean and unit magnitude. A new vector is formed to store the values of the current pixel. Similarly, the central block of each pixel grid is called the head block. Notice that the η_b value should be the same for the template and the pixel grids. To accelerate the process, the grid of the left-most pixel in each row of I is computed directly. This grid is then used to calculate the grid of the neighboring pixel through an overlapping sliding window, and the procedure continues in the same way until the right-most pixel is reached.
In the ideal case, when the template and the matching block (if any) differ only in brightness, contrast and/or scale and orientation within the prespecified ranges, the vector at one or more pixels will be identical to at least one of the stored template vectors. In that case, only pixels whose vector is identical to one or more template vectors are of interest for further investigation. In the common case (as seen in many real-world scenarios), when deteriorating conditions such as blurring or JPEG compression are applied to query images, the pixel vector may not be equivalent to any template vector. A thresholding technique is therefore used to handle this gap. Two thresholds are used. The first sets the limit for the absolute difference between the current pixel's head block value and the head block values of the template vectors. If one or more template vectors pass this condition, the remaining eight values in the pixel vector are checked against the corresponding eight values in the promising vector(s); the second threshold sets the limit for the absolute difference between these values. Head blocks are handled separately because they should be the least distorted ones if the template is scaled or rotated by a value that is not covered by the prespecified ranges. While the two thresholds could be merged, the head block condition alone can prune the majority of the vectors without further processing. We call the set of promising vectors over all candidate pixels the candidate states. The number of such states is important, as it directly affects the speed of computation.
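The two-stage test can be sketched as below. The names `promising_indices`, `head_tol` and `rest_tol` are hypothetical, and the use of a binary search over the sorted head block values is an assumption based on the sorting step described earlier; treat this as an illustration, not the reference implementation.

```python
from bisect import bisect_left, bisect_right

HEAD = 4  # position of the head block in a nine-value vector (assumed layout)


def promising_indices(template_vectors, heads, pixel_vector, head_tol, rest_tol):
    """Indices of template vectors that pass both threshold tests.

    `template_vectors` must be sorted by head block value, and `heads` is the
    precomputed list of those head values in the same order.
    """
    # Stage 1: binary search for head values within head_tol of the pixel's head value.
    lo = bisect_left(heads, pixel_vector[HEAD] - head_tol)
    hi = bisect_right(heads, pixel_vector[HEAD] + head_tol)
    keep = []
    for idx in range(lo, hi):
        vec = template_vectors[idx]
        # Stage 2: the remaining eight values must each be within rest_tol.
        if all(abs(vec[i] - pixel_vector[i]) <= rest_tol
               for i in range(len(vec)) if i != HEAD):
            keep.append(idx)
    return keep
```

Because the first stage touches only a short slice of the sorted list, most template vectors are discarded without any element-wise comparison.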

C. Candidate Pixels Handling
Each candidate pixel is at the center of a potential match block in I, and each has, by definition, one or more promising vectors to be checked. When the current pixel is considered a candidate, the following is done for each of its promising vectors: a new version of the original template is formed using the angle and scale associated with the vector. Before checking the match between this transformed template and the potential match block, the possible variation in brightness and/or contrast between them should be handled. Several techniques may be used to handle this variation, such as the correlation coefficient and normalization. These techniques are effective, but they are computationally expensive. To avoid this drawback, another procedure based on (1) is used to estimate a contrast correction factor α and a brightness correction factor β. If the estimation relied on only a few pixels, it would be sensitive to noise. So, a square area at the center of the transformed template is chosen, and the mean intensity of its pixels is subtracted from each pixel to form a mean-subtracted area. A similar process is applied to the corresponding square area in the potential match block (we found experimentally that areas of 11×11 pixels are sufficient). This is done to cancel the possible variation in brightness between the two areas. To estimate α, a new matrix is calculated as the element-wise ratio of the two mean-subtracted areas, and α is taken to be the median value of this matrix. α can then be used to estimate β through another matrix, computed as the difference between one area and α times the other; the median value of this matrix is taken as the estimate of β.
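A minimal sketch of this median-based estimate follows. It assumes the block is modeled as α times the template plus β, which is one reading of (1); the function name and the decision to skip pixels whose template deviation is exactly zero are assumptions made for this example.

```python
from statistics import median


def estimate_alpha_beta(template_area, block_area):
    """Estimate (alpha, beta) assuming block ~= alpha * template + beta."""
    t = [v for row in template_area for v in row]
    b = [v for row in block_area for v in row]
    t_mean = sum(t) / len(t)
    b_mean = sum(b) / len(b)
    # Element-wise ratio of the mean-subtracted areas; pixels with zero
    # template deviation are skipped to avoid division by zero (assumption).
    ratios = [(bv - b_mean) / (tv - t_mean)
              for tv, bv in zip(t, b) if tv != t_mean]
    if not ratios:
        raise ValueError("template area is flat; alpha cannot be estimated")
    alpha = median(ratios)
    # Residual offsets after applying alpha; their median estimates beta.
    beta = median([bv - alpha * tv for tv, bv in zip(t, b)])
    return alpha, beta
```

Using the median rather than the mean of the per-pixel estimates limits the influence of a few outlying ratios, which is consistent with the stated concern about noise.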
To check whether the potential match block is a correct match, the sum of absolute differences (SAD) between the two is computed using formula (6). The SAD value is calculated within the area of the largest circle that fits in the middle of the template, as this circular area always remains within the template when it is rotated. The ratio between the SAD value and the sum of pixel intensities within this circular area is calculated for each candidate state in search of the minimum ratio. If the minimum ratio is below a given threshold, the template is considered to be detected at the current pixel. The algorithm has been applied successfully to versions of both the image to analyze and the template that are rescaled to one fourth of their original sizes, i.e., one level of pyramidal reduction has been used. We did not have to apply the algorithm at the original scale to get accurate results except in a few cases.
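The circular SAD ratio can be sketched as follows. Formula (6) is not reproduced here, so two points are assumptions of this example: the SAD is taken between the block and the template after applying the estimated α and β, and the denominator is the sum of the block's intensities inside the circle. The function name is hypothetical.

```python
def circular_sad_ratio(template, block, alpha, beta):
    """SAD inside the inscribed circle divided by the block's intensity sum there."""
    size = len(template)
    center = (size - 1) / 2.0
    radius_sq = (size / 2.0) ** 2
    sad = 0.0
    total = 0.0
    for r in range(size):
        for c in range(size):
            if (r - center) ** 2 + (c - center) ** 2 > radius_sq:
                continue  # pixel lies outside the inscribed circle
            sad += abs(block[r][c] - (alpha * template[r][c] + beta))
            total += block[r][c]
    return sad / total if total else float("inf")
```

Pixels in the corners never contribute, so the score is unaffected by the parts of a rotated template that fall outside the original square.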

III. COMPLEXITY ANALYSIS
Let be the number of pixels in the image to analyze I, the number of pixels in the template image, C the number of candidate states, the number of angles and the number of scales. Practically, we can consider that and have a complexity of O(√ ) and is, by definition, of O(√ ). As is usually much larger than , , and , all operations that do not depend on or are neglected. At the image scan step, the operations used to prepare each nine-block grid with the help of the sliding window have a complexity of O( ) or O( √ ), while the operations used to calculate have a complexity of O( ). The computation used to match with is of O( ) or O( ), as we depend on a binary search technique when searching for promising vectors in . Also, the algorithm uses O( ) computation to handle the candidate pixels. Thus, the overall complexity of the algorithm is O( √ ). As will be clarified in the experiments, the proposed algorithm achieves a great reduction in C, from millions or hundreds of thousands to just hundreds or tens. Thus, the overall complexity can be reduced to O( √ ) as opposed to, for example, O( ) in [19].

IV. EXPERIMENTAL RESULTS
A. Experiments
To check FRoTeMa's performance, several experiments have been executed using eight datasets. All experiments have been run on a 2.3 GHz Core i5 PC with 4 GB of RAM. To check the robustness of the proposed algorithm to simultaneous variations in rotation, scale, brightness and contrast, a dataset (K1) provided by Kim [22] has been used. It consists of six images and twelve templates. Each of the six images includes all of the templates. This dataset tests only rotation invariance. So, for each query, the scale, brightness and contrast of the queried image have been changed randomly within the ranges of 0.7 to 1.4, -25 to +25 and 0.85 to 1.15 of the original values respectively, and each randomly changed query has been tested twice.
The comparison with Forapro [18] and Ciratefi [19] was made on five datasets. To test and compare robustness to rotation and scale variations, we have used another dataset (K2) provided by H. Kim [22]. It consists of eight images (400×400 pixels) and nine templates (71×71 pixels). Each of the eight images includes all of the templates. Fig. 2 shows some examples of the algorithm output on the K1 and K2 datasets. We have also used four datasets provided by K. Mikolajczyk [24] to check how robust our algorithm is to variations in brightness-contrast (Leuven dataset), focus blur (Bikes and Trees datasets) and JPEG compression (UBC dataset). Each of these four sets consists of 6 images. The amount of distortion increases as the image number increases from 1 to 6. We have extracted 20 random non-overlapping templates (89×89 pixels) from the first image of each of the four datasets, and in each set we have tested FRoTeMa along with Forapro [18] and Ciratefi [19] against all of the 6 images using all of the 20 templates. So, we have evaluated 360 queries for each set (6 images × 20 templates × 3 algorithms). Fig. 3 illustrates some examples of FRoTeMa output on the tested Mikolajczyk datasets. When testing FRoTeMa on K1 and K2, we have checked 36 angles and 6 scales because of the challenge these datasets pose, while for the other datasets it was more appropriate to check a single scale and orientation, as these barely change. The parameters of the tested algorithms have been set as recommended by [18], [19] and [23]. Table I summarizes the total execution time on the AI, SI and K2 datasets of FRoTeMa and the mentioned eight known algorithms.
Table I shows that FRoTeMa provides an exceptional speedup over all of the tested existing algorithms. The overall numbers of possible matches, considering the number of pixels, scales and orientations checked, for the K2, Leuven, Bikes, Trees and UBC datasets are 34,560,000, 540,000, 700,000, 700,000 and 512,000 respectively, while the average number of candidate states tested per query when running FRoTeMa on these datasets was just 432.1, 42.3, 50.6, 283.7 and 31.6 states respectively. This explains why FRoTeMa provides such a speedup, which makes a considerable difference for time-sensitive applications.
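The K2 figure above can be checked with a short calculation from the stated image size (400×400 pixels), 36 angles and 6 scales. The variable names are chosen for this example only.

```python
# Search-space size for one K2 query: every pixel, angle and scale combination.
width, height = 400, 400
angles, scales = 36, 6
possible_matches = width * height * angles * scales
print(possible_matches)  # 34560000

# Average number of candidate states per K2 query, as reported in the text.
avg_candidate_states = 432.1
reduction = possible_matches / avg_candidate_states
print(f"about {reduction:,.0f} times fewer states to verify")
```

For K2, the pruning therefore leaves roughly one state in eighty thousand for the more expensive verification step.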
For the K1, K2, Leuven, Bikes, Trees and UBC datasets, the output of an algorithm was considered correct if the overlap error was less than 15%. This is a very strict condition, given that Mikolajczyk et al. [25] consider that regions with up to 50% overlap error can still be matched successfully using robust descriptors. For the AI and SI datasets, the ground truth is not available, so a match was considered correct if it was within (±4; ±4) pixels of the output location of the OptA [20] algorithm, which claims an exhaustive-like accuracy. The performance of the algorithms is given in terms of recall: TP/(TP+FN), where TP is the number of true positives and FN the number of false negatives. Table II reports the performance of FRoTeMa against Forapro [18] and Ciratefi [19] on the tested datasets. It shows that, besides its superiority in speed, FRoTeMa exhibited 100% accuracy.
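The two evaluation measures can be sketched as follows. The overlap error is taken here as one minus the ratio of intersection area to union area, and regions are simplified to axis-aligned rectangles given as (x0, y0, x1, y1); both the rectangle simplification and the function names are assumptions for illustration, since detected templates may be rotated.

```python
def overlap_error(a, b):
    """1 - intersection/union for two axis-aligned boxes (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return 1.0 - inter / union


def recall(true_positives, false_negatives):
    """Recall as defined in the text: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)
```

With these definitions, a detection would count as correct under the 15% criterion when `overlap_error(detected, ground_truth) < 0.15`.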

B. Parameters
FRoTeMa's sensitivity to different choices of its parameters has been tested. We have examined each parameter in turn, with the others fixed, when searching for a template in one image of K2. We have checked 36 angles and 6 scales (from 0.7 to 1.4 of the original template size). In each of the following tables, the fixed parameters can be found in the first line. Table V shows that the parameter it examines has a broad range of values (above 0.04) that do not affect the correctness. However, it has a large influence on the number of candidate states, so large values of this parameter may make the algorithm slower. Table VI shows that beyond some value (around 0.04 in this test), the examined parameter has no influence on the result. It also does not affect the number of candidate states. So, within a wide range of values, the choice of this parameter is not critical in terms of its effect on correctness or speed.

V. CONCLUSIONS
Template matching plays a vital role in image processing and computer vision. It is heavily used in many applications in diverse areas. In this paper, we presented FRoTeMa, a new rotation, scale, brightness and contrast invariant template matching algorithm. The proposed method is also rotation/scale-discriminating within a predefined level of accuracy. Algorithm analysis, experimental results on eight datasets and comparisons with eight known methods show that FRoTeMa provides an exceptional speedup, which makes it more suitable for time-sensitive applications, while maintaining exhaustive-like search. Future work is aimed at checking whether the proposed method can handle the color template matching problem and deal with color constancy. It will also be interesting to investigate the performance of the proposed algorithm after converting it to a parallel one.

Fig. 2. (a) An example of FRoTeMa output when queried using the illustrated template from K1. (b) Another example of FRoTeMa output when queried using the illustrated template from K2 (note the discriminated angles and scales).

Fig. 3. Some examples of FRoTeMa output when tested against the Mikolajczyk datasets [24]. For each pair of images, the upper one is the first image in the dataset and the lower one is the sixth image in the dataset, queried with the same template.

TABLE I. TOTAL EXECUTION TIME (SEC) AND SPEED-UP RATIO ON AI, SI AND K2 DATASETS. FROTEMA IS COMPARED WITH EIGHT EXISTING ALGORITHMS (SUR REFERS TO SPEED-UP RATIO)

TABLE II. PERFORMANCE OF THE PROPOSED ALGORITHM AGAINST FORAPRO [18] AND CIRATEFI [19] IN THE TESTED DATASETS. * MEANS THAT THE EXPERIMENT WAS NOT DONE AND NOQ REFERS TO NUMBER OF QUERIES

TABLE III. SENSITIVITY TO

TABLE IV
in Table IV that doesn't affect the correctness of the results after some small value (0.007).Also we can note that the numbers of candidate states are comparable within a wide range of values.

TABLE VI. SENSITIVITY TO