A Novel Approach for Background Subtraction using Generalized Rayleigh Distribution

Abstract: Identifying foreground objects in video of dynamic scenes is a more demanding task than in static scenes. In contrast to still images, video sequences offer additional information about how objects and scenes change over time. Pixel-based comparisons are carried out to separate the foreground from the background based on the frame difference methodology. The threshold value is first kept static in both cases; to improve recognition accuracy, adaptive threshold values are then estimated for both methods. The article also presents a methodology based on the Generalized Rayleigh Distribution (GRD). Experiments are conducted on benchmark video sequences and the outputs are evaluated using a quantitative approach.


I. INTRODUCTION
The most important component of an intelligent vision-based inspection system is background subtraction, which is considered a primitive step for object recognition and tracking. Typically, pixel-by-pixel comparison against a predefined object dataset is used for detection or tracking. However, searching and comparing every pixel requires considerable computation time, and background subtraction methods were introduced to reduce both search and computation time. In many human-computer interaction systems, background subtraction is applied during pre-processing to reduce cost. As such, background subtraction has become a significant and firmly established method in computer vision. Since background modelling considerably influences the performance of the whole vision system, it is important to use a good background subtraction methodology. However, most background modelling techniques must cope with dynamic or non-static backgrounds, sudden or gradual lighting changes, object motion, and shadows, and should overcome these issues intelligently. To address these challenges, many models have been presented in the literature [1][2][3][4][5][6][7][8][9][10][11], [13], [15], [16], [17], [18].

II. LITERATURE REVIEW
Haiying et al. [1] proposed a modified Gaussian mixture model (GMM) that optimizes the GMM and incorporates spatial information. H. Zhou et al. [2] proposed a foreground detection methodology based on an improved codebook. Viswanth et al. [3] proposed a non-parametric background modelling approach in which a single spatio-temporal Gaussian is used to model the background pixels. However, this methodology fails when adequate features cannot be obtained from the region. Yuhan L. et al. [4] proposed a robust background subtraction methodology based on an adaptive dictionary strategy and a penalized splitting approach. Lu Yang et al. [5] proposed a pixel-based model of background information for complex scenes, in which the distance between pixels, rather than local descriptors, is used to update the background model. Chen et al. [6] proposed a model with a varying learning rate that also adaptively selects the number of Gaussians. This model performs better when extracting dynamic background information and under sudden illumination variations. However, it cannot handle strongly dynamic backgrounds and also fails to capture paused objects. Chien et al. [7] proposed a threshold-based foreground object detection method. The authors assumed that the cameras used to capture the videos are tolerant to noise and that the noise follows a zero-mean Gaussian distribution, but this assumption affects the selection of the threshold. Lui et al. [8] proposed an approach in which background instances are generated using binary descriptors. The model proved robust against lighting changes and dynamic backgrounds and was tested against environmental changes.
Haung et al. [10] proposed a background modelling method based on binary descriptors. This method reduces the effect of noise and is capable of extracting rough shapes of the foreground objects. Hedayathi et al. [11] proposed a statistical framework for background subtraction that achieved better segmentation performance.
Stephan Kopf et al. [15] proposed a model for automatic scaling and cropping. The main limitation of this approach is that certain parameters must be identified in advance to guide the cropping of the selected regions. In motion detection, objects can be recognized only while they are moving. Therefore, objects can be recognized appropriately only if the background information is subtracted from these motion images. This methodology is very useful for recognizing objects in images acquired from surveillance cameras. Stauffer and Grimson [18] presented a pixel-wise approach for identifying the background. The main restriction of this model is that extracting background pixel information from a static camera is relatively difficult, so the problem remains unsolved for images acquired from static cameras. Tao Mei and Xian et al. [16] presented a background subtraction model that uses image mosaicing. The limitation of this approach is that mosaicing of background information is often not feasible. Al-Najdawi et al. [13] used the Kalman filter to obtain optimal estimates in tracking. D. Farin, P. de et al. [17] proposed a model for the extraction of individual frames. The limitation of this model lies in identifying background pixels in frames from static cameras. D. Hari Hara Santosh et al. [9] used Gaussian mixture models for effective identification of foreground information and addressed object tracking using blob analysis. However, this methodology has limitations when dealing with sudden, drastic lighting changes.
To overcome these challenges, many techniques have been proposed, most of them based on statistical frameworks, on the grounds that statistical models are relatively efficient and help identify object pixels more accurately. Following this approach, many models have been developed using mixture models, such as Gaussian mixture models, and models based on the Rayleigh distribution.
The Rayleigh distribution cannot model natural images containing speckles, whose impulsive behaviour leads to heavy tails. Therefore, this article attempts to overcome this limitation by proposing a methodology based on the Generalized Rayleigh Distribution (GRD). The main advantage of the GRD is that it is more accurate in reverberation regions.
The rest of the paper is organized as follows. Section III describes the Generalized Rayleigh Distribution. Section IV presents the data set and Section V the methodology. Section VI describes the experimentation. The performance evaluation and the results are presented in Section VII. Section VIII concludes the paper and describes the future scope.

III. GENERALIZED RAYLEIGH DISTRIBUTION
The use of the Rayleigh distribution for background subtraction was first considered by Michael Unger et al. [19]. The authors modelled background subtraction for unconstrained images acquired from moving cameras, which have low resolution and less distortion.
However, with the sophisticated cameras now available, handling images captured at different resolutions is difficult. Hence, to overcome this disadvantage, the work of Michael Unger et al. [19] is extended here by considering the Generalized Rayleigh Distribution. The main advantage of the proposed method is that the generalization allows the background to be estimated in situations where the information is suppressed. The model has the following further advantages:
1) Unlike other distributions, the degree of freedom can be easily obtained from the maximum value, which is generally unique.
2) The distribution attains its maximum as it approaches the Y-axis.
3) The background information can be easily interpreted using the Rayleigh distribution with the maximum value.
Another limitation of the Rayleigh distribution is that it cannot handle speckles with heavy tails. Generalizing the model helps to overcome this limitation. Hence, as the contribution of this article, we propose a model based on the generalized distribution, which can handle both heavy-tailed and light-tailed impulsive noise.
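A continuous random variable $X$ is said to follow a Rayleigh distribution if its probability density function (pdf) is given by

$$f(x;\sigma) = \begin{cases} \dfrac{x}{\sigma^{2}} \exp\left(-\dfrac{x^{2}}{2\sigma^{2}}\right), & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

The cumulative distribution function (cdf) is given by

$$F(x;\sigma) = \begin{cases} 1 - \exp\left(-\dfrac{x^{2}}{2\sigma^{2}}\right), & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

where $\sigma > 0$ is the scale parameter.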
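As a minimal sketch, the standard Rayleigh pdf and cdf with scale parameter sigma can be evaluated as follows. The function names are illustrative assumptions, and the maximum-likelihood estimate of the scale is the textbook result for the standard Rayleigh distribution; the generalized form used by the proposed model is not reproduced here.

```python
import math


def rayleigh_pdf(x: float, sigma: float) -> float:
    """Standard Rayleigh density with scale sigma > 0 (zero for x < 0)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if x < 0:
        return 0.0
    return (x / sigma**2) * math.exp(-(x**2) / (2 * sigma**2))


def rayleigh_cdf(x: float, sigma: float) -> float:
    """Standard Rayleigh cumulative distribution function."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-(x**2) / (2 * sigma**2))


def fit_rayleigh_scale(samples: list[float]) -> float:
    """Maximum-likelihood scale: sigma^2 = sum(x^2) / (2 n)."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(x * x for x in samples) / (2 * len(samples)))


# Hypothetical pixel-difference magnitudes, used only for illustration.
differences = [3.0, 5.0, 4.0, 6.0, 2.0]
sigma_hat = fit_rayleigh_scale(differences)
print(sigma_hat, rayleigh_pdf(4.0, sigma_hat), rayleigh_cdf(4.0, sigma_hat))
```

The density peaks at x = sigma, which is why a single fitted scale already locates the most likely intensity value.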

IV. DATA SET
To demonstrate the proposed work, a benchmark data set of video sequences from www.changedetection.net [12] has been considered for experimentation. The dataset consists of 6 video categories with a total of 31 videos comprising 90,000 frames. The categories are Baseline, Camera Jitter, Dynamic Background, Intermittent Object Motion, Shadow and Thermal.

V. METHODOLOGY
1) Post processing: To extract the background image, each input is first preprocessed by selecting the pixels with the least deviation, which are treated as background pixels. However, the main limitation in choosing background pixels is that, in some situations, the foreground and background share similar colour, size and orientation. Therefore, lighting conditions play a vital role whenever the background has to be estimated; if interpreted correctly, they help to model the background. In this article, the Generalized Rayleigh Distribution helps to overcome this limitation because of its ability to handle low-illumination images.
2) Background subtraction: In the background subtraction technique, moving objects are identified by subtracting the current image from the background image. The initial frame of the video sequence is taken as the reference background frame, and the current frame is subtracted from it. Each pixel is then classified from the resulting difference: if the result of the subtraction is greater than the reference pixel value, the pixel is considered a background pixel; otherwise it is considered a foreground pixel.
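The subtraction step can be sketched as below. This is an illustrative sketch only: it uses the conventional rule in which a pixel whose absolute difference from the reference reaches a fixed threshold is marked as foreground, which is an assumption and not necessarily the exact decision rule of the method described here. Plain nested lists stand in for grayscale frames, and the function name and threshold value are hypothetical.

```python
def subtract_background(current, reference, threshold):
    """Return a binary mask: 1 where |current - reference| >= threshold.

    current, reference: 2D lists of grayscale intensities of equal shape.
    threshold: fixed (static) threshold, chosen by hand in this sketch.
    """
    return [
        [1 if abs(c - r) >= threshold else 0 for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]


# The first frame of the sequence serves as the reference background.
reference = [[10, 10, 10], [10, 10, 10]]
current = [[10, 12, 90], [11, 95, 10]]
mask = subtract_background(current, reference, threshold=30)
print(mask)  # [[0, 0, 1], [0, 1, 0]]
```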
3) Frame difference method: Here, the difference in values between two consecutive frames, t and t-1, is estimated. If the resulting value is better, it is taken as the threshold value and the pixel is treated as a background pixel.
The threshold values estimated in both cases are given as input to the Generalized Rayleigh Distribution model presented in Section III.
The probability density function (pdf) values for each intensity value are given as input to the model and the respective values are estimated. Pixels whose values are below the threshold are considered background information; all others are considered foreground information.
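The frame difference step can be sketched in the same spirit. The sketch below compares each frame with its predecessor and marks pixels whose absolute change reaches a fixed threshold; the marking rule, the function name and the sample values are assumptions made for illustration.

```python
def frame_difference_masks(frames, threshold):
    """Return one binary mask per consecutive frame pair (t-1, t).

    frames: list of 2D lists of grayscale intensities, all the same shape.
    A pixel is marked 1 when |frame[t] - frame[t-1]| >= threshold.
    """
    masks = []
    for previous, current in zip(frames, frames[1:]):
        masks.append([
            [1 if abs(c - p) >= threshold else 0 for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)
        ])
    return masks


# A bright object moves one pixel to the right between frames 1 and 2.
frames = [
    [[10, 10, 10]],
    [[10, 80, 10]],
    [[10, 10, 80]],
]
print(frame_difference_masks(frames, threshold=30))
# [[[0, 1, 0]], [[0, 1, 1]]]
```

The second mask marks both the old and the new object position, which illustrates why frame differencing alone can distort boundaries and contours.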

4) Fusion technique:
In this article, two methods for estimating the background pixels are highlighted: the background subtraction method and the frame difference method. Each has its own limitations. With background subtraction, the boundaries and contours remain intact, but the output may be affected by noise. With the frame difference method, by contrast, noise has minimal impact on the result, but the boundary and contour information may be partly degraded. To overcome these limitations, the two results are fused using the "AND" operation.
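The AND fusion can be sketched as an element-wise operation on two binary masks of equal size. The function name and the sample masks are hypothetical; a pixel is kept as foreground only if both methods agree.

```python
def fuse_and(mask_a, mask_b):
    """Element-wise logical AND of two binary masks (2D lists of 0/1)."""
    return [
        [a & b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mask_a, mask_b)
    ]


background_subtraction_mask = [[0, 1, 1], [1, 1, 0]]
frame_difference_mask = [[0, 1, 0], [0, 1, 1]]
fused = fuse_and(background_subtraction_mask, frame_difference_mask)
print(fused)  # [[0, 1, 0], [0, 1, 0]]
```

Isolated detections that appear in only one of the two masks, such as noise in the background subtraction result, are removed by the fusion.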

5) Adaptive background subtraction:
The best possible threshold value can be estimated using the adaptive threshold technique, given by

$F(x, y) = C(x, y) - R(x, y)$

$F(x, y) = 1$ if $F(x, y) \ge T$, and $0$ otherwise,

where $T$ is the threshold value, obtained using the methodology proposed by N. Otsu [21]. Here $C(x, y)$ denotes the current frame, $R(x, y)$ the reference background image, and $F(x, y)$ the deviation between the current frame and the reference frame.
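A minimal sketch of this step, assuming 8-bit grayscale frames and using the absolute difference |C - R| (an assumption; the formula above uses the signed difference): the threshold T is chosen by Otsu's between-class variance criterion over the difference image, and pixels with a difference of at least T are set to 1. Function names and sample values are illustrative.

```python
def otsu_threshold(values):
    """Otsu's threshold for integers in 0..255.

    Returns t that maximizes the between-class variance of the split
    {v < t} / {v >= t}. Returns 256 if the values cannot be split
    (all identical), so that no pixel passes the test v >= t.
    """
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 256, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the lower class
    for t in range(1, 256):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2  # proportional to the variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def adaptive_background_subtraction(current, reference):
    """Return (mask, T) with mask = 1 where |C - R| >= T, T from Otsu."""
    diff = [
        [abs(c - r) for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]
    t = otsu_threshold([d for row in diff for d in row])
    mask = [[1 if d >= t else 0 for d in row] for row in diff]
    return mask, t


reference = [[10, 10, 10, 10], [10, 10, 10, 10]]
current = [[10, 11, 12, 10], [11, 210, 220, 215]]
mask, t = adaptive_background_subtraction(current, reference)
print(mask)  # [[0, 0, 0, 0], [0, 1, 1, 1]]
```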

6) Frame differencing method and Adaptive frame difference fused methods:
Here the optimal threshold values are estimated following the heuristics given by W. Jun-Qin [22]. The adaptive frame difference method takes the difference between the video frame at time t and the frame at time t-1, from which the optimal threshold is estimated. In the adaptive frame difference fused method, the adaptive threshold is chosen based on the difference obtained by subtracting the background reference frame from the current frame, and these values are then fused to obtain a single threshold value.

VI. EXPERIMENTATION
Blob analysis is used for effective identification of the background and foreground regions. Each pixel value is extracted based on the threshold values obtained from the background subtraction and frame difference methods, and the corresponding pixels are categorized as either background or foreground. In general, pixels with minimum threshold values are mostly considered background pixels. The pixels with high threshold values are given as input to the Generalized Rayleigh Distribution (GRD) presented in Section III. Based on the log-likelihood estimate, each pixel is categorized as either a background pixel or a foreground pixel. The experimentation is carried out in the MATLAB environment and the results obtained are shown below. In this article, we experimented with several ways of estimating the background pixels, namely the fusion method, adaptive background subtraction, adaptive frame difference, and adaptive frame difference fused with adaptive background subtraction. Each of these methods is described in Section V. The results were also compared with the model based on the Gaussian Mixture Model (GMM).

VII. PERFORMANCE EVALUATION AND EXPERIMENTAL RESULTS
To validate the model, we considered the following metrics for quantitative analysis: Precision, Recall, Accuracy, F-Score, MSE, RMSE, FNR, FPR and PSNR [20]. The experimentation was performed with frames 257, 863, 1005 and 1954. Recall is the ratio of the foreground pixels correctly classified as foreground to the actual number of foreground pixels; it indicates how many of the true foreground pixels are classified as foreground. Precision is the ratio of the foreground pixels correctly classified as foreground to all pixels classified as foreground; it indicates how exact the foreground classification is. A high precision value signifies high performance. On the other hand, if a method assigns the majority of the pixels to the background, the precision may be high while the recall declines proportionally. To capture the trade-off between recall and precision, the F-measure is also considered. The other performance metrics considered are Mean Squared Error (MSE), Root Mean Square Error (RMSE), False Negative Rate (FNR), False Positive Rate (FPR) and Peak Signal to Noise Ratio (PSNR) [20]. The metrics are computed from pixel counts; for example,

Precision = TP / (TP + FP)

where TP is the number of foreground pixels classified as foreground, FN is the number of foreground pixels classified as background, FP is the number of background pixels classified as foreground, and TN is the number of background pixels classified as background.
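The metrics can be sketched from the four pixel counts as follows. Only the precision formula is taken from the text; the remaining formulas are the standard definitions and are assumed here, as is the treatment of MSE, RMSE and PSNR for binary masks (peak value 1, squared error equal to the fraction of misclassified pixels). The function name and the sample counts are hypothetical.

```python
import math


def evaluate(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute evaluation metrics from confusion counts of a binary mask."""

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    total = tp + fp + fn + tn
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    mse = ratio(fp + fn, total)  # assumption: masks take values 0 or 1
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": ratio(tp + tn, total),
        "f_score": ratio(2 * precision * recall, precision + recall),
        "fnr": ratio(fn, tp + fn),
        "fpr": ratio(fp, fp + tn),
        "mse": mse,
        "rmse": math.sqrt(mse),
        # assumption: peak signal value is 1 for a binary mask
        "psnr": 10 * math.log10(1.0 / mse) if mse > 0 else math.inf,
    }


# Hypothetical counts for a 100-pixel frame, for illustration only.
metrics = evaluate(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```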
Experimentation is performed with the developed model on the data set presented in Section IV. The results are presented in Fig. 1, where the panels show: (1) original frame; (2) ground truth; (3) background subtraction; (4) frame difference; (5) background subtraction and frame difference; (6) adaptive background subtraction; (7) adaptive frame difference; (8) adaptive background subtraction and adaptive frame difference; (9) GMM; (10) GRD.
We evaluated the different background modelling methods discussed in Section V. The scenarios used to evaluate the methods are thermal, baseline, dynamic background and shadow. There are many videos for each scenario, and we selected one typical frame from each video.
Fig. 1 (a)-(d) are selected from four categories in the CDNet 2014 dataset. Fig. 1 (1) shows the original frame of the video and Fig. 1 (2) shows the ground truth data. Fig. 1 (3)-(10) are the foreground detection results of the state-of-the-art background modelling methods. Tables I-IV present nine performance evaluation metrics of the eight background modelling methods on the CDNet 2014 dataset. The performance of the different methods can be confirmed by the recall, precision, F-Score and other metrics. For each evaluation metric, the results of the background modelling methods in different scenes are given in Figs. 2 to 37.
Thermal: As shown in Fig. 1 (a), their results are closer to the ground truth data. Fig. 5 indicates that the F-Scores are greater than 50%.
Baseline: These videos contain a noise-free static background. Fig. 1 (b) shows the foreground detection results of every method. The proposed method (GRD) successfully detected the foreground object. It can also be observed that the F-Score of each method in Fig. 14 is very high, greater than 63%.
Dynamic background: As shown in Fig. 1 (c), the proposed method (GRD) is more effective than the other methods when dealing with dynamic backgrounds. Fig. 21 shows that their Recall is very high when compared with other methods.
Shadows: The methods differ in their capability of classifying shadow pixels as background. Fig. 1 (d) shows the foreground detection results of every method. Fig. 32 shows that their F-Score is very high when compared with other methods.

Fig. 23. F-Score of Different Methods on Dynamic Background.

TABLE I. EVALUATION METRICS OF DIFFERENT METHODS ON THERMAL VIDEO FROM CDNET DATASET

TABLE III. EVALUATION METRICS OF DIFFERENT METHODS ON DYNAMIC BACKGROUND VIDEO FROM CDNET DATASET

TABLE IV. EVALUATION METRICS OF DIFFERENT METHODS ON SHADOW VIDEO FROM CDNET DATASET