Improving Video Streams Summarization Using Synthetic Noisy Video Data

Surveillance camera systems are used to monitor public domains. Reviewing and processing subsequences from large amounts of raw video streams is costly in both time and storage. Many efficient video summarization approaches have been proposed to reduce the amount of irrelevant information. Most of these approaches do not take into consideration the illumination or lighting changes that cause noise in video sequences. In this work, a video summarization algorithm for video streams is proposed using Histogram of Oriented Gradient and Correlation Coefficients techniques. The algorithm is applied to the proposed multi-model dataset, which is created by combining the original data with dynamic synthetic data. The dynamic data is generated using a Random Number Generator function. Experiments on this dataset showed the effectiveness of the proposed algorithm compared with the traditional dataset.

Keywords: Video summarization; Histogram of Oriented Gradient (HOG); Correlation coefficients (R); key frames; illumination changes; noise; Random Number Generator function


I. INTRODUCTION
Public places (such as banks, transport, airports, etc.) may contain many stationary cameras for security purposes. The data they produce is rich in information and should be analyzed in order to extract what is useful. Processing and storing this huge amount of data is very difficult, so it is important to summarize it in order to facilitate tasks such as data mining. Summarization is a basic task in data mining. Summarization techniques find a compact description of a dataset, transforming it into a smaller form suitable for stream data analysis while retaining maximum information content [1], [2].
A video summary can take two forms: a static summary [3], [4], which is a set of selected key frames, or a dynamic summary, which is a short video constructed by concatenating short video segments [5]. Several efficient video summarization approaches have been proposed for surveillance video streams, such as [6]-[9]. For real-time use, video summarization approaches generally rely on moving object detection and extraction as an essential process to extract motion information from the video sequence [10]-[12]. Many summarization approaches generate a video synopsis or summary for a single video stream, such as [13]-[15]. Summarization approaches for a single camera do not generalize to multiple cameras, and they do not take into account the relationship between the different cameras. Thus, some recent approaches have been proposed to handle multiple stationary cameras and produce a video summary for related scenes.
Xu et al. (2015) [16] developed a video summarization framework using clustering techniques that produces a video summary for multiple videos observing the same scene by computing a shared activity among all scenes. Gygli et al. (2015) [17] proposed a dynamic video skimming method in which a supervised approach is used to learn the useful global information of a summary. The result is an optimal video summary that maintains the diversity of the original video. Kuanar et al. (2015) proposed a video summarization approach using a graph-theoretic method. Its steps can be summarized as follows: shot boundary detection is performed using Bag of Visual Words and global features such as color, texture, and shape to remove redundant frames, and the video summary is then constructed using a Gaussian entropy algorithm [18]. Sigari et al. (2015) proposed a fast video summarization method using on-demand feature extraction and a fuzzy inference system. Based on on-demand feature extraction, the input video is partitioned into highlights and its content is analyzed. Each highlight is assigned a score using a Fuzzy Inference System; the score indicates the importance of the events that occur in the highlight [19]. Bian et al. (2015) proposed a video event summarization method comprising three stages: (1) noise removal, (2) discovery of sub-events from multiple data types, and (3) generation of a visualized summary from the microblog streams of multiple data types [38]. Fu et al. proposed a summarization technique using a spatio-temporal shot graph; the shot graph is partitioned and clusters of event-centered shots with similar content are constructed. The video summary is produced by solving a multi-objective optimization problem based on shot importance evaluated using a Gaussian entropy fusion technique [39]. Zheng et al. (2015) proposed a novel summarization method for surveillance videos. The motion feature is extracted using graphics processing units (GPUs) to reduce running time. The result of this step is then smoothed to reduce noise, and finally the video summary is created by selecting frames with local maxima of motion information [40].
Most of these video summarization approaches do not take into account the temporal noise that occurs in scenes under illumination or lighting changes. Noise is a common problem in digital cameras, arising from errors that may occur in the camera sensors or from ambiguity in sensor data that is exposed to noise. Video summarization based on motion detection produces wrong motion object vectors when the video is noisy. For real-world applications, feature extraction in computer vision and image processing should be robust to brightness or illumination changes and to frame distortion such as noise or blur. Illumination or brightness changes at some points between consecutive frames of a video sequence often occur due to variations in the parameters of different video cameras, or to objects moving from one part of the scene to another under different illumination [20], [21]. These issues cause inaccurate processing of the video stream. Most video summarization methods for a single static camera or for multi-camera video do not take into account illumination changes or the noise signals present in some video frames. They depend on the assumption that noise or illumination values are static along the video frames.
In this research, a multi-sensor video summarization algorithm is proposed. It is based on the Histogram of Oriented Gradient (HOG) procedure, which is used for feature extraction, and on the Correlation Coefficients approach, which serves as a robust similarity (or dissimilarity) measure. Unlike some video summarization approaches in the literature, the proposed algorithm does not operate directly on the raw pixels. Instead, it uses a feature vector for each frame in the video sequence in order to improve motion detection accuracy under illumination variance and shadowing. HOG is selected for this purpose.
The availability of real or representative data is an important issue for evaluating data analysis algorithms. Because some real data is lacking or difficult to obtain, synthetic data becomes an alternative. Many research areas, such as data mining, image processing, computer vision, sensor networks, and artificial intelligence, have developed different synthetic data generation schemes for different applications [22], [23]. Corruption of data may come from noise or blur, which sometimes results from different atmospheric conditions. Many approaches use synthetic multi-temporal data generated with Gaussian noise in order to test and evaluate their proposed methods, such as [24]-[26]. These techniques produce traditional Gaussian noise, which is identically distributed. This means that the noise values at all pixel locations in all frames of the sequence are generated from a probability density function with static mean and standard deviation values. In this work, a developed synthetic data generation method is proposed. The method generates sequenced frames using a Random Number Generator (RNG) function in order to simulate the original video sequence containing variant noisy frames.
The rest of the paper is organized as follows. Section II presents the background theories related to this work. The proposed methodology is provided in Section III. The experimental results and performance assessment are provided in Sections IV and V. Finally, Section VI draws the conclusions of this work.

II. BACKGROUND THEORIES
A. Histogram of Oriented Gradient
Histogram of Oriented Gradient (HOG) is a feature extraction technique and among the most successful for extracting low-level features for object detection and recognition. A description of HOG can be found in [27]; it was originally designed for human detection [28], [29]. It has low computational time [30] and is robust against shadow and illumination changes [21], [31]. The HOG algorithm for an image can be implemented in four main steps [33]: 1) gradient computation, 2) orientation binning, 3) descriptor blocks, and 4) block normalization.
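The four steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in this work: the frame is assumed to be a grayscale image stored as a list of rows, the function name `hog_features` and its parameters (`cell`, `bins`, `block`, `eps`) are hypothetical, gradients use plain centered differences, and votes go to a single orientation bin without interpolation.

```python
import math

def hog_features(image, cell=4, bins=9, block=2, eps=1e-6):
    """Simplified HOG descriptor for a grayscale image given as a list of rows."""
    h, w = len(image), len(image[0])
    cells_y, cells_x = h // cell, w // cell
    # Steps 1 and 2: per-pixel gradients accumulated into per-cell orientation histograms.
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Centered differences with edge replication at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            b = min(int(angle / (180.0 / bins)), bins - 1)
            hist[y // cell][x // cell][b] += magnitude
    # Steps 3 and 4: overlapping blocks of cells, each L2-normalized.
    features = []
    for by in range(cells_y - block + 1):
        for bx in range(cells_x - block + 1):
            vec = []
            for cy in range(by, by + block):
                for cx in range(bx, bx + block):
                    vec.extend(hist[cy][cx])
            norm = math.sqrt(sum(v * v for v in vec) + eps ** 2)
            features.extend(v / norm for v in vec)
    return features
```

Because the descriptor is built from pixel differences, adding a constant brightness offset to every pixel leaves the feature vector unchanged, which is one way to see the robustness to illumination changes mentioned above.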

B. Correlation Coefficients
The correlation coefficient, which measures the strength of the relationship (similarity) between two variables x and y, can be defined as [32]:

$$R = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \, \sum_{i}(y_i - \bar{y})^2}} \quad (1)$$

where $x_i$ and $y_i$ are the gray level values of the $i$-th pixel in the first and second frames respectively, and $\bar{x}$ and $\bar{y}$ are the means of the gray level values of x and y. The values of $R$ always fall in [-1, +1]. When $R$ is near 0, the relationship between the two variables is weak, and when $R$ is near 1, the relationship is strong.
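A minimal Python sketch of this measure is given below for illustration. It computes the standard Pearson correlation coefficient between two equal-length sequences; the function name `correlation_coefficient` and the choice to return 0.0 for a constant input (where the coefficient is undefined) are assumptions made here, not part of the original method.

```python
import math

def correlation_coefficient(x, y):
    """Pearson correlation R between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        # A constant input has no variance, so R is undefined; 0.0 is returned by convention.
        return 0.0
    return cov / denom
```

For example, `correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])` evaluates to 1, since the second sequence is an exact linear function of the first.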

C. Noise Modeling
Real environments are often exposed to unexpected disturbances, which are treated as noise. Gaussian noise, which is normally distributed, is the most common noise model. In MATLAB, a noisy signal corrupted by Gaussian noise can be obtained as follows:

$$g(x, y) = f(x, y) + \sigma \cdot \mathrm{randn}(\mathrm{size}(f)) + \mu \quad (2)$$

where $g(x, y)$ is the signal with additive Gaussian noise; $f(x, y)$ is the original signal; $\sigma$ is the standard deviation; $\sigma^2$ is the variance; and $\mu$ is the mean. randn() is the MATLAB function for generating random numbers with a standard normal (Gaussian) distribution. The probability density function of a Gaussian distribution with mean $\mu$ and variance $\sigma^2$ can be defined as:

$$p(z) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(z - \mu)^2}{2\sigma^2}\right) \quad (3)$$

D. Performance Evaluation Metrics
The performance of the proposed video summarization algorithm is evaluated using three types of metrics: (1) Data compression ratio (DCR), which is the ratio between the number of frames in the original video and the number of frames in the summary video [34]; (2) Space savings (SS), which is the reduction in size relative to the uncompressed size [35]; and (3) Condensed Ratio (CR), which is the ratio between the number of frames in the summary video and the number of frames in the original video [36].
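The three metrics can be written as one-line functions, sketched below in Python. The function names are hypothetical, and the space savings formula (one minus the ratio of summary size to original size) is the commonly used definition, assumed here because the text does not spell it out.

```python
def data_compression_ratio(original, summary):
    """DCR: original quantity (frames or bytes) divided by the summary quantity."""
    return original / summary

def space_savings(original, summary):
    """Space savings: fraction of the original quantity that the summary removes."""
    return 1.0 - summary / original

def condensed_ratio(summary_frames, original_frames):
    """CR: number of summary frames divided by the number of original frames."""
    return summary_frames / original_frames
```

With these definitions, CR is the reciprocal of a frame-based DCR, so a good summary has a high DCR, a high space savings value, and a low CR.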

III. THE PROPOSED METHODOLOGY
A. The Proposed Method for Synthetic Data Generation
The standard deviation (or the variance) determines the power of the Gaussian noise signal. The classical Gaussian noise generator applies approximately the same noise power to all video frames. In a dynamic environment, however, noise characteristics usually change over time, so a simulation technique is needed to adjust them from frame to frame. In this paper, the developed algorithm for generating the synthetic video sequence is defined by (4), where rand(d) (d = 1) returns a single generated random number, $g(x, y)$ is the noisy frame signal, and $f(x, y)$ is the original frame.
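The difference between the classical and the dynamic noise models can be sketched as follows. This is an illustrative Python version, not the MATLAB implementation used in this work: `random.gauss` stands in for randn and `random.random` for rand, frames are assumed to be lists of rows of gray levels, and the function names and the `seed` parameter are hypothetical. It also assumes the reading of (4) in which one uniform random number per frame takes the place of the fixed noise scale in (2).

```python
import random

def add_gaussian_noise(frame, sigma, mu=0.0, rng=random):
    """Return frame + sigma * N(0, 1) + mu, pixel by pixel (classical model)."""
    return [[p + sigma * rng.gauss(0.0, 1.0) + mu for p in row] for row in frame]

def classical_noisy_video(frames, sigma, mu=0.0, seed=0):
    """Apply the same noise scale sigma to every frame."""
    rng = random.Random(seed)
    return [add_gaussian_noise(f, sigma, mu, rng) for f in frames]

def dynamic_noisy_video(frames, mu=0.0, seed=0):
    """Draw a fresh noise scale for each frame, so noise power varies over time."""
    rng = random.Random(seed)
    noisy, scales = [], []
    for f in frames:
        scale = rng.random()  # plays the role of rand(d) with d = 1
        scales.append(scale)
        noisy.append(add_gaussian_noise(f, scale, mu, rng))
    return noisy, scales
```

Returning the per-frame scales alongside the noisy frames makes it easy to check which frames received strong noise and which received weak noise.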

B. The Proposed Multi-Model Video Summarization
The proposed multi-model video summarization uses the Histogram of Oriented Gradient (HOG) algorithm for feature extraction and the Correlation Coefficient as a measure that quantifies the dependency (or independency) between two video sequences. The algorithm is applied to the proposed multi-model video dataset (two videos) of equal length (n frames). To increase efficiency, HOG is used because it is fast and has a small feature space. The HOG procedure is computed for the current frame $f_i$ ($i = 1..n$) from $Vid_1$ (the original or reference video) and for the current frame $g_i$ ($i = 1..n$) from $Vid_2$ (the corresponding noisy video). The results are two feature vectors. The Correlation Coefficient (R) is computed between these feature vectors. Algorithm (1) illustrates the proposed HOG-Correlation Coefficients algorithm.

Algorithm (1): HOG-Correlation based Summarization
Input: Vid_1 is the original video sequence; Vid_2 is the noisy video sequence; n is the number of video frames.
Output: video summary
Steps:
1. Given two video streams, Vid_1 and Vid_2.
2. For i = 1 to n
3.    Read the current frame (f_i) from the video stream (Vid_1).
4.    Read the current frame (g_i) from the video stream (Vid_2).
5.    Compute the HOG feature vector of f_i.
6.    Compute the HOG feature vector of g_i.
7.    Compute the correlation coefficient (R) between the two feature vectors.
8.    If R >= 0.9 then store the current original frame into the summary file.
9.    else go to step 2 (disregard the current frame).
10. end

To detect redundant (noisy) frames, a constraint is placed on the correlation coefficient computed in step 7. If R >= 0.9, the current frame from the original video sequence is stored in the summary file; otherwise the current frame is dropped and processing returns to step 2. This is repeated for all frames.
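The loop of Algorithm (1) can be sketched in Python as below. This is a hedged illustration rather than the implementation used in this work: `summarize` and `pearson` are hypothetical names, videos are assumed to be equal-length lists of frames, and the feature extractor is passed in as a function so that any HOG implementation can be plugged in.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0

def summarize(vid1, vid2, features, threshold=0.9):
    """Keep frame f_i of vid1 when its features correlate with those of g_i in vid2."""
    if len(vid1) != len(vid2):
        raise ValueError("both videos must have the same number of frames")
    summary = []
    for f_i, g_i in zip(vid1, vid2):
        r = pearson(features(f_i), features(g_i))
        if r >= threshold:
            summary.append(f_i)  # frame is stored in the summary
        # otherwise the frame is disregarded and the loop continues
    return summary
```

In the example below, a simple pixel-flattening function is used as a stand-in for HOG purely to keep the demonstration short:

```python
flatten = lambda frame: [p for row in frame for p in row]
vid1 = [[[1, 2], [3, 4]], [[1, 2], [3, 4]]]
vid2 = [[[1, 2], [3, 4]], [[4, 1], [2, 3]]]
kept = summarize(vid1, vid2, flatten)  # only the first frame passes the threshold
```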

IV. EXPERIMENTAL RESULTS
The original sample video datasets for metro and road scenes were selected from [37]. Using these videos, the synthetic data has been generated. The original and the synthetic videos were used to construct the multi-model dataset for testing and evaluating the proposed summarization method.
The comparison between the classical synthetic video generation algorithm and the proposed synthetic video generation algorithm can be summarized as follows:
1) When the random noise follows the standard normal distribution $N(0, 1)$, with mean $\mu = 0$ and variance $\sigma^2 = 1$, the result contains high noise values (Mean Square Error (MSE) = 6.4784e+03).
2) When the random noise follows a general normal distribution $N(\mu, \sigma^2)$, the result contains very high noise values.
3) When $\mu$ and $\sigma^2$ take values in the range [0, 1], the result contains lower noise values. The noise values approach zero when $\sigma^2$ is near or equal to zero, and the result becomes similar to the original signal ($\sigma^2$ = 0.1, MSE = 923.0849 and $\sigma^2$ = 0.1, MSE = 321.7665). In Fig. 3 and Fig. 4, when the noise values are high, no summarization of the video streams occurs because the correlation coefficients are smaller than the threshold ($R < 0.9$). When the noise values are small, all correlation coefficients are greater than the threshold ($R > 0.9$). In this case, summarization also does not occur, and the output video contains the same redundant frames as the original video.

4) When the proposed algorithm is applied to the input video sequence, the result contains high noise components, but these components vary among frames. Randomly, some frames have high noise values and others have low values, depending on the random values of the variance, so that the generated video sequence contains variant noisy pixels.
The proposed algorithm for generating noisy video with dynamic, variant noise values has been applied to solve the above problem. The result is a noisy video that is used together with the original (reference) video to form a multi-model dataset. The HOG-correlation based algorithm has been applied to this multi-model dataset. The correlation coefficients ($R$) take different values: frames whose values are greater than the threshold ($R > 0.9$) are saved, and the others are removed. The output is a video summary containing the important information without redundant and noisy frames.
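The MSE values quoted in the comparison measure how far a noisy frame departs from its original. A minimal Python sketch of that measure is shown below; the function name `mean_square_error` and the representation of frames as lists of rows are assumptions made for illustration.

```python
def mean_square_error(original, noisy):
    """Average squared difference between corresponding pixels of two frames."""
    total, count = 0.0, 0
    for row_o, row_n in zip(original, noisy):
        for p, q in zip(row_o, row_n):
            total += (p - q) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return total / count
```

An MSE of zero means the two frames are identical, and larger values indicate stronger noise.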

V. PERFORMANCE ASSESSMENT
The HOG-correlation algorithm for video stream summarization has been tested on the metro scene video sequence. The short sample of the original metro video has a size of 46.9 MB and contains 3001 frames. Table II shows the details before and after applying the HOG-correlation algorithm to the multi-model dataset (the original and the synthetic videos). As shown in Table II, V1 is the output of the HOG-correlation algorithm applied to the original metro video and the video generated using the traditional noise video generation algorithm. The size of V1 in memory is 44.9 MB because the original data, which is 3D video, is converted to 2D video. However, the number of frames in V1 is the same as in the original video.
V2 is the output of the HOG-correlation algorithm applied to the original metro video and the video generated using the proposed dynamic noise generation algorithm based on random numbers. The size of V2 is 1.09 MB and it contains only 69 frames after applying the HOG-correlation based algorithm. The experimental results show that the HOG-correlation algorithm gives better results when it is applied to the original video and the video generated by the developed dynamic noise generation method. As a result, it is able to reduce the size of the input video sequence and extract the important motion information.
Based on the above information, the performance of the HOG-correlation algorithm has been tested. Table III illustrates the performance evaluation of the proposed algorithm using three metrics: Data compression ratio (DCR), Space savings (SS), and Condensed Ratio (CR). As demonstrated in Table III for the metro video sequence, the DCR, which is the ratio between the uncompressed size and the compressed size, gives very good results for V2 compared with the DCR for V1. The SS, which is the reduction in size relative to the uncompressed size, is also better for V2 than for V1. The CR, the ratio between the number of output frames and the number of input frames, also gives very good results (97.70%) for V2 compared with the CR for V1.
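As a worked check, the metrics can be evaluated for V2 from the figures quoted in the text (46.9 MB and 3001 frames for the original video; 1.09 MB and 69 frames for V2). These values are computed here from the text rather than copied from Table III, and they assume that DCR is taken over file sizes and that space savings is defined as one minus the size ratio:

$$\mathrm{DCR} = \frac{46.9}{1.09} \approx 43.03, \qquad \mathrm{SS} = 1 - \frac{1.09}{46.9} \approx 97.68\%, \qquad \frac{69}{3001} \approx 2.30\%, \qquad 1 - \frac{69}{3001} \approx 97.70\%$$

The last value matches the 97.70% quoted for CR, which suggests that the reported figure is the fraction of frames removed rather than the fraction retained.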
From the above performance evaluation results, the HOG-correlation algorithm works better on the noisy video that is created using random numbers.

VI. CONCLUSION AND FUTURE WORK
In real-time applications such as surveillance, illumination changes or shadowing of moving objects may occur in the video stream. Many video summarization methods depend on the assumption that noise or illumination values are static along the video frames. As a result, existing video summarization algorithms may fail to interpret the scene under observation correctly. In this paper, a synthetic noisy video generation algorithm has been developed for testing the proposed video summarization algorithm, which is based on Histogram of Oriented Gradient and Correlation Coefficient for a multi-model video dataset. The experimental results on the proposed dataset showed good results compared with the classical dataset. For more efficient use of time and space, online video summarization using Histogram of Oriented Gradient and Correlation Coefficients techniques will be developed as future work.
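The developed algorithm for generating the synthetic video sequence uses the MATLAB random number generator function; the output of this function is used in (2) instead of the global $\sigma$ value shared among frames. The newly proposed equation can be defined as:

$$g(x, y) = f(x, y) + \mathrm{rand}(d) \cdot \mathrm{randn}(\mathrm{size}(f)) + \mu \quad (4)$$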

Fig. 1. An example of the visual comparison between the classical synthetic noisy data and the proposed synthetic noisy data.

TABLE II. THE INFORMATION DETAILS (VIDEO SIZE AND NUMBER OF FRAMES) BEFORE AND AFTER APPLYING THE PROPOSED CORRELATION BASED METHOD ON THE ORIGINAL METRO VIDEO AND THE SYNTHETIC VIDEO DATASETS

TABLE III. THE RESULTS OF THE PERFORMANCE EVALUATION OF CORRELATION BASED ALGORITHM FOR METRO VIDEO SEQUENCE