Resolution Enhancement by Incorporating Segmentation-based Optical Flow Estimation

In this paper, the problem of recovering a high-resolution frame from a sequence of low-resolution frames is considered. The high-resolution reconstruction process depends strongly on the image registration step. Typical resolution enhancement techniques use global motion estimation. However, in general, video frames cannot be related through global motion because individual pixels move arbitrarily between frame pairs. To overcome this problem, we propose to employ a segmentation-based optical flow estimation technique for motion estimation, together with a modified model for frame alignment. To do that, we incorporate segmentation into a two-stage optical flow estimation. In the first stage, a reference image is segmented into homogeneous regions. In the second stage, the optical flow is estimated for each region rather than for pixels or blocks. Then, the frame alignment is accomplished by optimizing a cost function that consists of the L1-norm of the difference between the interpolated low-resolution (LR) frames and the simulated LR frames. The experimental results demonstrate that using segmentation-based optical flow estimation in the motion estimation step with the modified alignment model works better than other motion models, such as the affine and conventional optical flow motion models.


I. INTRODUCTION
Multi-frame super resolution (SR) is the process of producing a resolution-enhanced frame from multiple low-resolution (LR) frames with sub-pixel shifts. SR has received much attention in the computer vision and image processing communities over the past three decades (for a review, see [6]). The SR process typically includes three steps: (i) image registration (motion estimation), (ii) alignment of the LR frames on the high-resolution (HR) grid, and (iii) image restoration.
SR methods can be categorized, based on the domain in which the process is carried out, into time-domain [1-5, 7-8, 10-21] and frequency-domain [9] methods. Alternatively, they can be categorized, based on the incorporated motion model, into global motion models (including translation [3] and affine [2] models) and local motion models (including optical flow [7,12] and block matching [13]). Furthermore, they can be categorized, based on the alignment process, into non-uniform interpolation [8], deterministic and stochastic regularization [10], and projection onto convex sets (POCS) [11].
Most of the existing super-resolution algorithms [2][3][9][10][11] cannot cope with local motion, because they assume that the motion model can be globally parameterized. To overcome the problem of registration errors in locally moving parts, three techniques have appeared in the literature. The first is to use different global (or local) weights for different registration error levels [14,16]. The second is to use local motion (or multi-motion) estimation to improve the accuracy of registration in the locally moving parts [7,12,15,19]. The third is a combination of the previous two techniques [17,18]. Among these techniques, the first is widely used in SR for its simplicity. The idea of using different weights for different registration errors is based on rejecting pixels, or even whole frames, that have a high registration error. On the other hand, the main idea behind using local motion estimation techniques is to incorporate as much information from different frames as possible [12,15,19]. The main problem of multi-motion estimation [19] is its complexity, since it requires estimating the motion of each moving object in all frames, which is complex and not always accurate because the motion of one object affects the estimated motion of other objects. Also, conventional optical flow estimation, when used for motion estimation [7,12], is sensitive to noise. In addition, block matching results in blocking artefacts, because frames are divided into blocks and all pixels within each block are incorrectly assumed to have the same motion vector. Moreover, combining local motion estimation with a weighting technique, as proposed in [17,18], adds more complexity to the algorithm. In addition, the algorithm proposed in [15] requires a high computational time, since it performs region matching using a full search.
On the other hand, region segmentation has been employed for image SR in [14,15,20,21]. In [14], segmentation is employed for region rejection based on the registration inaccuracies, while in [15], segmentation is employed for region matching in the registration step. In [20], a region-based super-resolution algorithm is proposed in which different filters are used according to the type of region. In this method the segmentation information is not fully used, since it is used only to classify regions into homogeneous and inhomogeneous regions. In [21], the image is segmented into background and different objects, each of these is super-resolved separately using a traditional technique, and the super-resolved regions are then merged to construct the HR image. This algorithm is very complex, since it requires segmentation of moving objects and registration of each object separately. In order to overcome these difficulties, we propose a modified gradient-based optical flow estimation algorithm, based on the Horn-Schunck algorithm, that tries to overcome the problems of conventional optical flow estimation by the use of image segmentation.
The modification is based on the assumption that pixels in a homogeneous, arbitrarily shaped region have the same motion, so that motion discontinuities coincide with discontinuities in the intensity image. In addition, using segmentation reduces the susceptibility to noise. Furthermore, we propose to incorporate the modified gradient-based optical flow estimation algorithm into image registration in order to estimate the motion of each region.
Our claim in this work is that enhancing the motion estimation results leads to an enhancement of the SR results. Therefore, we establish a framework for a multi-frame SR method that utilizes segmentation-based optical flow estimation.
The method in the proposed framework is summarized as follows. First, the frames are interpolated so that the optical flow can be estimated with sub-pixel accuracy, and then the interpolated reference frame is segmented into arbitrarily shaped regions using the watershed transform [22].
The locally moving segments are then motion compensated using search. The next step is image alignment, where we fuse the motion-compensated frames to produce one blurred HR frame by using the L1-norm in the frame alignment step. The HR frame generated in this way is de-blurred by using a regularization-based restoration method.

II. PROBLEM DESCRIPTION
The problem of multi-frame SR is to estimate a high-resolution frame from observed successive LR frames. Assume that $N$ LR frames of the same scene, denoted by $Y_k$ ($1 \le k \le N$), each containing $M^2$ pixels, are observed, and that they are generated from the HR frame denoted by $X$, containing $L^2$ pixels, where $L \ge M$. The observation of the $N$ LR frames is modelled as follows:

$$ Y_k = D H F_k X + V_k, \qquad k = 1, \dots, N \tag{1} $$

where $F_k$, $H$ and $D$ are the motion, blurring, and down-sampling operators, respectively. The sizes of $F_k$, $H$ and $D$ are $L^2 \times L^2$, $L^2 \times L^2$ and $M^2 \times L^2$, respectively. $X$ is the unknown HR frame, $Y_k$ is the $k$-th observed LR frame, and $V_k$ is an additive random noise for the $k$-th frame with the same size as $Y_k$.
Throughout the paper, we assume that $D$ and $H$ are known and that the additive noise is Gaussian. The problem addressed in this paper is to estimate the motion for each frame, $F_k$, and to find the original image, $X$.
The assumption of a known and constant down-sampling operator is a practical one, since in resolution enhancement applications the enhancement factor is determined by the user, and the down-sampling operator is therefore known. Also, even if the blurring operator is not known in a practical application, it can be estimated by any blur estimation algorithm (for example, see [27]).

III. MODIFIED HORN-SCHUNCK OPTICAL FLOW ESTIMATION
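In the conventional Horn-Schunck optical flow estimation algorithm, an equation is derived that relates the change in image brightness at a point to the motion of the brightness pattern. Let the image brightness at the point $(x, y)$ in the image plane at time $t$ be denoted by $E(x, y, t)$. The brightness of a particular point in the pattern is constant, so that

$$ \frac{dE}{dt} = 0 \tag{2} $$

Then, by using the chain rule for differentiation, it becomes

$$ E_x u + E_y v + E_t = 0 $$

where $u = dx/dt$ and $v = dy/dt$ are the components of the optical flow and $E_x$, $E_y$ and $E_t$ are the partial derivatives of the brightness. To ensure smoothness of the optical flow, a smoothness measure term is added in the minimization process. The Laplacians of the x- and y-components of the flow are used as the smoothness measure function, so that the optical flow estimation problem is described as the minimization of

$$ \varepsilon^2 = \iint \left( \alpha^2 \varepsilon_c^2 + \varepsilon_b^2 \right) dx\, dy $$

where $\varepsilon_b = E_x u + E_y v + E_t$ is the brightness constancy error, $\varepsilon_c$ is the smoothness measure, and $\alpha$ is the weighting parameter that weights the importance between $\varepsilon_b$ and $\varepsilon_c$. This problem is solved iteratively as follows:

$$ u^{n+1} = \bar{u}^{n} - \frac{E_x \left[ E_x \bar{u}^{n} + E_y \bar{v}^{n} + E_t \right]}{\alpha^2 + E_x^2 + E_y^2} \tag{3} $$

and

$$ v^{n+1} = \bar{v}^{n} - \frac{E_y \left[ E_x \bar{u}^{n} + E_y \bar{v}^{n} + E_t \right]}{\alpha^2 + E_x^2 + E_y^2} \tag{4} $$

where $n$ is the iteration number and $\bar{u}^{n}$, $\bar{v}^{n}$ are the local averages of the flow components.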
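The iterative solution can be illustrated with a minimal NumPy sketch. It is not the implementation used in this paper: the derivative estimates (`np.gradient` on the mean of the two frames), the helper name `local_average`, and the default values of `alpha` and `n_iter` are assumptions chosen for brevity.

```python
import numpy as np

def local_average(f):
    """Weighted 3x3 neighbourhood average (borders replicated)."""
    p = np.pad(f, 1, mode="edge")
    direct = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    diagonal = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    return direct / 6.0 + diagonal / 12.0

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Per-pixel Horn-Schunck flow (u, v) from frame1 to frame2."""
    f1 = frame1.astype(float)
    f2 = frame2.astype(float)
    # Assumed derivative estimates: spatial gradients of the mean frame
    # and a plain temporal difference.
    Ey, Ex = np.gradient(0.5 * (f1 + f2))   # axis 0 is y, axis 1 is x
    Et = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha ** 2 + Ex ** 2 + Ey ** 2
    for _ in range(n_iter):
        u_bar = local_average(u)
        v_bar = local_average(v)
        # Shared factor of the two update equations.
        common = (Ex * u_bar + Ey * v_bar + Et) / denom
        u = u_bar - Ex * common
        v = v_bar - Ey * common
    return u, v
```

For two identical frames the temporal derivative vanishes and the estimated flow stays at zero, which is a quick sanity check for the update.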
The main modification of the Horn-Schunck algorithm is the incorporation of the segmentation information in the optical flow estimation, which is done in two stages. The two stages are described as follows.

A. Image Segmentation
Without loss of generality, we used the watershed transform for image segmentation in this work. The watershed transform is a region segmentation approach in which the image is supposed to take high gradient values in the neighbourhood of edges and low gradient values for interior pixels. The segmentation can be obtained by removing some of the weakest edges, which creates a number of lakes by grouping all the pixels that lie below a certain threshold. This can reduce the influence of noise and reduce the over-segmentation problem. It is then determined, for each pixel, in which direction rain would flow if it fell on the topographic activity surface. The segmentation process is done for the reference frame only, to reduce the computational complexity.
The main steps of the watershed segmentation algorithm are summarized as follows (see [22] for details):

B. Segmentation-based Horn-Schunck Algorithm
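Fig. 2 shows an example of a segmented image, where each colour represents a homogeneous region. Our assumption is that region discontinuities are motion discontinuities. Therefore, rather than estimating the optical flow for each pixel as in [23,24,25], or estimating a single optical flow for each block by assuming that all pixels in a certain block have the same motion [26], we assume that the motion of all pixels in each homogeneous region is the same. Therefore, we suggest modifying Eqs. (3) and (4) to be

$$ u_R^{n+1} = \operatorname*{median}_{x,y} \left\{ \bar{u}^{n} - \frac{E_x \left[ E_x \bar{u}^{n} + E_y \bar{v}^{n} + E_t \right]}{\alpha^2 + E_x^2 + E_y^2} \right\} \tag{6} $$

$$ v_R^{n+1} = \operatorname*{median}_{x,y} \left\{ \bar{v}^{n} - \frac{E_y \left[ E_x \bar{u}^{n} + E_y \bar{v}^{n} + E_t \right]}{\alpha^2 + E_x^2 + E_y^2} \right\} \tag{7} $$

where $(u_R, v_R)$ is the motion vector for all pixels in the region $R$, and $\operatorname{median}_{x,y}\{\cdot\}$ is the median value evaluation function. The median value is evaluated over all values of $x, y$ in the region $R$. We suggest using the median function for its robustness against noise.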
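The region-wise median can be sketched as a small NumPy function. It assumes that a per-pixel flow update `(u, v)` and an integer label map `labels` (one label per region, for example from the watershed segmentation) are already available; the function name `region_median_flow` is hypothetical. In the modified algorithm this step would be applied to the per-pixel update at every iteration.

```python
import numpy as np

def region_median_flow(u, v, labels):
    """Assign to every pixel the median flow of the region it belongs to."""
    u_reg = np.empty(u.shape, dtype=float)
    v_reg = np.empty(v.shape, dtype=float)
    for r in np.unique(labels):
        mask = labels == r
        # One motion vector per region; the median limits the
        # influence of noisy per-pixel estimates.
        u_reg[mask] = np.median(u[mask])
        v_reg[mask] = np.median(v[mask])
    return u_reg, v_reg

labels = np.array([[0, 0, 1],
                   [0, 1, 1]])
u = np.array([[1.0, 2.0, 10.0],
              [100.0, 20.0, 30.0]])   # 100.0 plays the role of an outlier
v = np.zeros_like(u)
u_reg, v_reg = region_median_flow(u, v, labels)
```

In this toy example the outlier in region 0 does not affect the regional estimate, which is 2.0, whereas a regional mean would be pulled far away from the two consistent values.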

IV. RESOLUTION ENHANCEMENT BASED ON MODIFIED OPTICAL FLOW ESTIMATION
In this section, we illustrate the method for optimizing a cost function that consists of the error between the simulated LR frames and the interpolated observed LR frames.
The proposed cost function incorporates the effect of local motion.

A. Cost Function
We start with the resolution enhancement problem, which can be described as an optimization problem as follows:

$$ \hat{X} = \arg\min_{X} J_1[X], \qquad J_1[X] = \sum_{k=1}^{N} \left\| H F_k X - \tilde{Y}_k \right\|_1 \tag{8} $$

where $\|\cdot\|_1$ is the L1-norm, which is used to measure the error in the cost function, and $\tilde{Y}_k$ is defined as the upsampled and interpolated frame obtained from the observed LR frame $Y_k$:

$$ \tilde{Y}_k = I\left( D^T Y_k \right) $$

where $I(\cdot)$ is the interpolation operator, which is defined on the missing pixel positions only.
On the other hand, the general traditional cost function, which comes directly from (8), is [2,4]

$$ J_2[X] = \sum_{k=1}^{N} \left\| D H F_k X - Y_k \right\|_1 \tag{9} $$

This cost function is so ill-posed that a regularization term has to be added in practice. Indeed, $J_1[X]$ and $J_2[X]$ are related, and $J_1[X]$ is less ill-posed than $J_2[X]$, as proved in [15]. The cost function $J_1[X]$ is modified as follows:

$$ J_3[X] = \sum_{k=1}^{N} \left\| H F_k X - \tilde{Y}_k \right\|_1 + \lambda \left\| C X \right\|_2^2 \tag{10} $$

where $C$ is a matrix that represents a Laplacian high-pass operator and $\lambda$ is the regularization parameter. The regularization term in (10) is added to address the ill-posedness of the inverse problem. The regularization term incorporates the smoothness property of the HR frame.

B. Two-Step Minimization of $J_3[X]$
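The reconstruction of the HR image ($X$) can be divided into two independent steps, namely, the fusion and restoration steps. The fusion problem is described as

$$ \hat{Z} = \arg\min_{Z} \sum_{k=1}^{N} \left\| Z - F_k^{-1} \tilde{Y}_k \right\|_1 \tag{11} $$

where $Z = HX$ is the blurred HR frame and $F_k^{-1} \tilde{Y}_k$ is the $k$-th registered (motion-compensated) frame. This can be obtained by median filtering as:

$$ \hat{Z} = \operatorname*{median}_{k} \left\{ F_k^{-1} \tilde{Y}_k \right\} \tag{12} $$

i.e., all the registered frames are fused by using the median operator. Then, the problem is reduced to a restoration problem as follows:

$$ \hat{X} = \arg\min_{X} \left\{ \left\| H X - \hat{Z} \right\|_1 + \lambda \left\| C X \right\|_2^2 \right\} \tag{14} $$

where $\hat{X}$ is the estimated version of $X$ (the HR frame). $\hat{X}$ can be obtained from $\hat{Z}$ by using a restoration step as follows.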
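The fusion step can be sketched in a few lines of NumPy, assuming the frames have already been upsampled, interpolated, and motion compensated onto the HR grid; the function name `fuse_median` is hypothetical. The pixel-wise median is the minimizer of a sum of absolute differences, which is why it appears as the solution of the L1 fusion problem.

```python
import numpy as np

def fuse_median(registered_frames):
    """Pixel-wise median of a list of equally sized registered frames."""
    stack = np.stack(registered_frames, axis=0)   # shape (N, rows, cols)
    return np.median(stack, axis=0)

frames = [
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    np.array([[100.0, 2.0], [3.0, -50.0]]),   # frame with registration outliers
]
fused = fuse_median(frames)
```

Here the two outlying pixels of the third frame are rejected, so `fused` equals the first two frames.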
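The steepest descent solution to the minimization problem in (14) is:

$$ \hat{X}_{n+1} = \hat{X}_{n} - \beta \, \nabla J\left( \hat{X}_{n} \right) \tag{15} $$

where $\nabla J$ is the gradient of the cost function minimized in (14) and $\beta$ is a scalar representing the step size in the direction of the gradient.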
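A steepest descent restoration loop can be sketched as follows. The sketch makes several assumptions that are not taken from this paper: the data term is quadratic (a smooth stand-in for the L1 term, so that a plain gradient step is well defined), boundaries are periodic so that the symmetric blur and Laplacian operators equal their own transposes, and the values of `lam`, `beta` and `n_iter` are arbitrary. All function names are hypothetical.

```python
import numpy as np

LAPLACIAN = np.array([[0.0, -1.0, 0.0],
                      [-1.0, 4.0, -1.0],
                      [0.0, -1.0, 0.0]])

def gaussian_kernel(size=5, variance=1.0):
    ax = np.arange(size) - size // 2
    g = np.exp(-ax ** 2 / (2.0 * variance))
    k = np.outer(g, g)
    return k / k.sum()

def circ_conv(img, kernel):
    """Circular 2-D convolution with a small odd-sized square kernel."""
    r = kernel.shape[0] // 2
    out = np.zeros(img.shape, dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * np.roll(img, (i - r, j - r), axis=(0, 1))
    return out

def cost(x, z, h, lam):
    """Assumed cost: ||Hx - z||^2 + lam * ||Cx||^2."""
    return (np.sum((circ_conv(x, h) - z) ** 2)
            + lam * np.sum(circ_conv(x, LAPLACIAN) ** 2))

def restore(z, h, lam=0.01, beta=0.2, n_iter=50):
    """Steepest descent on the assumed cost, started from the fused frame z."""
    x = z.astype(float).copy()
    for _ in range(n_iter):
        residual = circ_conv(x, h) - z
        # Gradient: 2 H^T (Hx - z) + 2 lam C^T C x, with H^T = H and C^T = C here.
        grad = (2.0 * circ_conv(residual, h)
                + 2.0 * lam * circ_conv(circ_conv(x, LAPLACIAN), LAPLACIAN))
        x = x - beta * grad
    return x
```

With these defaults the step size is below the reciprocal of the gradient's Lipschitz constant, so the assumed cost decreases at every iteration; a larger `lam` would require a smaller `beta`.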

V. EXPERIMENTAL RESULTS

A. Data Sets
Two different video sequences, Table Tennis (352×240, SIF format) and Football (352×240, SIF format), are used to evaluate the proposed algorithm. For these sequences, the YUV components are available. Moreover, we assume that the colour sequences are already demosaicked or captured by three CCD sensors. In addition, to test the proposed modification of the Horn-Schunck optical flow estimation algorithm, we test the optical flow estimation with two well-known sequences in the field of optical flow estimation, namely, the rotation and div sequences.

B. Experiment Setup
In the simulation, we assumed that the available sequences are HR sequences, and we generated LR sequences from them by applying the blurring and down-sampling operators and adding noise to the HR sequences. Then we used different SR algorithms to reverse these operations.
The LR frames were generated from the original HR video sequences according to the model in (1), where the frames were blurred by a Gaussian operator (5×5) with a variance equal to 1, down-sampled by a decimation factor of 2 in the horizontal and vertical directions, and distorted by additive white Gaussian noise with a 30 dB signal-to-noise ratio.
To demonstrate the efficiency of the proposed modification of the Horn-Schunck optical flow estimation algorithm, simulation results are presented for two different image sequences with different motion types, including rotation and div motions, in comparison with the traditional Horn-Schunck [23] and phase-based [25] optical flow estimation algorithms.
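The degradation described above can be sketched with NumPy alone. The border handling (edge replication) and the SNR definition (ratio of the variance of the clean LR frame to the noise variance) are assumptions, since the text does not specify them, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(size=5, variance=1.0):
    ax = np.arange(size) - size // 2
    g = np.exp(-ax ** 2 / (2.0 * variance))
    k = np.outer(g, g)
    return k / k.sum()

def blur(img, kernel):
    """Filter with a symmetric kernel; borders are edge-replicated (assumption)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img.astype(float), r, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def blur_and_decimate(hr, factor=2):
    """Apply the 5x5 Gaussian blur, then keep every `factor`-th pixel."""
    return blur(hr, gaussian_kernel(5, 1.0))[::factor, ::factor]

def add_noise(lr, snr_db=30.0, rng=None):
    """Add white Gaussian noise at the requested SNR (variance ratio, assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    noise_var = np.var(lr) / (10.0 ** (snr_db / 10.0))
    return lr + rng.normal(0.0, np.sqrt(noise_var), lr.shape)
```

A frame of size 352×240 would produce a 176×120 LR frame with these settings.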

C. Optical Flow Analysis
Optical flow estimation is one of the main contributions of this work. The efficiency of the proposed modification, which incorporates segmentation into gradient-based optical flow estimation, is demonstrated as follows.
Fig. 3 shows the results for an image sequence that includes rotation motion. Although the phase-based algorithm [25] estimates approximate motion for the moving parts, it fails to estimate the correct motion at the boundaries, as shown in Fig. 3a, where parts of the background are estimated to have high motion.
Fig. 3b shows the results of the conventional Horn-Schunck algorithm [23]. It also fails to estimate the correct motion at the boundaries, as can be seen from the motion estimated in parts of the background. This main problem of the Horn-Schunck algorithm is solved by the proposed modification, as shown in Fig. 3c.
Another example that demonstrates the efficiency of the proposed modification is shown in Fig. 4. This sequence contains div motion. The motion vectors obtained with the phase-based algorithm are scattered and inconsistent within the same region, which shows the failure of this algorithm on this sequence, as shown in Fig. 4a. Horn-Schunck suffers from erroneous motion vectors near the boundaries and in smooth areas, as shown in Fig. 4b. However, the proposed modification solves this problem, as shown in Fig. 4c.

D. Resolution Enhancement Results
Fig. 5 shows the results of SR reconstruction for the "Table Tennis" sequence. The segmented regions and the bicubic-interpolated frame are shown in Figs. 5a and 5b, respectively. In this figure, the HR frames estimated by incorporating different motion estimation algorithms are shown. From this figure, we can see that, as a global motion model, affine motion is not suitable for sequences that contain locally moving objects. This is clear from the disappearance of the ball, as shown in Fig. 5c.
In addition, although conventional optical flow can estimate the flow of each pixel, it is very sensitive to noise, which results in noisy edges, as shown in Fig. 5d, where the ball is deformed due to the effect of noise on the flow estimation process.
On the other hand, segmentation-based optical flow overcomes the problems of conventional optical flow by assuming a constant motion vector in an arbitrarily shaped region (as shown in Fig. 5e).
As another example, the results for the Football sequence are shown in Fig. 6. In this figure, the failure of the affine motion model is clearer, since this sequence contains multiple objects and faster motion, as shown in Fig. 6c. When optical flow is used, the effect of noise is clear at the edges, which are not sharp as shown in

VI. CONCLUSION
This paper makes two main contributions. First, we proposed a modification of the Horn-Schunck optical flow estimation algorithm to overcome the problems of conventional gradient-based optical flow estimation algorithms, namely the handling of un-textured regions and the estimation of correct flow vectors near motion discontinuities. We assumed that the motion of all pixels in each homogeneous region is the same. Second, we incorporated the proposed segmentation-based optical flow estimation into the resolution enhancement technique. The proposed algorithm overcomes some of the super-resolution problems for video sequences, such as the non-rigidity problem, since the assumption of general local motion allows a different motion vector to be estimated for different parts. The proposed algorithm gave promising results for low-resolution sequences with slow and fast motion.


1) Define the floating point activity as $|\nabla f| = \sqrt{f_x^2 + f_y^2}$, where $(f_x, f_y)$ is the image gradient.
2) Assign a label to each pixel position.
3) For each pixel position, find the weak edges, i.e., those for which the floating point activity value is less than a certain threshold ($\mu$) (the thresholding is done for the floating point activity in the positions ep, swp, sp and sep, as defined in Fig. 1).
4) Remove these weak edges by merging the corresponding labels.
5) To further merge regions and remove the effect of noise, for each pixel position find the direction of rainfalling among the eight directions (the direction of rainfalling is defined as the direction that has the smallest difference between the floating point activity of the central position (cp) and that of the other position (nwp, np, nep, wp, ep, swp, sp or sep)).
6) Finally, the pixels that have the same label belong to the same region.

An example of a segmented image is shown in Fig. 2. This figure shows the suitability of the watershed segmentation algorithm.
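The threshold-based part of the watershed steps (steps 1 to 4 and 6) can be sketched with a union-find structure. The sketch is a simplification under stated assumptions: an edge between a pixel and its ep, swp, sp or sep neighbour is treated as weak when the activity at both positions is below `mu`, and the rainfalling merge of step 5 is omitted. The function names are hypothetical; see [22] for the actual algorithm.

```python
import numpy as np

def find(parent, i):
    """Root of the set containing i, with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def segment_weak_edges(image, mu):
    fy, fx = np.gradient(image.astype(float))
    activity = np.sqrt(fx ** 2 + fy ** 2)            # step 1
    h, w = image.shape
    parent = np.arange(h * w)                        # step 2: one label per pixel
    neighbours = ((0, 1), (1, -1), (1, 0), (1, 1))   # ep, swp, sp, sep
    for y in range(h):
        for x in range(w):
            if activity[y, x] >= mu:
                continue
            for dy, dx in neighbours:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and activity[ny, nx] < mu:
                    # steps 3 and 4: weak edge found, merge the two labels
                    a = find(parent, y * w + x)
                    b = find(parent, ny * w + nx)
                    parent[a] = b
    roots = np.array([find(parent, i) for i in range(h * w)])
    _, labels = np.unique(roots, return_inverse=True)   # step 6
    return labels.reshape(h, w)

image = np.zeros((6, 8))
image[:, 4:] = 10.0           # two flat regions separated by a vertical edge
labels = segment_weak_edges(image, mu=1.0)
```

In this toy example the two flat halves receive two different labels, while the high-activity pixels along the edge remain as single-pixel regions; merging such leftovers is what the rainfalling step is meant to handle.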

Figure 1: A diagram showing the central pixel and its neighboring pixels.

Figure 2: An example of a segmented image from the rotation sequence.