A Robust Algorithm of Forgery Detection in Copy-Move and Spliced Images

The paper presents a new method to detect forgery by copy-move, splicing or both in the same image. Multiscale analysis, which limits the computational complexity, is used to check whether the image contains any counterfeit. After a one-level Discrete Wavelet Transform, the sharp edges, which are traces of cut-paste manipulation, appear as high frequencies and are detected in the LH, HL and HH sub-bands. A threshold is proposed to filter the suspicious edges, and a morphological operation is applied to reconstruct the boundaries of forged regions. If dilation produces no shape, or no sharp edges are highlighted, the image is not faked. For a forged image, if a region at another position is similar to the detected region, a copy-move is confirmed; if not, a splicing is detected. The features of the suspicious region are extracted using the Run Difference Method (RDM) to create a feature vector. Searching for regions having the same feature vector is called the detection phase. The algorithm, which applies multiscale analysis and a morphological operation to detect the sharp edges and RDM to extract the image features, is simulated in Matlab with high efficiency not only on copy-move or spliced images but also on images with both copy-move and splicing.
Keywords—Forgery detection (FD); Copy-Move; Discrete Wavelet Transform (DWT); Run Difference Method (RDM); Splicing; Sharpness


I. INTRODUCTION
Image forgery detection is attracting the attention of scientists in computer vision, digital image processing, biomedical technology, investigation, forensics, etc. With popular and sophisticated technologies and powerful software tools for digital images, it is difficult to confirm with the naked eye whether an image is original or not (see Figure 1). This challenges researchers to develop algorithms and propose methods to detect forgery in images. According to a survey of IEEE and Elsevier, the number of publications on image forgery detection since 2000 increased rapidly in 2010 and further in the following years [1]. An image can be faked by changing any of its characteristics, including brightness, darkness or other image parameters, or by hiding information. Watermarking and digital signatures are information security solutions in which a security code is inserted in the image, so these methods rely on knowledge of a code and the original image. The question is how to confirm authenticity when no code or signature has been inserted and no information about the original image is available. Blind/passive techniques, in which detection is performed on the tested image itself without any prior information, have been developed to solve this problem.
Fig. 1. Copy-move images from (a), (c), from [2]. (f), (h): spliced images from (e), (g) by Photoshop.
According to [3], blind/passive techniques are grouped into two kinds: copy-move and splicing. Copy-move is defined as cutting an image region and pasting it to another place in the same image, while splicing is understood as cutting an image region and pasting it into a different image. Based on this classification, the principle of forgery detection is to search for regions having similar features in copy-move images, or for completely different regions in spliced images. Many techniques have been proposed and used in this field, but they solve only the copy-move problem or the splicing problem separately. The datasets in previous publications usually consist of copy-move images or spliced images, not images containing both. This paper proposes a method which can detect forgery in images not only for copy-move or splicing but also for both. The literature review and the proposed method are presented in parts II and III. Simulation results and the conclusion are given in the following parts.
www.ijacsa.thesai.org

II. LITERATURE REVIEW
This part summarizes some recent methods related to image forgery detection, as an overview and as references from which researchers can draw new ideas and solutions. For copy-move detection, searching for similar regions is the main purpose of almost all methods, while searching for inconsistencies in features is the usual solution for splicing detection. Although numerous related methods have been published, most of them solve the copy-move or splicing problem separately, and only a few papers address both copy-move and splicing in the same image. Therefore, developing an algorithm to detect any forged region, not limited to copy-move or splicing, is still a challenge for scientists in the field of image forensics.

A. Copy-Move Forgery Detection
For copy-move detection, the survey in [3] covers and evaluates methods published until 2012 in which the duplicated regions are confirmed by comparing feature vectors. Feature vectors can be extracted directly from the tested images or after applying a transformation such as DWT or DCT. The methods differ in how features are extracted and how feature vectors are compared. Subsequently, a new method to extract image features by describing the spatial structure of the gray image texture, called Local Binary Pattern (LBP), was introduced by Leida Li et al. [4] in 2013. A color image should first be converted to a gray image using I=0.299R+0.587G+0.114B, and a low-pass filter should be applied to obtain the low-frequency features, which are more stable than the high-frequency ones. As in the previous methods, feature matching is decided by a threshold. Moreover, post-processing, including a specially designed filter and morphological operations, is also part of the detection process. The method is robust to JPEG compression, noise contamination, blurring, rotation and flipping. However, it has difficulty detecting regions rotated by general angles. Investigating invariant block features and selecting an appropriate feature dimension are suggested to improve robustness to arbitrary rotation.
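The grayscale conversion quoted above can be illustrated with a minimal Python sketch. The function names and the representation of an image as nested lists of (R, G, B) tuples are assumptions made for illustration only; they are not taken from [4].

```python
def rgb_to_gray(pixel):
    """Convert one (R, G, B) pixel to a gray level, I = 0.299R + 0.587G + 0.114B."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def image_to_gray(image):
    """Apply the conversion to an image stored as rows of (R, G, B) tuples (assumed layout)."""
    return [[rgb_to_gray(p) for p in row] for row in image]
```

Because the three weights sum to 1, a neutral pixel with R=G=B keeps its value after conversion.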
The use of the Undecimated Dyadic Wavelet Transform (UDWT) and Zernike moments (ZMs) was proposed as a new method to detect forgery in copy-move images by Jiyun Yang [5] in 2013. In this paper, UDWT is first applied to collect the low-frequency (LL) components. Traditional ZMs are then computed to produce feature vectors of overlapping blocks on LL, and the copied regions are found from these vectors. Lexicographical sorting is used to find similar vectors, and correlation coefficients with a threshold value are then used to isolate the exact forged blocks from the groups of similar vectors obtained in the sorting step.
Blur invariants are also used to produce feature vectors in copy-move image forgery detection [6]. Based on this idea, the LL sub-band from the DWT of an image using the Haar basis is divided into small overlapping blocks whose features are then represented by blur moment invariants. Each block feature vector consists of 24 blur invariants for grayscale images and 72 for RGB images, and its dimension is reduced by applying PCA. The block similarity analysis detects the duplicated regions by considering Euclidean distances and a user-defined threshold. The method is applied to images with noise, blur and contrast changes. Applying another basis or DCT is suggested for future research.
In [7], an image is decomposed into four sub-bands using DWT, of which the LL sub-band is used in the subsequent steps. The proposed algorithm applies SIFT to each small overlapping block of the LL sub-band to extract a feature vector. These feature vectors are used to create a descriptor vector and are compared to detect whether there is a copy-move manipulation in the image. This method was tested on the MICC-F200 database, with high accuracy and low running time, and is robust to scale and rotation. The authors of this algorithm later used SURF (Speeded-Up Robust Features) instead of SIFT, as in [7], to extract the features of image blocks. Combinations of SURF with DWT and DyWT are also presented. According to the results of the proposed method, SURF is faster than SIFT, while SIFT is mostly used to select the invariant features [8].

B. Splicing Forgery Detection
Splicing is more complex than copy-move, not only in the forgery manipulation but also in detection. The key idea of many splicing detection methods is to search for regions that are inconsistent with camera characteristics or image features. Regions which are resampled or double compressed, and those with blur discrepancies or sharpness differences, can be considered traces of splicing. However, because of the variety of splicing, more and more algorithms have been developed in recent years.
The Conditional Co-occurrence Probability Matrix (CCPM) is used to detect splicing in images based on third-order statistical features [9]. CCPM contains discriminative information that is included in higher-order statistical features and is independent of the image features. However, the higher the dimensionality of the features, the more complex the computation. Therefore, Principal Component Analysis (PCA) is also used to reduce the computational complexity of the proposed method, which is robust and performs better than Markov features both in the spatial domain and in the block discrete cosine transform (BDCT) domain.
Rescaling and its factor are used to detect forgery caused by splicing [10]. A region copied from an image is usually resized or scaled before being pasted into the destination image. Scaling makes the pasted portion resampled and inconsistent. In addition, properties of the zero-crossings of the second difference are used to calculate the scaling factor under different interpolation schemes. The rescale detection and estimation algorithm is described in five steps, including pre-processing to convert the RGB image to grayscale and extract the Y component of the YCbCr conversion, and calculating the second difference, its zero-crossings and the Discrete Fourier Transform (DFT) before searching for periodicity and detecting peaks.
Differences in JPEG compression within an image can be caused by splicing [11]. JPEG forgery detection is based on the 8x8 block Discrete Cosine Transform (DCT) and detects shifts in DCT block alignment. Splicing detection was proposed by analyzing, and suggesting solutions for, the cases that create differences in compression history, including detection of Aligned Double JPEG, Non-Aligned Double JPEG, the Primary Quantization Table and JPEG ghosts.
Illumination inconsistencies and intrinsic resampling properties are also used to detect splicing [12]. The first approach requires an input image and a database for training. The algorithm begins with 30x30 blocks, which are transformed into an opponent color space, HSV, before features of contrast and mean are extracted. The contrast is calculated from the standard deviation, while the mean is obtained by computing the average grey level. These features determine the suitable algorithm. Illumination color estimation, illumination map creation, wavelet-based feature extraction and a classifier are the subsequent steps of the method based on detecting illumination differences. The second solution in that paper is a resampling detection scheme that includes the second difference in the horizontal or vertical direction, the Radon transform, the FFT of the covariance, high-pass filtering, feature extraction and a classifier.

C. Forgery Detection for both Copy-Move and Splicing
An integrated technique which combines DCT and Speeded Up Robust Features (SURF) to detect image forgery by copy-move or splicing was proposed in 2011 [13]. This means the tested images can be arbitrary and need not be classified as copy-move or splicing in advance. The paper finds new traces based on recompression to detect counterfeits in recompressed images. Periodicity analysis of the double compression effect in both the spatial and DCT domains is applied before the SURF descriptor is used to counter variations in rotation and scaling. The proposed method located the forged regions efficiently in both copy-move and spliced images and, in particular, distinguished the positions of the original and forged regions.
At EUROCON 2013 in Croatia, a method that can detect both copy-move and splicing in images using a multi-resolution Weber Local Descriptor (WLD) was presented [14]. The algorithm first converts an RGB image into YCbCr so that the WLD can extract features from the chrominance components, which are less sensitive than luminance. To extract the features effectively, WLD is expressed by two components, differential excitation and orientation, which are based on Weber's law. The multi-resolution WLD histogram is composed of the histograms of three neighborhoods, (8,1), (16,2) and (24,3), where the first argument is the number of neighboring pixels and the second is the radius of the neighbors from the center pixel. A support vector machine, which involves training and testing images, is used for classification.

III. PROPOSED METHOD
The proposed method consists of two phases: sharp edge detection and copy-move/splicing detection. Before presenting the proposed method, this part briefly reviews the related theory, including multiscale analysis using DWT, edge detection, dilation and RDM, of which the first three [15] are used in sharpness detection and the last is used for feature extraction in copy-move/splicing detection.

A. Multiscale using DWT
With a 2D image $f(x,y)$, the two-dimensional DWT involves one separable scaling function $\varphi(x,y)$ and three separable, directionally sensitive wavelets $\psi^{H}(x,y)$, $\psi^{V}(x,y)$ and $\psi^{D}(x,y)$, corresponding to variations along the horizontal edges, vertical edges and diagonals, respectively. These functions are defined in (1), (2), (3) and (4).
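For reference, the separable forms usually given for these four functions in the wavelet literature are sketched below in terms of a one-dimensional scaling function $\varphi$ and wavelet $\psi$. That these coincide term by term with (1) to (4) is an assumption based on the standard textbook treatment.

```latex
\begin{align}
\varphi(x,y)   &= \varphi(x)\,\varphi(y) \tag{1}\\
\psi^{H}(x,y)  &= \psi(x)\,\varphi(y)    \tag{2}\\
\psi^{V}(x,y)  &= \varphi(x)\,\psi(y)    \tag{3}\\
\psi^{D}(x,y)  &= \psi(x)\,\psi(y)       \tag{4}
\end{align}
```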
In DWT, a scaling function is used to create a series of approximations of an image, each differing from its nearest neighboring approximations by a factor of 2 in resolution, while the wavelets encode the differences in information between adjacent approximations. The scaled and translated basis functions are defined by (5) and (6) for all $j, k \in \mathbb{Z}$ and $m = n = 0, 1, 2, \ldots, 2^{j}-1$. In (6), $i = \{H, V, D\}$ identifies the directional wavelets from (2), (3) and (4). The discrete wavelet transform of an image $f(x,y)$ of size $M \times N$ is then obtained by defining the approximation and directional coefficients as in (7) and (8).
where $j_0$ is an arbitrary scale, $W_{\varphi}(j_0,m,n)$ are the approximation coefficients of the image $f(x,y)$ at scale $j_0$, and $W^{i}_{\psi}(j,m,n)$ are the coefficients that add the horizontal, vertical and diagonal details for scales $j \ge j_0$.
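The standard forms of the scaled and translated basis functions and of the transform coefficients are commonly written as follows. This is a sketch under the usual textbook definitions; that it matches (5) to (8) exactly is an assumption.

```latex
\begin{align}
\varphi_{j,m,n}(x,y) &= 2^{j/2}\,\varphi\!\left(2^{j}x - m,\; 2^{j}y - n\right) \tag{5}\\
\psi^{i}_{j,m,n}(x,y) &= 2^{j/2}\,\psi^{i}\!\left(2^{j}x - m,\; 2^{j}y - n\right),
  \quad i \in \{H, V, D\} \tag{6}\\
W_{\varphi}(j_0,m,n) &= \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1}\sum_{y=0}^{N-1}
  f(x,y)\,\varphi_{j_0,m,n}(x,y) \tag{7}\\
W^{i}_{\psi}(j,m,n) &= \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1}\sum_{y=0}^{N-1}
  f(x,y)\,\psi^{i}_{j,m,n}(x,y) \tag{8}
\end{align}
```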
After applying DWT, an image is decomposed into approximation, horizontal, vertical and diagonal parts (see Figure 2). The edges are high frequencies, which are collected as details in parts II, III and IV of Figure 2. Since the proposed method reduces the image size by half, a one-level DWT is applied. The filter bank that creates the one-level analysis is shown in Figure 3.
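A one-level decomposition of this kind can be sketched in plain Python. The Haar basis, the function name `haar_dwt2` and the LH/HL naming convention used in the comments are assumptions for illustration; the paper does not state which wavelet its Matlab simulation uses.

```python
def haar_dwt2(image):
    """One-level 2D Haar DWT of an image (list of rows) with even height and width.

    Returns (LL, LH, HL, HH); each sub-band has half the height and width of the input.
    """
    rows, cols = len(image), len(image[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]          # top row of the 2x2 block
            c, d = image[i + 1][j], image[i + 1][j + 1]  # bottom row of the 2x2 block
            ll_row.append((a + b + c + d) / 2.0)  # approximation
            lh_row.append((a + b - c - d) / 2.0)  # row difference: horizontal edges
            hl_row.append((a - b + c - d) / 2.0)  # column difference: vertical edges
            hh_row.append((a - b - c + d) / 2.0)  # diagonal detail
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh
```

On an image made of alternating dark and bright rows, only the LL and LH outputs are non-zero, which shows how a horizontal edge ends up in a single detail sub-band.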

B. Edge Detection
Sharp edges can be traces of information pasted from another region. Therefore, edge detection is the first step in searching for suspicious regions, and the regions whose edges have the highest sharpness are collected, considered and tested [16]. The Laplacian operator is applied to the three sub-bands LH, HL and HH, by convolving each sub-band with a 3x3 Laplace kernel (see Figure 4), so that only edges are selected for the further processing steps.
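The convolution step can be sketched as follows. The 4-neighbor kernel shown is one common 3x3 Laplace kernel and is an assumption, since the kernel of Figure 4 is not reproduced here; zero padding at the borders and the name `convolve3x3` are likewise illustrative choices.

```python
# One common 3x3 Laplace kernel (assumed; the kernel in Figure 4 may differ).
LAPLACE_KERNEL = [[0, 1, 0],
                  [1, -4, 1],
                  [0, 1, 0]]


def convolve3x3(band, kernel=LAPLACE_KERNEL):
    """Convolve a sub-band (list of rows) with a 3x3 kernel, treating outside pixels as zero."""
    rows, cols = len(band), len(band[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < rows and 0 <= x < cols:
                        # Convolution flips the kernel relative to the neighborhood.
                        total += kernel[1 - di][1 - dj] * band[y][x]
            out[i][j] = total
    return out
```

The response is zero inside flat areas and large in magnitude where the sub-band changes abruptly, which is the property the sharpness test relies on.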

C. Dilation for Filling Gaps
Ordinarily, the borders at pasting positions are smoothed with Photoshop or other software tools, so not all edges are detected continuously in LH, HL and HH. Therefore, dilation is proposed to bridge the gaps and make the boundary smooth, which makes the forged regions easier to locate.
The dilation of two sets $A$ and $B$ in $Z^2$ is defined in (9), or in another form as in (10), where $\hat{B}$ is the reflection of $B$ and $(\hat{B})_z$ is the shift of $\hat{B}$ by $z$.
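In standard set notation this definition is commonly written as $A \oplus B = \{ z \mid (\hat{B})_z \cap A \neq \emptyset \}$; that this is the exact form of (9) is an assumption. A minimal Python sketch of it on a binary mask follows, where the cross-shaped structuring element is hypothetical and stands in for the element of Fig. 5.

```python
# Hypothetical structuring element: a 3x3 cross given as (dy, dx) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def dilate(mask, element=CROSS):
    """Binary dilation: keep z when the reflected element shifted to z hits the foreground."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for zy in range(rows):
        for zx in range(cols):
            for dy, dx in element:
                # Points of the reflected element shifted by z are z - b for b in B.
                y, x = zy - dy, zx - dx
                if 0 <= y < rows and 0 <= x < cols and mask[y][x]:
                    out[zy][zx] = 1
                    break
    return out
```

Applied to two foreground pixels separated by a one-pixel gap, the output joins them into one connected segment, which is the gap-bridging effect described in this section.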
By taking each of the LH, HL and HH sub-bands as A and the structuring element defined in Fig. 5 as B, dilation is applied to repair gaps in boundaries which may be traces of cutting and pasting.

D. Extract Features using Run Difference Method
The Run Difference Method (RDM) [17] is a feature extraction method in which the size and prominence of texture elements are considered. From the distribution of gray-level difference (DGD), RDM calculates five features: large difference emphasis, sharpness, the second moment of DGD, the second moment of the distribution of the average gray-level difference (DOD) and long distance emphasis.
With a rectangular gray image F in a domain D of the 2-dimensional image plane, the relationship between F and D can be defined as in (11) and (12).
where $N_x$ and $N_y$ are the horizontal and vertical dimensions of $F$, $n_g$ is the number of gray levels in $F$ and $I$ is the set of integers.
Let $d$ be the displacement vector between two pixels $(x_1,y_1)$ and $(x_2,y_2)$, as in (13). $d$ can be expressed with distance $r$ and direction $\theta$ in polar coordinates as in (14). The Run Difference Matrix is defined as a function of $r$ and the gray-level difference for a given direction $\theta$ in (15), where # denotes the cardinality. The denominator $N$ is a normalization factor equal to the total number of paired pixels.
From the Run Difference Matrix as in Figure 6, three vectors, the Distribution of Gray Level Difference (DGD), the Distribution of Average Difference (DOD) in each row of the RDM and the Distribution of Average Distance (DAD), are defined in (17), (18) and (19), respectively.
where c is the maximum distance of r.
The five parameters derived from these vectors are taken as the five features of an image and are used for image feature extraction.
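The construction described in this section can be sketched in Python for the horizontal direction ($\theta = 0$). Several choices here are assumptions, not taken from [17]: normalizing by the total number of pairs over all distances, the marginal and weighted-sum forms used for DGD, DOD and DAD, and all function names. The five scalar features are not computed because their formulas are not given in the text.

```python
def run_difference_matrix(image, max_r, levels):
    """RDM for theta = 0: entry [r-1][g] is the fraction of all counted pixel pairs
    that lie r pixels apart on a row and differ by g gray levels (assumed normalization)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(max_r)]
    total = 0
    for r in range(1, max_r + 1):
        for y in range(rows):
            for x in range(cols - r):
                g = abs(image[y][x] - image[y][x + r])
                counts[r - 1][g] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def rdm_distributions(rdm):
    """Return (DGD, DOD, DAD) under the assumed definitions stated in the comments."""
    max_r, levels = len(rdm), len(rdm[0])
    # DGD(g): probability of gray-level difference g, summed over all distances.
    dgd = [sum(rdm[r][g] for r in range(max_r)) for g in range(levels)]
    # DOD(r): average gray-level difference carried by row r of the matrix.
    dod = [sum(g * rdm[r][g] for g in range(levels)) for r in range(max_r)]
    # DAD(g): average distance carried by gray-level difference g.
    dad = [sum((r + 1) * rdm[r][g] for r in range(max_r)) for g in range(levels)]
    return dgd, dod, dad
```

For a block to be usable, its width must exceed 1 so that at least one pixel pair is counted; `max_r` plays the role of the maximum distance c.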

E. Proposed Method
The paper proposes a method that not only detects forgery but also identifies the forgery manipulation, whether copy-move, splicing or both, in an arbitrary image without any prior information about the original image. In addition, the method can detect more than one forged region in an image. The flowchart of the proposed method, which consists of edge detection to confirm forgery and similar region searching to identify the forgery manipulation, is split across Figure 7 and Figure 8. In the first phase, shown in Figure 7, a color image is converted to grayscale before a one-level DWT decomposition is applied. As edges are expressed by high frequencies, the three sub-bands LH, HL and HH are used to detect the edges. A real image contains many edges and boundaries, so the edges caused by pasting must be selected. The threshold is set according to the texture and layout of each image, and ranges from 50% to 80% of the maximum sharpness. The edges that remain after sharpening and thresholding in all three high-frequency sub-bands are dilated to reconstruct edges or boundaries. To detect the cut/pasted parts, the low frequencies in the LL sub-band are ignored by setting them to zero. Therefore, the inverse discrete wavelet transform (IDWT) of these four sub-bands yields an image with only edges and boundaries. If there is any feasible shape enclosed by edges, it is caused by pasting, so a counterfeit is confirmed. The number of completed shapes is the number of forged regions. Otherwise, the image is original.
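The thresholding step can be sketched as follows. Interpreting "maximum sharpness" as the largest absolute coefficient of a Laplacian-filtered sub-band is an assumption, as are the function name and the choice to zero the rejected coefficients.

```python
def threshold_edges(band, fraction):
    """Keep coefficients whose magnitude is at least fraction * max magnitude; zero the rest.

    `fraction` corresponds to the 50%-80% range mentioned in the text (0.5 to 0.8).
    """
    peak = max(abs(v) for row in band for v in row)
    cutoff = fraction * peak
    return [[v if abs(v) >= cutoff else 0 for v in row] for row in band]
```

A higher fraction keeps fewer edges, so images with busy textures would be given a value nearer 0.8 and plain scenes a value nearer 0.5.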
For every faked part, as shown in Figure 8, the copy-move or splicing manipulation is confirmed by feature similarity detection. Blob detection is applied to determine the size mxn of the forged region. By dividing the tested MxN image into overlapping mxn blocks, (M-m+1)(N-n+1) feature vectors are created using the Run Difference Method (RDM). The detection algorithm uses RDM to extract five features of the faked parts and searches for regions having similar features. The result falls into one of three cases: (i) copy-move, if there is at least one other place having features similar to the faked one; (ii) splicing, if there is no similar region; (iii) both copy-move and splicing, or more than one copy-move region, if there are at least two forged regions, where copy-move is determined as in (i) and splicing as in (ii).
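The block enumeration and the decision between cases (i) and (ii) can be sketched as below. The Euclidean distance test with tolerance `tol`, the exclusion of blocks that overlap the suspicious region, and all names are assumptions made for illustration; the paper does not specify its similarity criterion.

```python
import math


def block_origins(M, N, m, n):
    """Top-left corners of all overlapping m x n blocks in an M x N image."""
    return [(i, j) for i in range(M - m + 1) for j in range(N - n + 1)]


def classify_region(suspect, features, block_size, tol):
    """Return 'copy-move' if a non-overlapping block has a feature vector within
    distance tol of the suspect block's vector, otherwise 'splicing'.

    `features` maps a block origin (i, j) to its feature vector (e.g. five RDM features).
    """
    m, n = block_size
    si, sj = suspect
    target = features[suspect]
    for (i, j), vec in features.items():
        if abs(i - si) < m and abs(j - sj) < n:
            continue  # overlaps the suspect block (or is the block itself)
        if math.dist(target, vec) <= tol:
            return "copy-move"
    return "splicing"
```

Skipping overlapping blocks matters because a block shifted by one pixel is almost identical to the suspicious one and would otherwise always produce a match. Case (iii) follows by calling `classify_region` once per forged region.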

IV. SIMULATION RESULT
The proposed algorithm is run in Matlab 2013 on a PC with an Intel(R) Core™ i5-2400 CPU @ 3.10 GHz and 4 GB of RAM. The algorithm uses a one-level DWT to locate suspicious regions based on the edges with high sharpness in the three sub-bands LH, HL and HH. Copy-move, splicing or both manipulations in an image are detected by searching for regions similar to the suspicious regions. The test images for copy-move forgery are collected from the benchmark data of the research group in [2]. The datasets for splicing and for combined copy-move and splicing are natural pictures forged with Photoshop. Some results obtained with the proposed method are shown in Figure 9.

Evaluation
The proposed method is evaluated on three different datasets consisting of copy-move images, spliced images, and images containing both copy-move and splicing. For copy-move images, the proposed algorithm is compared at image level with Zernike moments [2], Undecimated Dyadic Wavelet Transform and Zernike Moments [18], and Discrete Wavelet Transform and Modified Zernike Moments [19], using some images from benchmark_data [2], with the results given in Table 1.
Three parameters, precision (p), recall (r) and F1, are used to evaluate the feasibility of the proposed method; they are defined in (25), (26) and (27) [2]. Precision is the probability that a detected forgery is truly a forgery, while recall is the probability that a forged image is detected. F1 combines both precision and recall. To compare the efficiency of related methods, these parameters are calculated at image level and also over the set of copy-move images.
where $T_P$, $F_P$ and $F_N$ are the numbers of true forged pixels, false forged pixels and missed forged pixels, respectively.
The efficiency of the proposed method for images with splicing or with both copy-move and splicing, which are forged by Photoshop, is shown in Fig. 9(f), Fig. 9(h) and Fig.
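The three measures are usually computed as $p = T_P/(T_P+F_P)$, $r = T_P/(T_P+F_N)$ and $F_1 = 2pr/(p+r)$; that these are the exact forms of (25) to (27) is an assumption based on their standard definitions. A minimal Python sketch, with a hypothetical function name and assuming at least one true positive:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true positive, false positive
    and false negative counts (assumes tp > 0 so no denominator is zero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with 9 true positives, 1 false positive and 3 false negatives, precision is 0.9, recall is 0.75 and F1 is about 0.818.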

V. CONCLUSION
The paper proposes a method to detect forgery manipulations in images, including copy-move, splicing or both. A counterfeit is first identified from the sharpness of edges and boundaries, which are traces of cutting and pasting and appear as high frequencies in the three sub-bands LH, HL and HH of a one-level DWT decomposition. When a fake is confirmed, the suspicious regions become the objects to be considered. Through blob detection, the size of the suspicious parts is determined, and searching for other places having similar RDM features classifies the forgery as copy-move, splicing or both. The novelty of the proposed method is that the tested images can be arbitrary instead of being limited to copy-move or splicing. To evaluate the efficiency and feasibility of our method, the algorithm is tested on three different kinds of images, of which the first is copy-move images from benchmark_data [2] and the other two are spliced images and copy-move/spliced images made with Photoshop, with good results. Applying a Canny filter with suitable coefficients instead of the morphological operation, to limit changes in the energy of images, can be considered in future research.

Fig. 6. Run Difference Matrix

Based on the three vectors DGD, DOD and DAD, five features, Large Difference Emphasis (LDE), Sharpness (SHP), Second Moment of DGD (SMG), Second Moment of DOD (SMO) and Long Distance Emphasis (LDEL), are also defined.

TABLE I. RESULTS FOR COPY-MOVE IMAGE DETECTION AT IMAGE LEVEL (%) IN CASE OF COPY-MOVE AT ONE PLACE AND FEW PLACES