Quantization Table Estimation in JPEG Images

Abstract. Most digital image forgery detection techniques require the image in question to be uncompressed and of high quality. However, most image acquisition and editing tools use the JPEG standard for image compression. The histogram of Discrete Cosine Transform (DCT) coefficients contains information on the compression parameters for JPEGs and previously compressed bitmaps. In this paper we present a straightforward method to estimate the quantization table from the peaks of the histogram of DCT coefficients. The estimated table is then used with two distortion measures to classify images as untouched or forged. Testing the procedure on a large set of images gave a reasonable average estimation accuracy of 80%, which increases up to 88% with increasing quality factors. Forgery detection tests on four different types of tampering resulted in average false negative rates of 7.95% and 4.35% for the two measures, respectively.


I. INTRODUCTION
Due to the nature of digital media and the advanced image processing techniques provided by image editing software, adversaries may now easily alter and repackage digital content, forming an ever rising threat in the public domain. Hence, ensuring that media content is credible and has not been "retouched" is becoming an issue of eminent importance for both governmental security and commercial applications. As a result, research is being conducted on authentication methods and tamper detection techniques. Active authentication methods mainly include digital watermarking and digital signatures, while passive methods tend to exploit inconsistencies in the natural statistics of digital images that occur as a result of manipulation.
JPEG is the most widely used image format, particularly in digital cameras, due to its compression efficiency, and JPEG images may require special treatment in image forensics applications because of the effects of quantization and data loss. JPEG compression usually introduces blocking artifacts, and hence one of the standard approaches is to use inconsistencies in these blocking fingerprints as a reliable indicator of possible tampering [1]. These can also be used to determine what method of forgery was used. Many passive schemes have been developed based on these fingerprints to detect re-sampling [2] and copy-paste [3,4]. Other methods try to identify bitmap compression history using Maximum Likelihood Estimation (MLE) [5,6], by modeling the distribution of quantized DCT coefficients, for example with Benford's law [7], or by modeling acquisition devices [8]. Image acquisition devices (cameras, scanners, medical imaging devices) are configured differently in order to balance compression and quality. As described in [9,10], these differences can be used to identify the source camera model of an image. Moreover, Farid [11] describes JPEG ghosts as an approach to detect parts of an image that were compressed at lower qualities than the rest of the image and uses it to detect composites.
In this paper we present a straightforward method for estimating the quantization table of single JPEG compressed images and bitmaps. We verify the observation that, when error terms are ignored, the maximum peak of the approximated histogram of a DCT coefficient matches the quantization step for that coefficient. This can help in determining compression history, i.e. whether the bitmap was previously compressed and which quantization table was used, which is particularly useful in applications like image authentication, artifact removal, and recompression with less distortion.
After estimating the quantization table, both an average distortion measure and a blocking artifact measure are calculated based on the estimated table to verify the authenticity of the image.
All simulations were done on images from the UCID [12]. Performance in estimating Q for single JPEG images was tested against two techniques that are relevant in how the quantization steps are acquired: MLE [5,6] and power spectrum [1]. The other techniques mentioned above (e.g. Benford's law) are reported to work on bitmaps. An investigation of performance on previously compressed bitmaps can be found in [13]. The rest of the paper is organized as follows. In section 2 we begin with a brief review of the JPEG baseline procedure and then show how the quantization steps can be determined from the peaks of the approximated histogram of DCT coefficients. We also present the two distortion measures used in the evaluation. Testing and performance evaluation are discussed in section 3, where we demonstrate the use of the estimated quantization table with the distortion measures in classifying test images and exposing forged parts. Finally, section 4 presents the conclusions.

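II. A STRAIGHTFORWARD APPROACH FOR QUANTIZATION TABLE ESTIMATION IN JPEG IMAGES
The JPEG standard baseline compression for color photographs consists of four lossy transform steps and yields a compressed stream of data:
(1) RGB to YCbCr color space conversion.
(2) CbCr subsampling.
(3) Discrete Cosine Transform (DCT) of 8×8 pixel blocks.
(4) Quantization:
$$X(i,j) = \operatorname{round}\!\left(\frac{D(i,j)}{Q(i,j)}\right) \tag{1}$$
where at frequency (i,j), D is the DCT coefficient, Q is the (i,j)th entry in the quantization table, and X is the resulting quantized coefficient.
Decompression reverses these steps:
(1) Dequantization: X_q(i,j) = X(i,j) Q(i,j).
(2) Inverse DCT (IDCT) of 8×8 blocks.
(3) CbCr interpolation.
(4) YCbCr to RGB color space conversion.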
One of the most useful aspects in characterizing the behavior of JPEG compressed images is the histogram of DCT coefficients, which typically has a Gaussian distribution for the DC component and a Laplacian distribution for the AC components [5,6]. The quantized coefficients are recovered, in step (1) of the dequantizer above, as multiples of Q(i,j). Specifically, if X_q(i,j) is a dequantized coefficient in the DCT domain, it can be expressed as kQ(i,j), where Q(i,j) is the (i,j)th entry of the quantization table and k is an integer. Q(i,j) could be estimated directly from the histogram of X_q(i,j), but X_q(i,j) is an intermediate result and is discarded after decompression. Theoretically, X_q(i,j) can be recalculated as DCT(IDCT(X_q(i,j))), since the IDCT is reversible. In reality, however, the DCT of an image block usually generates X*(i,j), which is not exactly X_q(i,j) but an approximation of it. In our experiments, we show that Q(i,j) can also be determined directly from the histogram of X*(i,j). Fig. 1(a) and (b) show typical absolute discrete histograms of X_q(3,3) and X*(3,3), respectively, for all blocks of an image. Nonzero entries occur mainly at multiples of Q(3,3) = 10.
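The relationship between X_q(i,j) and X*(i,j) can be illustrated with a small round trip. The following is a minimal sketch, not the authors' implementation: it assumes the standard 8×8 JPEG DCT normalization, works on level-shifted pixels (range -128 to 127), uses 0-based indices, and the toy block and the names `dct2`, `idct2`, and `round_and_clip` are hypothetical.

```python
import math

N = 8  # JPEG block size


def _c(k):
    """DCT normalization factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def _basis(u, i):
    """One-dimensional DCT basis value for sample u and frequency i."""
    return math.cos((2 * u + 1) * i * math.pi / (2 * N))


def dct2(block):
    """Forward 8x8 DCT (pixel block -> coefficients)."""
    return [[0.25 * _c(i) * _c(j) * sum(block[u][v] * _basis(u, i) * _basis(v, j)
                                        for u in range(N) for v in range(N))
             for j in range(N)] for i in range(N)]


def idct2(coef):
    """Inverse 8x8 DCT (coefficients -> pixel block)."""
    return [[0.25 * sum(_c(i) * _c(j) * coef[i][j] * _basis(u, i) * _basis(v, j)
                        for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]


def round_and_clip(block, lo=-128, hi=127):
    """Rounding and clipping of level-shifted pixels (the two IDCT error sources)."""
    return [[min(hi, max(lo, round(p))) for p in row] for row in block]


# Toy dequantized block (an assumption for illustration): every entry is k * Q, Q = 10.
Q_STEP = 10
x_q = [[0] * N for _ in range(N)]
x_q[2][2] = 3 * Q_STEP
x_q[0][1] = -2 * Q_STEP

pixels = round_and_clip(idct2(x_q))   # what a decoder would store
x_star = dct2(pixels)                 # what the analyst can recompute
max_error = max(abs(x_star[i][j] - x_q[i][j]) for i in range(N) for j in range(N))
recovered = Q_STEP * round(x_star[2][2] / Q_STEP)
```

In this toy case no pixel is clipped, so `x_star` differs from `x_q` only by accumulated rounding error, and snapping `x_star[2][2]` to the nearest multiple of the step recovers the original coefficient.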
There are two main sources of error introduced during the IDCT calculation, namely rounding and clipping, which keep the pixel levels integral and within the range of a typical 8-bit image (0-255). The decaying envelopes of the histograms in Fig. 1 are roughly Gaussian, although they have shorter tails because the approximation error Γ = X*(i,j) − X_q(i,j) is limited. The reason, according to [6], is that as the rounding error for each pixel does not exceed 0.5, the total rounding error is bounded by
$$|\Gamma| \le B(i,j) = \frac{c(i)\,c(j)}{4} \sum_{u=0}^{7}\sum_{v=0}^{7} 0.5 \left|\cos\frac{(2u+1)i\pi}{16}\,\cos\frac{(2v+1)j\pi}{16}\right|$$
where c(ξ) = 1/√2 for ξ = 0 and c(ξ) = 1 otherwise, with i and j counted from 0 in this expression. So, Γ can be modeled as a truncated Gaussian distribution in the range ±B and zero outside that range [6].
Now if we closely observe the histogram of X*(i,j) outside the main lobe (zero and its proximity), we notice that the maximum peak occurs at a value equal to the quantization step used to quantize X_q(i,j). This means that rounding errors have less significance and can be ignored in the estimation. On the other hand, clipping or truncation errors are more significant and cannot be compensated for. Hence, in our experiments, we leave out saturated blocks when creating the histogram.
Fig. 2 shows a test image compressed with quality factor 80, and the corresponding quantization table. Fig. 3(a) and (b) show H, the absolute histograms of DCT coefficients of the image from Fig. 2(a), at frequencies (3,3) and (3,4), respectively. Notice that the maximum peak for (3,3) occurs at 6, which is equal to Q(3,3). Also for (3,4), the highest peak is at value 10, which corresponds to the (3,4)th entry of the quantization table. Because in JPEG compression the brightness (the DC coefficient) and the shading across the tile (the 3 lowest AC coefficients) must be reproduced fairly accurately, there is enough information in the histogram data to recover Q(i,j) for low frequencies. We have verified that the highest peak outside the main lobe corresponds to Q(i,j) for all low frequency coefficients.
Since the texture (middle frequency AC coefficients) can be represented less accurately, the histograms of these frequencies may not be suitable candidates for our observation. Indeed, we found that, for the coefficients highlighted in gray in Fig. 2(b), the maximum peak occurs at a value that does not match the specific quantization step. However, we investigated peaks in the parts of H where the error is minimal, i.e. outside ±B, and concluded that the observation still applies under a condition: the maximum peak above B (that is, when |X*(i,j)| > B) occurred at a value matching Q(i,j), as shown in Fig. 3(c) and (d).
For a particular frequency (i,j), it is possible that no peaks are detected outside the main lobe. This occurs with heavy compression, when the quantization step used is large and hence X*(i,j) becomes small and is sometimes quantized to zero for all blocks. The histogram then decays rapidly to zero, showing no periodic structure, and we do not have enough information to determine Q(i,j). Table 1 shows the difference between the Q table estimated using the above method and the original table for two quality factors. The X's mark the "undetermined" coefficients.
The next step is to use the estimated table to verify the authenticity of the image by computing a distortion measure and then comparing it to a preset threshold. One measure is the average distortion measure. This is calculated as a function of the remainders of DCT coefficients with respect to the original Q matrix, as given by equation (2), where D(i,j) and Q(i,j) are the DCT coefficient and the corresponding quantization table entry at position (i,j). An image block having a large average distortion value is very different from what it should be and is likely to belong to a forged image. Averaged over the entire image, this measure can be used for making a decision about the authenticity of the image.
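Returning to the estimation step, the peak search described earlier in this section can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the main lobe is skipped by walking away from zero until the histogram rises again, and `rounding_bound` assumes the standard 8×8 DCT normalization with 0-based frequency indices.

```python
import math
from collections import Counter


def rounding_bound(i, j):
    """Bound B(i,j) on the total rounding error at 0-based frequency (i, j)."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    s_i = sum(abs(math.cos((2 * u + 1) * i * math.pi / 16)) for u in range(8))
    s_j = sum(abs(math.cos((2 * v + 1) * j * math.pi / 16)) for v in range(8))
    # Each pixel contributes at most 0.5; the double sum factorizes.
    return 0.5 * 0.25 * c(i) * c(j) * s_i * s_j


def estimate_step(coeffs, min_value=None):
    """Estimate Q(i,j) from the X*(i,j) values of all non-saturated blocks.

    Returns None when no peak exists outside the main lobe ("undetermined").
    If min_value is given (e.g. the bound B), only values above it are searched.
    """
    hist = Counter(abs(round(x)) for x in coeffs)
    if not hist:
        return None
    top = max(hist)
    if min_value is None:
        # Skip zero and its neighborhood: advance while the histogram does not rise.
        start = 1
        while start <= top and hist[start] <= hist[start - 1]:
            start += 1
    else:
        start = math.floor(min_value) + 1
    # Highest count wins; ties go to the smaller value.
    candidates = [(count, -value) for value, count in hist.items() if value >= start]
    if not candidates:
        return None
    _, neg_value = max(candidates)
    return -neg_value


# Synthetic coefficients (toy data, not measurements) clustered around multiples of 6.
toy = ([0.0] * 100 + [1.0] * 20 + [5.4] * 2 + [6.0] * 40 + [-6.2] * 25
       + [7.0] * 5 + [12.1] * 15 + [-11.8] * 10)
step_main_lobe = estimate_step(toy)
step_above_bound = estimate_step(toy, min_value=rounding_bound(2, 2))
step_heavy = estimate_step([0.0] * 50 + [0.4] * 3 + [-0.3] * 2)
```

The last call mimics heavy compression, where every coefficient rounds to zero and the step is reported as undetermined.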
Another measure is the blocking artifact measure, BAM [1]. Blocking artifacts are caused by the nature of the JPEG compression method. The blocking artifacts of an image block change considerably under tampering, and therefore inconsistencies in blocking artifacts serve as evidence that the image has been "touched". The measure is computed from the Q table as:
$$B(n) = \sum_{i=1}^{8}\sum_{j=1}^{8} \left| D(i,j) - Q(i,j)\,\operatorname{round}\!\left(\frac{D(i,j)}{Q(i,j)}\right) \right| \tag{3}$$
where B(n) is the estimated blocking artifact for testing block n, and D(i,j) and Q(i,j) are the same as in (2).
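Both measures can be sketched per block in a few lines. The exact form of the average distortion is an assumption here: it is taken as the sum of the remainders of |D(i,j)| modulo Q(i,j), following the verbal description above. The blocking artifact function follows the absolute-difference form of [1]. All function names are hypothetical.

```python
def remainder_distortion(D, Q):
    """Assumed average distortion for one block: sum of remainders of |D| modulo Q."""
    return sum(abs(D[i][j]) % Q[i][j] for i in range(8) for j in range(8))


def blocking_artifact(D, Q):
    """Blocking artifact measure for one block: distance to the nearest multiple of Q."""
    return sum(abs(D[i][j] - Q[i][j] * round(D[i][j] / Q[i][j]))
               for i in range(8) for j in range(8))


def image_score(blocks, Q, measure):
    """Average a per-block measure over all blocks of an image."""
    scores = [measure(D, Q) for D in blocks]
    return sum(scores) / len(scores)


def looks_forged(score, threshold):
    """Decision rule: a score above the preset threshold flags the image."""
    return score > threshold


# Toy blocks (illustrative only): one consistent with Q, one shifted off the grid.
Q = [[6] * 8 for _ in range(8)]
clean = [[6 * (i + j) for j in range(8)] for i in range(8)]
shifted = [[value + 2 for value in row] for row in clean]

clean_scores = (remainder_distortion(clean, Q), blocking_artifact(clean, Q))
shifted_scores = (remainder_distortion(shifted, Q), blocking_artifact(shifted, Q))
mean_bam = image_score([clean, shifted], Q, blocking_artifact)
```

A block whose coefficients are exact multiples of the estimated steps scores zero on both measures, while the shifted block accumulates a residue of 2 at each of its 64 positions.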

A. Estimation Accuracy
We created a dataset of images to serve as our test data. The set consisted of 550 uncompressed images collected from different sources (more than five camera models), in addition to some from the public domain Uncompressed Color Image Database (UCID), which provides a benchmark for image processing analysis [12]. For color images, only the luminance plane is investigated at this stage. Each of these images was compressed with different standard quality factors [50, 55, 60, 65, 70, 75, 80, 85, and 90]. This yielded 550×9 = 4,950 untouched images. For each quality factor group, an image's histogram of DCT coefficients at one certain frequency was generated and used to determine the corresponding quantization step at that frequency according to section 2. This was repeated for all 64 histograms of DCT coefficients. The resulting quantization table was compared to the image's known table and the percentage of correctly estimated coefficients was recorded. Also, the estimated table was used in equations (2) and (3) to determine the image's average distortion and blocking artifact measures, respectively. These values were recorded and used later to set a threshold value for distinguishing forgeries from untouched images.
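For readers who want to reproduce the known tables, the following sketch derives a luminance table from a quality factor. It assumes that "standard quality factors" refers to the widely used IJG (libjpeg) scaling of the example luminance table from Annex K of the JPEG standard; the paper does not state this explicitly, and the function name is hypothetical. Under that assumption the sketch reproduces the values quoted in section 2: Q(3,3) = 6 and Q(3,4) = 10 at quality factor 80.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scaled_table(quality):
    """Scale the base table for a quality factor in 1..100 (IJG convention)."""
    quality = min(100, max(1, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Integer rounding, then clamp to the baseline range 1..255.
    return [[min(255, max(1, (b * scale + 50) // 100)) for b in row]
            for row in BASE_LUMA]


table_80 = scaled_table(80)
table_100 = scaled_table(100)
# The paper counts positions from 1, so its (3,3) is index [2][2] here.
q33_at_80 = table_80[2][2]
q34_at_80 = table_80[2][3]
```

The same scaling gives a table of 64 ones at quality factor 100, in agreement with the remark that no quantization takes place at that setting.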
The above procedure was applied to all images in the dataset. Table 2 shows the accuracy of the method for each tested quality factor, averaged over the whole set of images. It shows that a quality factor of 75 gives a percentage of around 80%. This is reasonable, as this average quality factor yields the best tradeoff between image quality and compression, and hence the histograms have enough data to accurately define the quantization steps. As the quality factor decreases, estimation accuracy drops steadily. This, as explained earlier, is due to heavy quantization and the correspondingly large steps used with lower qualities: the histograms convey no data from which to predict the compression values. For higher quality factors, performance is expected to improve, which is apparent in the rising values in Table 2. Nevertheless, notice the drop in estimation accuracy for very high quality factors (95 and 100). This is due to very small quantization steps. The peaks of the histogram are no longer distinguishable as "bumps" outside the zero vicinity, but rather show as rapid oscillations. Moreover, most of the lower steps for such high qualities have the values 1 or 2, which are very close to zero (for QF = 100, all entries of the Q table are 1 and no compression takes place). In our method, we remove zero and its neighborhood, that is, all the successively lower points until the histogram rises again. Removing these values before estimation causes our method to always fail to estimate a step size of 1. The only case in which we manage to record a 1 is when the histogram count at 1 is larger than the count at 0, which sometimes occurs at lower frequencies. Higher frequencies often give erroneous results. One way to correct them is to threshold the number of entries in the resulting table having the value 1: if most low frequency steps are 1, then we consider QF = 100 and output the corresponding table of 64 ones.
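The scoring and the QF = 100 correction described above can be sketched as follows. The specifics are assumptions made for illustration: undetermined entries are represented as `None` and counted as incorrect, "low frequency" is taken to mean the first 3×3 entries, and "most" is taken to mean more than half. The function names are hypothetical.

```python
def percent_correct(estimated, original):
    """Percentage of the 64 entries estimated correctly (None = undetermined)."""
    hits = sum(1 for i in range(8) for j in range(8)
               if estimated[i][j] == original[i][j])
    return 100.0 * hits / 64


def apply_qf100_rule(estimated, low_size=3, min_fraction=0.5):
    """Return a table of 64 ones if most low-frequency steps were estimated as 1."""
    low = [estimated[i][j] for i in range(low_size) for j in range(low_size)]
    ones = sum(1 for value in low if value == 1)
    if ones > min_fraction * len(low):
        return [[1] * 8 for _ in range(8)]
    return estimated


# Toy estimate for an image saved at QF = 100: only a few low frequencies
# were recovered as 1, everything else is undetermined.
true_table = [[1] * 8 for _ in range(8)]
raw_estimate = [[None] * 8 for _ in range(8)]
for i, j in [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)]:
    raw_estimate[i][j] = 1

before = percent_correct(raw_estimate, true_table)
after = percent_correct(apply_qf100_rule(raw_estimate), true_table)
```

In this toy case five of the nine low-frequency entries are 1, so the rule replaces the sparse estimate with the all-ones table.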
To verify that a wide range of quantization tables, both standard and non-standard, can be estimated, we created another image set of 100 JPEG images from different sources as our arbitrary test set. Each image's quantization table was estimated and the percentage of correctly estimated coefficients was recorded. This gave an average percentage of correct estimation of 86.45%.
Maximum Likelihood methods for estimating Q tables [5,6] tend to search over all possible Q(i,j) for each DCT coefficient over the whole image, which can be computationally exhaustive. Furthermore, they can only detect standard compression factors, since they re-compress the image with a sequence of preset quality factors. This can also be a time consuming process. Other methods [1,8] estimate the first few (often the first 3×3) low frequency coefficients and then search through lookup tables for matching standard tables. Ye et al. [1] proposed a quantization table estimation based on the power spectrum, PS, of the histogram of DCT coefficients. They constructed a low-pass filtered version of the second derivative of the PS and found that the number of local minima plus one equals the quantization step. Only the first 32 coefficients are used in the estimation, because high frequency DCT coefficients would all be zero when quantized by a large step. The authors of that work did not provide filter specifications, and our experiments suggest that there are no universal low-pass filter parameters for all quality factors or for all frequency bands. This means that we must either use different settings for each group of Q steps, or use one filter to get a few low frequencies and then retrieve the rest of the table through matching in lookup tables. We found that a 1×3 Gaussian filter with a large cutoff frequency gave the best possible results when tested on a number of images. We used the filter to estimate the first nine AC coefficients and recorded the percentage of correct estimation. Tables 3 and 4 show the estimation time and accuracy of the MLE method and the power spectrum method against our method for different quality factors, averaged over 500 test images of size 640×480 from the UCID. MLE requires double the time, whereas the average times in seconds for the latter two methods are very close; the average accuracy of the power spectrum method using the specified filter was around 77%. We believe filter choice is crucial, but since we could not optimize a fixed set of parameters, we did not investigate the method any further.

B. Forgery Detection
To create the image set used for forgery testing, we selected 500 images from the untouched image set. Each of these images was manipulated and saved with different quality factors. More specifically, each image was subjected to four kinds of common forgeries: cropping, rotation, composition, and brightness changes. Cropping forgeries were done by deleting some columns and rows from the original image to simulate cropping from the left, top, right, and bottom. For rotation forgeries, an image was rotated by 270°. Copy-paste forgeries were done by copying a block of pixels randomly from an arbitrary image and then placing it in the original image. Random values were added to every pixel of the image to simulate brightness change. The resulting fake images were then saved with the quality factors [60, 70, 80, and 90]. Repeating this for all selected images produced a total of (500×4) × 4 = 8,000 images. Next, the quantization table for each of these images was estimated as above and used to calculate the image's average distortion measure (2) and blocking artifact measure (3).
The scattered dots in Fig. 4 show the values of the average distortion for 500 untouched images (averaged over all quality factors for each image), while the cross marks show the average distortion values for 500 images from the forged dataset. As the figure shows, the distortion measure separates the two groups: the values for forged images tend to cluster higher than those for untampered images.
Through practical experiments we tested the distortion measure for untouched images against several threshold values and calculated the corresponding false positive rate, FPR (the share of untouched images deemed forged, i.e., those with values above the threshold). Optimally, we aim for a threshold that gives nearly zero false positives. However, we had to take into account the false negatives (tampered images deemed untampered) that may occur when testing for forgeries. Hence, we require a threshold value that keeps both the FPR and the false negative rate, FNR, low. But since we would rather have an untampered image show up as tampered than the other way round, we chose a threshold that is biased towards false positives. We selected a value that gave an FPR of 12.6% and as low an FNR as possible for the different types of forgeries. The horizontal line marks the selected threshold τ = 30. Similarly, the same set of images was used with the BAM, and the threshold was selected to be τ = 20, with a corresponding FPR of 6.8%.
Fig. 5 shows the FNR for the different forgeries at different quality factors. The solid line represents the FNR for the average distortion measure, while the dashed line is for the blocking artifact measure. Each line is labeled with the average FNR over all images. As expected, as QF increases, a better estimate of the quantization matrix of the original untampered image is obtained, and as a result the error percentage decreases. Notice that cropping needs to destroy the normal JPEG grid alignment in order to produce high distortion and hence mark the image as a possible fake. If the picture happens to be aligned perfectly to the original grid after cropping, the cropping forgery goes undetected. Similarly, detecting copy-paste forgery is possible because the pasted part fails to fit perfectly into the original JPEG compressed image. As a result, when the distortion metric is calculated, it exceeds the detection threshold. The charts show that the blocking artifact measure recorded a lower threshold and usually a lower FNR than the average distortion measure. Generally, the performance of the two measures is relatively close for brightened and rotated images. However, BAM is more sensitive to cropping and compositing, since it works on the JPEG "grid" and these two manipulations tend to destroy that natural grid. Brightness-manipulated images are the most likely to go undetected, as they leave the grid intact.
In the composite example of Fig. 6, the highest distortion values correspond to the part pasted from the second image, hence marking the forged area. As the quality factor increases, detection performance increases. Moreover, if the forged image was saved as a bitmap, detecting inconsistencies becomes easier, as no quantization, and hence no loss of data, takes place. This can help in establishing bitmap compression history.
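The threshold selection discussed above can be sketched as a small search. This is an illustrative assumption about the procedure, not the authors' code: the scores are toy numbers rather than measurements from the paper, and the rule "lowest FNR among thresholds whose FPR stays under a cap" is one plausible reading of a threshold biased towards false positives. The function names are hypothetical.

```python
def error_rates(untouched_scores, forged_scores, threshold):
    """Return (FPR, FNR) in percent for a given decision threshold."""
    false_pos = sum(1 for s in untouched_scores if s > threshold)
    false_neg = sum(1 for s in forged_scores if s <= threshold)
    return (100.0 * false_pos / len(untouched_scores),
            100.0 * false_neg / len(forged_scores))


def pick_threshold(untouched_scores, forged_scores, candidates, max_fpr):
    """Among candidates with FPR <= max_fpr, pick the one with the lowest FNR."""
    best = None
    for t in candidates:
        fpr, fnr = error_rates(untouched_scores, forged_scores, t)
        if fpr > max_fpr:
            continue
        if best is None or fnr < best[0]:
            best = (fnr, t)
    return None if best is None else best[1]


# Toy distortion scores (made up for illustration only).
untouched = [5, 10, 12, 18, 25, 31, 8, 14, 22, 9]
forged = [28, 35, 40, 33, 19, 45, 50, 38, 29, 60]

rates_at_30 = error_rates(untouched, forged, 30)
chosen = pick_threshold(untouched, forged, candidates=[20, 25, 30], max_fpr=20.0)
```

Raising `max_fpr` admits lower thresholds, which trades more false alarms on untouched images for fewer missed forgeries.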

IV. CONCLUSIONS
We showed in this paper that, while ignoring quantization rounding errors, we can still achieve reasonably high quantization table estimation accuracy by computing a histogram once for each DCT coefficient. The maximum peak method, although straightforward, gives good estimation results while neglecting rounding error. This reduces the need to statistically model rounding errors and hence reduces computation and time. It was tested against the MLE method, which models round-off errors as a modified Gaussian, and proved to require half the time with no loss of accuracy, and with better accuracy for some quality factors.
We have found through extensive tests that the method estimates all low frequencies in addition to a good percentage of the high frequencies. This reduces the need for lookup tables and time-consuming matching, as a large percentage of the table can be reliably estimated directly from the histogram (even some high frequencies). By "large percentage" we mean enough entries to compute the distortion measure correctly without further searching in lookup tables. This also means that arbitrary step sizes, which are often used in different brands of digital cameras, can be estimated.
The method was tested against the power spectrum method and proved to require nearly the same estimation time with improved accuracy. However, eliminating the need for lookup tables will naturally affect execution time, since we have to process all 64 entries and not just the first 9.
Nevertheless, for heavily compressed images, the histogram fails to estimate high frequencies. In this case, we can always estimate the first few low frequency coefficients and then search lookup tables for a matching Q table. Of course, this works only for standard compression tables. Also, images with large homogeneous areas may fail to give an estimate if we choose to exclude uniform blocks when approximating the histogram. In addition, performance tends to drop when an image is further compressed with a different quality factor. In such cases, double quantization leaves its traces in the histogram, and methods for estimating primary and secondary quantization tables can be used.
The maximum peak method also works well for retrieving the previous compression tables of bitmaps and using them for forgery detection.

FUTURE WORK
Investigating the chroma planes and further testing on bitmaps and multiple compressions are left as future work. Also, after classifying an image, we require an approach that can identify which type of manipulation the image underwent.
Estimation of color space quantization tables: We have so far addressed gray scale images and the luminance channel of color images. A further study of the two chroma channels and the histograms of their DCT coefficients, and hence the suggestion of possible methods for estimating the chroma tables, is a natural extension of this work.
Double Quantization: Double compressed images contain specific artifacts that can be employed to distinguish them from single compressed images. When creating composites, the pasted portion will likely exhibit traces of a single compression, while the rest of the image will exhibit signs of double compression. This observation could in principle be used to identify manipulated areas in digital images.

Fig. 6(a) and (b) show two untouched images that are used to make a composite image (c). Part of the car from the second image was copied and pasted into the first image, and the result was saved with different compression factors. The resulting distortion measures for the composite image are shown in Fig. 6(d) through (g). The dark parts denote low distortion, whereas brighter parts indicate high distortion values. Notice that the highest values correspond to the pasted part.

Fig. 5. False negative rate for two distortion measures, calculated for different forgery types.

Fig. 6. Two test images (a) and (b) used to produce a composite image (c). For each QF, (d) through (g), the left column figures represent the average distortion measure while the right column figures represent the blocking artifact measure for the image in (c).

TABLE ESTIMATION IN JPEG IMAGES
The JPEG standard baseline compression for color photographs consists of four lossy transform steps and yields a compressed stream of data:

TABLE I. DIFFERENCE BETWEEN ESTIMATED AND ORIGINAL Q.

TABLE II
deleting some columns and rows from the original image to simulate cropping from the left, top, right, and bottom. For rotation forgeries, an image was rotated by 270°.

TABLE IV. AVERAGE ESTIMATION TIME (FIRST 3×3) AGAINST OTHER METHODS FOR DIFFERENT QUALITY FACTORS.