Towards No-Reference of Peak Signal to Noise Ratio Estimation Based on Chromatic Induction Model

The aim of this work is to define a no-reference perceptual image quality estimator that applies the perceptual concepts of the Chromatic Induction Model. The approach consists of comparing the received image, presumably degraded, against perceptual versions (at different observation distances) of this image, degraded by means of a Model of Chromatic Induction, which uses some properties of the human visual system. We also compare our model with a classical estimator in image quality assessment, PSNR. Results are highly correlated with those obtained by PSNR (99.32% for image Lenna and 96.95% for image Baboon), but this proposal does not need an original or reference image in order to estimate the quality of the degraded image.


INTRODUCTION
The early years of the 21st century have witnessed a tremendous growth in the use of digital images as a means for representing and communicating information. A significant literature describing sophisticated theories, algorithms, and applications of digital image processing and communication has evolved. A considerable percentage of this literature is devoted to methods for improving the appearance of images, or for maintaining the appearance of images that are processed. Nevertheless, the quality of digital images, processed or otherwise, is rarely perfect. Images are subject to distortions during acquisition, compression, transmission, processing, and reproduction. To maintain, control, and enhance the quality of images, it is important for image acquisition, management, communication, and processing systems to be able to identify and quantify image quality degradations. The development of effective automatic image quality assessment systems is a necessary goal for this purpose. Yet, until recently, the field of image quality assessment has remained in a nascent state, awaiting new models of human vision and of natural image structure and statistics before meaningful progress could be made.
Nowadays, Mean Squared Error (MSE) is still the most widely used quantitative performance metric, and several image quality measures are based on it, Peak Signal-to-Noise Ratio (PSNR) being the best example. However, some authors, such as Wang and Bovik [1], [2], consider MSE a poor criterion for quality assessment systems. Therefore, it is important to know what the MSE is and what is wrong with it, in order to propose new metrics that fulfill the properties of the human visual system and keep the favorable features of the MSE.
In this way, let $f(i,j)$ and $\hat{f}(i,j)$ represent two images being compared, whose size $M \times N$ is the number of intensity samples or pixels. Let $f(i,j)$ be the original reference image, which is considered to have perfect quality, and $\hat{f}(i,j)$ a distorted version of $f(i,j)$, whose quality is being evaluated. Then, the MSE and the PSNR are, respectively, defined as:

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ f(i,j) - \hat{f}(i,j) \right]^2 \qquad (1)$$

and

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{G_{\max}^2}{\mathrm{MSE}} \right) \qquad (2)$$

where $G_{\max}$ is the maximum possible intensity value in $f(i,j)$ ($M \times N$ size). Thus, for gray-scale images that allocate 8 bits per pixel (bpp), $G_{\max} = 2^8 - 1 = 255$. For color images the PSNR is defined as in Equation 2, whereas the color MSE is the mean of the individual MSEs of each component.

An important task in image compression systems is to maximize the correlation among pixels, because the higher the correlation at preprocessing, the more efficient the postprocessing algorithm. Thus, an efficient measure of image quality should take this feature into account. In contrast, MSE does not need any positional information of the image, so the pixel arrangement is treated as a one-dimensional vector.

Both MSE and PSNR are extensively employed in the image processing field, since these metrics have favorable properties, such as:

- They are convenient metrics for the purpose of algorithm optimization. For example, in JPEG2000, MSE is used both in Optimal Rate Allocation [3], [4] and Region of Interest [5], [4]. Therefore, MSE can find solutions for these kinds of problems when it is combined with the instruments of linear algebra, since it is differentiable.
- By definition, MSE is computed from the difference signal between the two images being compared, giving a clear meaning of the overall error signal energy.

www.ijacsa.thesai.org
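As an illustration of the MSE and PSNR definitions above, the following is a minimal Python sketch for gray-scale images stored as lists of rows. The function names and the 3 x 3 sample values are assumptions made for this example and do not come from the original work.

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equally sized gray-scale images."""
    rows, cols = len(reference), len(reference[0])
    total = 0.0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
    return total / (rows * cols)

def psnr(reference, distorted, bits_per_pixel=8):
    """PSNR in dB; infinite when both images are identical (MSE = 0)."""
    peak = 2 ** bits_per_pixel - 1  # 255 for 8 bpp images
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

# Hypothetical 3 x 3 image patch and a slightly distorted copy.
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
distorted = [[54, 55, 60], [59, 77, 61], [75, 62, 62]]

print("MSE :", mse(reference, distorted))
print("PSNR:", psnr(reference, distorted), "dB")
```

Note that the computation visits the pixels in an arbitrary but matching order, which reflects the remark that MSE ignores positional information.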

A. Full Reference (FR)
Bottom-Up Approaches:
Psychological and physiological studies in the past century have gained us a tremendous amount of knowledge about the human visual system (HVS). Still, although much is known about the mechanisms of early, front-end vision, much more remains to be learned of the later visual pathways and the general higher level functions of the visual cortex. While the knowledge is far from complete, current models of visual information processing mechanisms have become sufficiently sophisticated that it is of interest to explore whether it is possible to deploy them to predict the performance of simple human visual behaviors, such as image quality evaluation. Bottom-up approaches to image quality assessment are those methods that attempt to simulate well modeled functionalities of the HVS, and integrate these in the design of quality assessment algorithms that, hopefully, perform similar to the HVS in the assessment of image quality. In this chapter we begin with a brief description of relevant aspects of the anatomy and psychophysical features of the HVS. This description will focus on those HVS features that contribute to current engineering implementations of perceptual image quality measures. Most systems that attempt to incorporate knowledge about the HVS into the design of image quality measures use an error sensitivity framework, so that the errors between the distorted image and reference image are perceptually quantized according to HVS characteristics.

Top-Down Approaches:
The bottom-up approaches to image quality assessment described in the last subsection (II-A1) attempt to simulate the functional components in the human visual system that may be relevant to image quality assessment. The underlying goal is to build systems that work in the same way as the HVS, at least for image quality assessment tasks. By contrast, the top-down systems simulate the HVS in a different way. These systems treat the HVS as a black box, and only the input-output relationship is of concern. A top-down image quality assessment system may operate in a manner quite different from that of the HVS, which is of little concern, provided that it successfully predicts the image quality assessment behavior of an average human observer. One obvious approach to building such a top-down system is to formulate it as a supervised machine learning problem, as illustrated in Fig. 1. Here the HVS is treated as a black box whose input-output relationship is to be learned. The training data can be obtained by subjective experimentation, where a large number of test images are viewed and rated by human subjects. The goal is to train the system model so that the error between the subjective ratings and the model prediction is minimized. This is generally a regression or function approximation problem. Many techniques are available to attack these kinds of problems. Unfortunately, direct application of this method is problematic, since the dimension of the space of all images is the same as the number of pixels in the image. Furthermore, subjective testing is expensive, and a typical extensive subjective experiment would be able to include only several hundred test images, hardly an adequate coverage of the image space. Assigning only a single sample to each quadrant of a ten-dimensional space requires a total of 1024 samples, and the dimension of the image space is on the order of thousands to millions: an excellent example of the curse of dimensionality.
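The sample count quoted above follows from the fact that a space of n dimensions has 2^n quadrants (orthants), so ten dimensions give 2^10 = 1024. The short Python sketch below reproduces this figure and, under the assumption of a hypothetical 256 x 256 gray-scale image (one dimension per pixel), reports how many decimal digits the corresponding sample count would have. The function names are illustrative only.

```python
import math

def samples_for_one_per_orthant(dimensions):
    """Samples needed to place exactly one sample in each orthant."""
    return 2 ** dimensions

def decimal_digits_of_sample_count(dimensions):
    """Number of decimal digits of 2**dimensions, computed via logarithms."""
    return math.floor(dimensions * math.log10(2)) + 1

# Ten-dimensional space, as in the text.
print(samples_for_one_per_orthant(10))

# Hypothetical 256 x 256 image: one dimension per pixel.
pixels = 256 * 256
print(decimal_digits_of_sample_count(pixels), "decimal digits")
```

Compared with the several hundred images that a subjective experiment can cover, a count with thousands of digits makes clear why direct learning over the full image space is impractical.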
One method that might be useful to overcome this problem is by dimension reduction. The idea is to map the entire image space onto a space of much lower dimensionality by exploiting knowledge of the statistical distribution of typical images in the image space. Since natural images have been found to exhibit strong statistical regularities, it is possible that the cluster of typical natural images may be represented by a low dimensional manifold, thus reducing the number of sample images that might be needed in the subjective experiments.
However, dimension reduction is no trivial task. Indeed, no dimension reduction technique has been developed to reduce the dimension of natural images to 10 or less (otherwise, extremely efficient image compression techniques would have been proposed on the basis of such reduction). Consequently, using a dimension reduction approach for general purpose image quality assessment remains quite difficult. Nonetheless, such an approach may prove quite effective in the design of application specific quality assessment systems, where the types of distortions are fixed and known and may be described by a small number of parameters.

B. No-Reference (NR)
No-reference (NR) image quality assessment is, perhaps, the most difficult (yet conceptually simple) problem in the field of image analysis. By some means, an objective model must evaluate the quality of any given real world image, without referring to an original high quality image. On the surface, this seems to be a mission impossible. How can the quality of an image be quantitatively judged without having a numerical model of what a good/bad quality image is supposed to look like? Yet, amazingly, this is quite an easy task for human observers. Humans can easily identify high quality images versus low quality images, and, furthermore, they are able to point out what is right and wrong about them without seeing the original. Moreover, humans tend to agree with each other to a pretty high extent. For example, without looking at the original image, probably every reader would agree that the noisy, blurry, and JPEG2000 compressed images in Fig. 2 have lower quality than the luminance shifted and contrast stretched images.
Before developing any algorithm for image quality assessment, a fundamental question that must be answered is what source of information can be used to evaluate the quality of images. Clearly, the human eye-brain system is making use of a very substantial and effective pool of information about images in making subjective judgments of image quality. Three types of knowledge may be employed in the design of image quality measures: knowledge about the original high quality image, knowledge about the distortion process, and knowledge about the human visual system (HVS). In FR quality assessment, the high quality original image is known a priori. In NR quality assessment, however, the original image is absent, yet one can still assume that there exists a high quality original image, of which the image being evaluated is a distorted representation. It is also reasonable to make a further assumption that such a conjectured original image belongs to the set of typical natural images.
It is important to realize that the cluster of natural images occupies an extremely tiny portion in the space of all possible images. This potentially provides a strong prior knowledge about what these images should look like. Such prior knowledge could be a precious source of information for the design of image quality measures. Models of such natural scenes attempt to describe the class of high quality original images statistically. Interestingly, it has been long conjectured in computational neuroscience that the HVS is highly adapted to the natural visual environment, and that, therefore, the modeling of natural scenes and the HVS are dual problems.
Knowledge about the possible distortion processes is another important information source that can be used for the development of NR image quality measures. For example, it is known that blur and noise are often introduced in image acquisition and display systems and reasonably accurate models are sometimes available to account for these distortions. Images compressed using block based algorithms such as JPEG often exhibit highly visible and undesirable blocking artifacts. Wavelet based image compression algorithms operating at low bit rates can blur images and produce ringing artifacts near discontinuities. Of course, all of these types of distortions are application dependent. An application specific NR image quality assessment system is one that is specifically designed to handle a specific artifact type, and that is unlikely to be able to handle other types of distortions. The question arises, of course, whether an application specific NR system is truly reference free, since much information about the distorted image is assumed. However, nothing needs to be assumed about the original image, other than, perhaps, models derived from natural scene statistics or other natural assumptions. Since the original images are otherwise unknown, we shall continue to refer to more directed problems such as these as application specific NR image quality assessment problems.
Of course, a more complex system that includes several modes of artifact handling might be constructed and that could be regarded as approaching general purpose NR image quality assessment. Before this can happen, however, the various components need to be designed. Fortunately, in many practical application environments, the distortion processes involved are known and fixed. The design of such application specific NR quality assessment systems appears to be much more approachable than the general, assumption free NR image quality assessment problem. Very little, if any, meaningful progress has been made on this latter problem. Owing to a paucity of progress in other application specific areas, this work mainly focuses on NR image quality assessment methods, which are designed for assessing the quality of compressed images. In particular, attention is given to a spatial domain method and a frequency domain method for block based image compression, and a wavelet domain method for wavelet based image compression.

A. Chromatic Induction Wavelet Model (CIWaM)
The Chromatic Induction Wavelet Model (CIWaM) [6] is a low-level perceptual model of the HVS. It estimates the image perceived by an observer at a distance d just by modeling the perceptual chromatic induction processes of the HVS. That is, given an image and an observation distance d, CIWaM obtains an estimation of the perceptual image that the observer perceives when observing at distance d. CIWaM is based on just three important stimulus properties: spatial frequency, spatial orientation and surround contrast. These three properties allow unifying the chromatic assimilation and contrast phenomena, as well as some other perceptual processes such as saliency perceptual processes [7].
The CIWaM model takes an input image $I$ and decomposes it into a set of wavelet planes $\omega_s^o$ of different spatial scales $s$ (i.e., spatial frequency $\nu$) and spatial orientations $o$. It is described as:

$$I = \sum_{s=1}^{n} \sum_{o=v,h,dgl} \omega_s^o + c_n \qquad (3)$$

where $n$ is the number of wavelet planes, $c_n$ is the residual plane and $o$ is the spatial orientation, either vertical, horizontal or diagonal.

The perceptual image $I_\rho$ is recovered by weighting these wavelet coefficients using the extended Contrast Sensitivity Function (e-CSF, Fig. 3). The e-CSF is an extension of the psychophysical CSF [8] that considers spatial surround information (denoted by $r$), visual frequency (denoted by $\nu$, which is related to spatial frequency by the observation distance) and observation distance ($d$). The perceptual image $I_\rho$ can be obtained by

$$I_\rho = \sum_{s=1}^{n} \sum_{o=v,h,dgl} \alpha(\nu, r)\, \omega_s^o + c_n \qquad (4)$$

where $\alpha(\nu, r)$ is the e-CSF weighting function that tries to reproduce some perceptual properties of the HVS. The term $\alpha(\nu, r)\, \omega_s^o$ can be considered the perceptual wavelet coefficients of image $I$ when observed at distance $d$. The weighting function $\alpha(\nu, r)$ has a shape similar to the e-CSF and is described by three terms:

- A non-linear function that estimates the central feature contrast relative to its surround contrast, ranging from zero to one. It is computed from $\sigma_{cen}$ and $\sigma_{sur}$, the standard deviations of the wavelet coefficients in two concentric rings, which represent a center-surround interaction around each coefficient.
- A weighting function that approximates the perceptual e-CSF, emulates some perceptual properties and is defined as a piecewise Gaussian function [8].
- A term that prevents the weighting function from being zero, defined using two fixed parameter values.

Both the second and the third terms depend on the factor $s_{thr}$, which is the scale associated with 4 cycles per degree when an image is observed from the distance $d$ with a pixel size $l_p$ and one visual degree, and whose expression is defined by Equation 9. The value of $s_{thr}$ is associated with the maximum value of the e-CSF.

B. Basics
In the no-reference image quality problem, there is only a distorted version $\hat{f}(i,j) = \Lambda[f(i,j)]$ that is compared with $f(i,j)$, $\Lambda$ being a distortion model, and the unknown original image $f(i,j)$ is considered to be a pattern like a chessboard (Fig. 5) with the same size as $\hat{f}(i,j)$. The difference between these two images depends on the features of the distortion model $\Lambda$, for example, blurring, contrast change, noise, JPEG blocking or JPEG2000 wavelet ringing.

In Fig. 2, the images Baboon and Splash are compressed by means of JPEG2000. These two images have the same PSNR = 30 dB when compared to their corresponding original images, that is, they have the same numerical degree of distortion (i.e., the same objective image quality, PSNR). However, their subjective quality is clearly different, with the image Baboon showing a better visual quality. Thus, for this example, PSNR and perceptual image quality have a small correlation. In the image Baboon, high spatial frequencies are dominant. A modification of these high spatial frequencies by $\Lambda$ induces a high distortion, resulting in a lower PSNR, even if the modification of these high frequencies is not perceived by the HVS. In contrast, in the image Splash, mid and low frequencies are dominant. Modification of mid and low spatial frequencies also introduces a high distortion, but they are less perceived by the HVS. Therefore, the correlation of PSNR with the opinion of an observer is small. Fig. 6 shows the diagonal high spatial frequencies of these two images, where there are more high frequencies in the image Baboon.

If a set of distortions $\hat{f}_k(i,j) = \Lambda_k[f(i,j)]$ is generated and indexed by $k$ (for example, let $\Lambda_k$ be a blurring operator), the image quality of $\hat{f}_k(i,j)$ evolves while varying $k$, $k$ being, for example, the degree of blurring. Hence, the evolution of $\hat{f}_k(i,j)$ depends on the characteristics of the original $f(i,j)$. Thus, when increasing $k$, if $f(i,j)$ contains many high spatial frequencies the PSNR decreases rapidly, but when low and mid frequencies predominate the PSNR decreases slowly.
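The dependence of PSNR on the frequency content of the original can be illustrated with a small one-dimensional experiment. The sketch below is only a toy under stated assumptions: the blurring operator is a moving average of radius k, the "texture" signal is seeded pseudo-random noise standing in for an image rich in high frequencies, and the "smooth" signal is one period of a sine wave. None of these choices come from the original work.

```python
import math
import random

def box_blur(signal, k):
    """Moving-average blur of radius k, replicating the border samples."""
    n = len(signal)
    blurred = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - k, i + k + 1)]
        blurred.append(sum(window) / len(window))
    return blurred

def psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two equally long signals."""
    error = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)

n = 256
rng = random.Random(0)  # fixed seed so that the run is repeatable
texture = [rng.uniform(0.0, 255.0) for _ in range(n)]                     # many high frequencies
smooth = [127.5 * (1 + math.sin(2 * math.pi * i / n)) for i in range(n)]  # low frequencies only

for k in (1, 2, 3):
    print(k,
          round(psnr(texture, box_blur(texture, k)), 2),
          round(psnr(smooth, box_blur(smooth, k)), 2))
```

For every degree of blurring in this toy, the high-frequency signal ends up with a much lower PSNR than the smooth one, which is the behavior described above.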
Similarly, the HVS is a system that induces a distortion on the observed image $f(i,j)$, whose model is predicted by CIWaM. Hence, CIWaM is considered a particular HVS distortion model that generates a perceptual image $\rho(i,j)$ from an observed image $f(i,j)$, i.e., $\rho = \mathrm{CIWaM}[f]$. Therefore, a set of distortions is defined as $\{\mathrm{CIWaM}_d\}$, $d$ being the observation distance. That is, a set of perceptual images $\rho_d = \mathrm{CIWaM}_d[f]$ is defined, which is considered a set of perceptual distortions of the hypothetical image $f(i,j)$.
When the image $\hat{f}(i,j)$ is observed at distance $d$ and this distance is reduced, its artifacts, if it has any, are better perceived. In contrast, when $\hat{f}(i,j)$ is observed from a far distance, human eyes cannot perceive its artifacts; in consequence, the perceptual image quality of the distorted image is always high. The distance at which the observer can perceive the best image quality of image $\hat{f}(i,j)$ is considered as the distance $D$.
Let $f(i,j)$ and $\hat{f}(i,j)$ be a pattern image and a distorted image, respectively. The NRPSNR methodology is based on finding a distance $D$ at which there is no perceptual difference between the wavelet energies of the images $f(i,j)$ and $\hat{f}(i,j)$ when an observer observes them at $d$ centimeters of observation distance. Thus, measuring the PSNR of $\hat{f}(i,j)$ at $D$ yields a fairer, no-reference perceptual evaluation of its image quality.
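Since the pattern image stands in for the unknown original and only needs to match the size of the distorted image, it can be generated on the fly. The following Python sketch builds a chessboard pattern; the square size and the two gray levels are assumptions chosen for illustration, as the text does not specify them.

```python
def chessboard(height, width, square=8, low=0, high=255):
    """Chessboard pattern as a list of rows.

    `square` is the side of each square in pixels; `low` and `high` are the
    two gray levels. All three defaults are hypothetical.
    """
    return [
        [high if ((i // square) + (j // square)) % 2 == 0 else low
         for j in range(width)]
        for i in range(height)
    ]

# Pattern with the same size as a hypothetical 4 x 6 distorted image.
pattern = chessboard(4, 6, square=2)
for row in pattern:
    print(row)
```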
The NRPSNR algorithm is divided into five steps, which are summarized in Figure 7 and described as follows:

Step 1: Wavelet Transformation. The forward wavelet transform of the pattern image $f(i,j)$ and the distorted image $\hat{f}(i,j)$ is performed using Eq. 3, obtaining the sets $\{\omega_s^o\}$ and $\{\hat{\omega}_s^o\}$, respectively. The analysis filter employed is the Daubechies 9-tap/7-tap filter (Table I).

Step 2: Distance D. The total energy measure, or deviation signature [9], $\varepsilon$ is the absolute sum of the wavelet coefficient magnitudes, defined by [10]

$$\varepsilon = \sum_{m} \sum_{n} \left| x(m,n) \right| \qquad (10)$$

where $x(m,n)$ is the set of wavelet coefficients whose energy is being calculated, $m$ and $n$ being the indexes of the coefficients. By analogy with the traditional definition of a calorie, the units of $\varepsilon$ are wavelet calories (wCal) and can also be defined by Eq. 10, since one wCal is the energy needed to increase the absolute magnitude of a wavelet coefficient by one scale.

From the wavelet coefficients $\{\omega_s^o\}$ and $\{\hat{\omega}_s^o\}$, the corresponding perceptual wavelet coefficients are obtained by applying CIWaM with an observation distance $d$. Equation 11 then expresses the relative wavelet energy ratio $\varepsilon_R(d)$, which compares how different the energies of the reference and distorted CIWaM perceptual images, namely $\rho$ and $\hat{\rho}$ respectively, are when these images are watched from a given distance $d$.

Thus, the main goal of this step is to find the distance $D$ at which the energy of the distorted image is the same as the energy of the pattern.

Step 5: PSNR between perceptual images. Calculate the PSNR between the perceptual images $\rho$ and $\hat{\rho}$ at distance $D$ using Eq. 2 in order to obtain the No-Reference CIWaM weighted PSNR, i.e., the NRPSNR.
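The energy measure of Step 2 and the search for the distance D can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' implementation: `toy_weight` is a hypothetical per-scale stand-in for the CIWaM weighting (the real model weights each coefficient according to its surround), the coefficient values are invented, and the search is a brute-force scan over candidate distances that picks the one where the two weighted energies are closest.

```python
import math

def wavelet_energy(coefficients):
    """Sum of absolute magnitudes of a 2-D set of wavelet coefficients."""
    return sum(abs(x) for row in coefficients for x in row)

def weighted_energy(planes, weight, distance):
    """Energy of all planes after scaling each plane by weight(scale, distance).

    For a non-negative weight, scaling the energy of a plane is the same as
    taking the energy of the scaled coefficients.
    """
    return sum(weight(scale, distance) * wavelet_energy(plane)
               for scale, plane in planes.items())

def energy_gap(pattern_planes, distorted_planes, weight, distance):
    """Absolute difference between the two weighted energies at one distance."""
    return abs(weighted_energy(pattern_planes, weight, distance)
               - weighted_energy(distorted_planes, weight, distance))

def find_matching_distance(pattern_planes, distorted_planes, weight, distances):
    """Candidate distance at which both weighted energies are closest."""
    return min(distances,
               key=lambda d: energy_gap(pattern_planes, distorted_planes, weight, d))

def toy_weight(scale, distance_cm):
    """Hypothetical weighting that peaks at a distance-dependent scale (NOT the e-CSF)."""
    peak_scale = math.log2(distance_cm / 10.0)
    return math.exp(-((scale - peak_scale) ** 2) / 8.0)

# Invented wavelet planes, indexed by scale, for the pattern and distorted images.
pattern_planes = {1: [[8.0, -8.0], [-8.0, 8.0]], 2: [[4.0, -4.0], [4.0, -4.0]]}
distorted_planes = {1: [[2.0, -1.0], [0.5, -1.5]], 2: [[6.0, -5.0], [7.0, -6.0]]}
candidates = list(range(20, 201, 20))  # candidate distances in centimeters

D = find_matching_distance(pattern_planes, distorted_planes, toy_weight, candidates)
print("Selected distance:", D, "cm")
```

A scan over a fixed grid is used here only for simplicity; any one-dimensional search could replace it once the actual perceptual weighting is available.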

IV. EXPERIMENTAL RESULTS
It is important to mention that NRPSNR estimates the degradation; thus, the smaller the better. In this section, we use NRPSNR and BPSNR interchangeably, since NRPSNR is the blind version of PSNR. NRPSNR performance is assessed by comparing the statistical significance of the images Lenna and Baboon, in addition to the Pearson correlation between NRPSNR and PSNR data.

Both PSNR and NRPSNR estimate that the image at 0.05 bpp has the higher distortion. When this experiment is extended by computing the JPEG2000 distorted versions from 0.05 bpp to 3.00 bpp (in increments of 0.05 bpp, depicted in Figure 11), we found that the correlation between PSNR and NRPSNR is 96.95% for image Baboon; namely, for every 10,000 estimations NRPSNR misses in only 305 assessments.

The NRPSNR assessment was tested on two well-known images, Lenna and Baboon. It is a well-correlated image quality method for these images under JPEG2000 distortions when compared to PSNR. Concretely, NRPSNR correlates with PSNR at 98.13% on average. It is possible to quantize a particular pixel while a bit allocation algorithm is working, so that NRPSNR can be incorporated into embedded compression schemes such as EZW, SPIHT, JPEG2000 or Hi-SET [11].
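The Pearson correlation used to compare both estimators can be computed as in the sketch below. The two score lists are hypothetical placeholders, not measurements from this work; only the two per-image correlations at the end (99.32% and 96.95%) are taken from the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Hypothetical scores for six bit rates (illustrative values only).
psnr_scores = [22.1, 25.4, 28.0, 30.3, 32.9, 35.2]
nrpsnr_scores = [21.5, 25.9, 27.2, 30.8, 33.5, 34.6]
print("Correlation:", round(100 * pearson(psnr_scores, nrpsnr_scores), 2), "%")

# Mean of the two correlations reported in the text (stated there as 98.13%).
average = (99.32 + 96.95) / 2
print("Average correlation:", average, "%")
```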

Fig. 2. 256 x 256 patches (cropped for visibility) of images Baboon and Splash distorted by means of JPEG2000 compression. Although both images have the same objective quality (PSNR = 30 dB), their visual quality is very different.

Fig. 3. (a) Graphical representation of the e-CSF for the luminance channel. (b) Some profiles of the same surface along the Spatial Frequency axis for different center-surround contrast energy ratio values (r). The psychophysically measured CSF is a particular case of this family of curves (concretely for r = 1).

Fig. 6. Diagonal spatial orientation of the first wavelet plane of images (a) Baboon and (b) Splash distorted by JPEG2000 with PSNR = 30 dB.

Fig. 7. Methodology for No-Reference PSNR weighting by means of CIWaM. Both Pattern and Distorted images are wavelet transformed. The distance D where the energies of the perceptual images obtained by CIWaM are equal is found. Then, the PSNR of the perceptual images at D is calculated, obtaining the NRPSNR metric.
Perform the Inverse Wavelet Transform of the perceptual wavelet coefficients of the pattern and distorted images, obtaining their respective perceptual images. The synthesis filter in

Fig. 8. JPEG2000 distorted versions of color image Lenna at different bit rates expressed in bits per pixel (bpp). (a) High Distortion, (b) Medium Distortion and (c) Low Distortion.

Fig. 10. JPEG2000 distorted versions of color image Baboon at different bit rates expressed in bits per pixel (bpp). (a) High Distortion, (b) Medium Distortion and (c) Low Distortion.

Fig. 11. Comparison of PSNR and NRPSNR (Blind-PSNR or BPSNR) for the JPEG2000 distorted versions of image Baboon.

V. CONCLUSIONS
NRPSNR is a new metric for no-reference or blind image quality assessment based on perceptual weighting of PSNR using a low-level perceptual model of the Human Visual System (the CIWaM model). The proposed NRPSNR metric is based on five steps.