Medical Image Fusion Algorithm based on Local Average Energy-Motivated PCNN in NSCT Domain

Medical Image Fusion (MIF) can significantly improve the performance of medical diagnosis, treatment planning and image-guided surgery by providing high-quality, information-rich medical images. Traditional MIF techniques suffer from common drawbacks such as contrast reduction, edge blurring and image degradation. Pulse-Coupled Neural Network (PCNN) based MIF techniques outperform the traditional methods in providing high-quality fused images because of the global coupling and pulse synchronization properties of the PCNN; however, the selection of the significant features that motivate the PCNN is still an open problem, and it plays a major role in measuring the contribution of each source image to the fused image. In this paper, a medical image fusion algorithm based on the Non-Subsampled Contourlet Transform (NSCT) and the PCNN is proposed to fuse images from different modalities. Local average energy is used to motivate the PCNN because of its ability to capture salient features of the image such as edges, contours and textures. Compared with other image fusion techniques, the proposed approach produces a high-quality fused image with high contrast and improved content, without loss of significant details, both visually and quantitatively.

Keywords: Medical image fusion; pulse-coupled neural network; local average energy; non-subsampled contourlet transform


I. INTRODUCTION
Numerous imaging modalities such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Ultrasound, Positron Emission Tomography (PET), and Single Photon Emission Computed Tomography (SPECT) reflect information about the human body from different views. For example, CT clearly reflects the anatomical structure of bone tissues, while MRI reflects the anatomical structure of soft tissues, organs and blood vessels. Clinical diagnosis and treatment require a composite view of two or more modalities, since a single source of information may not be sufficient to localize lesions and abnormalities during the diagnosis process [1]. Thus, a way is needed to extract and combine information from different modalities to produce clear, information-rich images that support a more reliable and accurate diagnosis. Combining such information manually is time consuming, subject to human error and dependent on the radiologist's experience, which may produce misleading results.
The art of automatically combining complementary information from different medical source images of the same organ/tissue is known as medical image fusion. A major prerequisite must be fulfilled for the fusion process to perform correctly: the registration/alignment of the medical source images to be fused. Any fusion scheme should fulfill some generic requirements. First, all the salient features and significant information in the source images should be present in the fused result. Second, no artifacts or unwanted degradations should be introduced by the fusion process. Third, irrelevant features and noise should be discarded or minimized [2].
The core problem of medical image fusion is finding an efficient way of measuring the contribution of each source image to the resultant fused image, which turns the medical image fusion problem into an analysis problem [3]. Medical image fusion can be decomposed into two major steps: measuring the activity level and applying a suitable fusion rule. Activity level refers to the local energy or the amount of information present in an image pixel or coefficient [4]. It can be measured for a single pixel value or by taking into consideration the surrounding neighbors of the pixel. Fusion rules, in turn, should be selected carefully depending on the nature of the source images to be fused. The most common fusion rules are Min, Max and Average.

www.ijacsa.thesai.org

PCNN is an artificial neuron model inspired by the visual cortex of the cat. It is characterized by the global coupling and pulse synchronization of neurons, which means that neurons corresponding to pixels with similar significance tend to fire synchronously. These characteristics make the PCNN appropriate for activity level measurement. NSCT is a modified version of the original contourlet transform; it overcomes the pseudo-Gibbs phenomena because of its shift-invariant characteristic. This characteristic fulfills two major generic requirements of the image fusion process: (a) no artifacts or inconsistencies should be introduced in the fused result and (b) the fusion process should be shift invariant.
A variety of medical image fusion methods has evolved in recent years. Fig. 1 shows the major categories into which medical image fusion methods can be classified. Pixel-level spatial domain techniques such as simple averaging and knowledge-based image fusion [5,6] usually lead to contrast reduction and edge blurring. Pyramidal fusion methods, including the Laplacian pyramid [7], gradient pyramid [8], ratio-of-low-pass pyramid and morphological pyramid [9], fail to capture the spatial orientation in the decomposition process and hence cause blocking effects [10]. Mathematical methods, including principal component analysis [11,12], intensity-hue-saturation [13,14] and the Brovey transform [15], offer better results but suffer from spectral degradation [16].
Several Image Fusion (IF) and Medical Image Fusion (MIF) techniques based on PCNN have been proposed by researchers [15-20]. The majority of the PCNN-based MIF techniques use, as the feeding input to the PCNN, the normalized single value of either the pixel in the spatial domain or the coefficient in the transform domain, which leads to contrast reduction and loss of directional information, respectively [19, 21-24]. Moreover, using a single pixel/coefficient value as the stimulus for a PCNN neuron is not effective, since the human visual system is more sensitive to variations in images such as edges, contours and directional features.
Das and Kundu [17] employed a neuro-fuzzy approach that combines a reduced pulse-coupled neural network with fuzzy logic in order to produce a fused image with higher contrast, more clarity and more useful subtle detail. Kavitha and Chellamuthu [18] enhanced the input before feeding it into the PCNN using the ant colony optimization (ACO) technique. Das and Kundu [16] proposed a modified spatial frequency motivated PCNN to fuse the high frequency sub-bands and a max selection fusion rule to fuse the low frequency sub-bands. Xiao-Bo et al. [20] proposed a spatial frequency (SF) motivated pulse-coupled neural network to fuse both the low and high frequency sub-bands. It works well for multi-focus IF and visible/infrared IF, but the absence of directional information in SF and the use of the same fusion rule for both types of sub-bands cause contrast reduction and loss of image details [16]. Wang and Ma [19] proposed an image fusion technique based on a modified model of the pulse-coupled neural network called the m-PCNN, where m is the number of external input channels. Data fusion happens in the internal activity of the neuron. The fusion process is carried out completely by the PCNN, and the number of channels can be extended dynamically to fuse more than two images; however, using the normalized gray value of the input image as an input to the PCNN leads to contrast reduction and edge blurring.
In this paper, an NSCT-based MIF algorithm that uses local average energy as the feeding input to motivate the PCNN neurons is proposed. The input source images to be fused are assumed to be well aligned. The rest of the paper is organized as follows: the NSCT, the simplified model of the PCNN and the proposed MIF scheme are described in the Methodology section. Experimental results and discussion are presented in the Results and Discussion section. Finally, conclusions and future work are summarized in the Conclusion section.

II. METHODOLOGY

A. Non-Subsampled Contourlet Transform
NSCT is a shift-invariant version of the original contourlet transform, proposed by Da Cunha et al. [25] to overcome the limitations of the contourlet transform. The original contourlet transform is not shift invariant because of the down-samplers and up-samplers used in both the Laplacian Pyramid (LP) and the Directional Filter Bank (DFB). The absence of shift invariance in the contourlet transform causes pseudo-Gibbs phenomena around singularities [20]. In the original contourlet transform [26], the Laplacian pyramid is applied first to capture the point discontinuities and is then followed by a directional filter bank that connects point discontinuities into linear structures [20]. The NSCT consists of two main building blocks: the shift-invariant pyramid filter bank and the shift-invariant directional filter bank, as shown in Fig. 2(a). The decomposition of an input image into frequency sub-bands using the NSCT is illustrated in Fig. 2(b).
The shift-invariant pyramid filter bank is responsible for the sub-band decomposition. It maintains the multiscale property of the NSCT by using two-channel, non-subsampled filter banks applied iteratively to obtain the multiscale decomposition. The non-subsampled directional filter bank is used to achieve the multi-direction property of the NSCT. The use of up-samplers and down-samplers in the directional filter bank is kept to a minimum by switching them off in every two-channel filter bank in the DFB tree structure and up-sampling the filters accordingly [25].
In our proposed scheme, the decomposition parameters are set to levels = [1,2,4]. The pyramidal filter is set to 'pyrexc' and the directional filter is set to 'vk' in the NSCT configuration. The frequency sub-bands obtained after applying the NSCT have the same size as the original source images, which means that each frequency coefficient corresponds to the pixel at the same location in the spatial domain; this characteristic guides the selection of a suitable fusion rule for each sub-band.
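To make the size of this decomposition concrete, the following minimal sketch counts the sub-bands implied by levels = [1,2,4]. It assumes the usual NSCT convention that a directional level l at a given pyramid scale yields 2^l directional sub-bands; the function name `nsct_subband_layout` is hypothetical and is not part of any NSCT toolbox.

```python
def nsct_subband_layout(levels):
    """Return the number of directional sub-bands at each pyramid scale.

    Assumes the common NSCT convention of 2**l directions for level l.
    """
    return [2 ** level for level in levels]


levels = [1, 2, 4]                        # decomposition parameters from the text
per_scale = nsct_subband_layout(levels)   # directional sub-bands per scale
total_high = sum(per_scale)               # all high-frequency sub-bands
coeffs_per_band = 256 * 256               # each sub-band keeps the source image size
total_coeffs = (total_high + 1) * coeffs_per_band  # +1 for the low-frequency sub-band
print(per_scale, total_high, total_coeffs)
```

Because no sub-sampling takes place, the storage cost grows linearly with the number of sub-bands, which is the price paid for shift invariance.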

B. Simplified Pulse-Coupled Neural Network
PCNN is a 2D, single-layer, laterally connected network of pulse-coupled neurons, with a 1:1 correspondence between the image pixels and the network neurons [27]. No training is required for the PCNN. The three main components of the PCNN are the receptive field, the modulation field and the pulse generator, as shown in Fig. 3. The output of each neuron is one of two states: firing or non-firing. A firing map is generated by accumulating the firing times of each neuron. The firing times of each neuron can be used as an activity level measurement, where a larger number of firing times indicates a more significant corresponding coefficient. The PCNN has several parameters with complex structures, and finding an optimal setting for these parameters is a major limitation to the automation and generalization of the PCNN [17]; for this reason, a reduced model of the pulse-coupled neural network is used instead. The equations of the reduced PCNN model are given by Eq. (1) to Eq. (6).
The indices $i$ and $j$ refer to the pixel/coefficient location in the image/sub-band, $k$ and $l$ refer to the displacement within the symmetric weights kernel around the image pixel, and $n$ refers to the current iteration. $F_{i,j}[n]$ and $L_{i,j}[n]$ are the feeding and linking inputs, respectively. $W_{i,j,k,l}$ denotes the kernel weights and $S_{i,j}$ is the external stimulus that motivates the neuron. $U_{i,j}[n]$ is the internal activity of the neuron and $\beta$ is the linking strength parameter. $Y_{i,j}[n]$ is the output of the neuron after applying the threshold to the internal activity. $\theta_{i,j}[n]$ is the dynamic threshold, where $V_\theta$ and $\alpha_\theta$ are the normalized constant and the time constant, respectively.
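For orientation, one commonly used form of the simplified PCNN is sketched below. It is given as an assumption rather than as the exact formulation of this paper: the symbol names follow standard PCNN notation ($F$ feeding input, $L$ linking input, $U$ internal activity, $Y$ output, $\theta$ dynamic threshold, $T$ accumulated firing times), the linking input is written without a decay term, and the equation numbers are matched to the references in the text only by their order.

```latex
\begin{align}
F_{i,j}[n] &= S_{i,j} \tag{1}\\
L_{i,j}[n] &= \sum_{k,l} W_{i,j,k,l}\, Y_{k,l}[n-1] \tag{2}\\
U_{i,j}[n] &= F_{i,j}[n]\,\bigl(1 + \beta\, L_{i,j}[n]\bigr) \tag{3}\\
Y_{i,j}[n] &=
  \begin{cases}
    1, & U_{i,j}[n] > \theta_{i,j}[n-1]\\
    0, & \text{otherwise}
  \end{cases} \tag{4}\\
\theta_{i,j}[n] &= e^{-\alpha_\theta}\,\theta_{i,j}[n-1] + V_\theta\, Y_{i,j}[n] \tag{5}\\
T_{i,j}[n] &= T_{i,j}[n-1] + Y_{i,j}[n] \tag{6}
\end{align}
```

In this form, a neuron fires when its internal activity exceeds the threshold left over from the previous iteration; the threshold then jumps by $V_\theta$ and decays exponentially until the next firing.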

C. Proposed Approach
Local average energy reflects the presence of image variations such as edges, contours and textures; it is therefore more expressive to motivate the PCNN with the local average energy than with a single pixel/coefficient value. Our transform-based approach employs the local average energy to motivate the PCNN in order to measure the contribution of each source image to the fused result. The shift-invariant NSCT is employed to decompose the source images into frequency sub-bands. The fusion itself is divided into two major steps: high frequency sub-bands fusion and low frequency sub-bands fusion. The block diagram of our proposed approach is shown in Fig. 4.

1) Low frequency sub-bands fusion:
The max selection fusion rule is applied directly to the absolute values of the low frequency sub-band (LFS) coefficients. The coefficient with the higher absolute value is selected as the fused image coefficient.
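The max selection rule for the low frequency sub-bands can be sketched as follows. This is a minimal illustration on nested lists, not the authors' MATLAB code; the function name `fuse_low_frequency` is hypothetical, and resolving ties in favor of image A is an assumption.

```python
def fuse_low_frequency(lfs_a, lfs_b):
    """Keep, at every location, the coefficient with the larger absolute value.

    lfs_a and lfs_b are equally sized 2-D lists of low-frequency coefficients.
    Ties are resolved in favor of lfs_a (an assumption).
    """
    return [
        [ca if abs(ca) >= abs(cb) else cb for ca, cb in zip(row_a, row_b)]
        for row_a, row_b in zip(lfs_a, lfs_b)
    ]


fused = fuse_low_frequency([[1, -5], [2, 0]], [[-3, 4], [2, -1]])
print(fused)
```

Note that the selected coefficient keeps its original sign; only the comparison uses the absolute value.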
2) High frequency sub-bands fusion:
Since the human visual system is sensitive to image variations such as edges, contours and textures, using the absolute value of the coefficient as the input to the PCNN may not be a wise choice. Motivating the PCNN neurons with features, rather than with raw data or single values (whether pixel values or frequency coefficients), is expected to be more accurate. Furthermore, such features act as an indicator of the significance of each source image. Local average energy is used as the image feature that motivates the neurons, and the PCNN is employed for activity level measurement. For each high frequency sub-band, the local average energy is calculated as given by Eq. (8). The local average energy is then used as the input to the PCNN. After running the PCNN for several iterations, the firing times of each neuron are calculated. The generated firing maps are used to select which coefficient will contribute to the fused result.
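As an illustration of the feature described above, the sketch below computes a local average energy map as the mean of the squared coefficients inside a sliding window centered on each location. This definition, the zero treatment of locations outside the sub-band, and the function name `local_average_energy` are assumptions made for the example; the exact form of Eq. (8) is not reproduced here.

```python
def local_average_energy(band, win=3):
    """Mean of squared coefficients in a win x win window around each location.

    band is a 2-D list of sub-band coefficients. Locations outside the
    sub-band contribute zero energy (an assumed border treatment).
    """
    rows, cols = len(band), len(band[0])
    half = win // 2
    energy = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for m in range(-half, half + 1):
                for n in range(-half, half + 1):
                    r, c = i + m, j + n
                    if 0 <= r < rows and 0 <= c < cols:
                        total += band[r][c] ** 2
            energy[i][j] = total / (win * win)
    return energy


# A single strong coefficient spreads its energy over the whole 3 x 3 window
energy_map = local_average_energy([[0, 0, 0], [0, 3, 0], [0, 0, 0]])
```

Unlike a single coefficient value, this measure responds to any strong coefficient in the neighborhood, which is why it highlights edges and textures rather than isolated values.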
3) The proposed MIF algorithm steps:
a) Decompose the pre-registered source images into low/high frequency sub-bands using the NSCT; the size of each sub-band is equal to the size of the source images.
b) Apply the max selection rule to the Low-Frequency Sub-bands (LFSs) as described by Eq. (7).
c) Calculate the local average energy for each High-Frequency Sub-band (HFS) as described by Eq. (8), using a sliding window over the coefficients of each HFS.
d) Motivate the PCNN using the local average energy calculated for every HFS, then calculate the output of each neuron using Eqs. (1) to (5) and generate the firing maps by Eq. (6).
e) Apply the high frequency fusion rule based on the neurons' firing maps. Coefficients that correspond to the neurons with higher firing times are selected to contribute to the resultant fused image, as illustrated by Eq. (9).
f) Apply the inverse NSCT to obtain the fused image.
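Steps d) and e) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel W and β = 0.2 are taken from the experimental setup, while `alpha_theta`, `v_theta`, `n_iter`, the initial threshold of 1.0, the linking input without a decay term, and the tie-breaking in favor of image A are hypothetical choices. The function names are also hypothetical.

```python
import math

# 3 x 3 linking kernel W reported in the experimental setup
W = [[0.707, 1.0, 0.707],
     [1.0, 0.0, 1.0],
     [0.707, 1.0, 0.707]]


def pcnn_firing_times(stimulus, beta=0.2, alpha_theta=0.2, v_theta=20.0, n_iter=200):
    """Run a simplified PCNN on a 2-D stimulus and return the firing map.

    alpha_theta, v_theta, n_iter and the initial threshold are assumed values.
    """
    rows, cols = len(stimulus), len(stimulus[0])
    y_prev = [[0] * cols for _ in range(rows)]     # outputs of the previous iteration
    theta = [[1.0] * cols for _ in range(rows)]    # dynamic thresholds
    times = [[0] * cols for _ in range(rows)]      # accumulated firing times
    decay = math.exp(-alpha_theta)
    for _ in range(n_iter):
        y_new = [[0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                # Linking input: weighted sum of the neighbors' previous outputs
                link = 0.0
                for k in (-1, 0, 1):
                    for l in (-1, 0, 1):
                        r, c = i + k, j + l
                        if 0 <= r < rows and 0 <= c < cols:
                            link += W[k + 1][l + 1] * y_prev[r][c]
                # Internal activity: stimulus modulated by the linking input
                activity = stimulus[i][j] * (1.0 + beta * link)
                fired = 1 if activity > theta[i][j] else 0
                y_new[i][j] = fired
                # Threshold decays and jumps by v_theta after a firing
                theta[i][j] = decay * theta[i][j] + v_theta * fired
                times[i][j] += fired
        y_prev = y_new
    return times


def fuse_high_frequency(hfs_a, hfs_b, energy_a, energy_b, **pcnn_kwargs):
    """Select, per location, the coefficient whose neuron fired more often."""
    t_a = pcnn_firing_times(energy_a, **pcnn_kwargs)
    t_b = pcnn_firing_times(energy_b, **pcnn_kwargs)
    return [
        [hfs_a[i][j] if t_a[i][j] >= t_b[i][j] else hfs_b[i][j]
         for j in range(len(hfs_a[0]))]
        for i in range(len(hfs_a))
    ]
```

In practice the stimulus would be the local average energy map of each high frequency sub-band, typically normalized to a common range so that the firing maps of the two source images are comparable.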

III. RESULTS AND DISCUSSION
The proposed algorithm was implemented using MATLAB. Source images are of size 256 × 256. The PCNN parameters were configured as follows: kernel size k × l = 3 × 3, W = [0.707 1 0.707; 1 0 1; 0.707 1 0.707] and β = 0.2. The sliding window of the local average energy is 3 × 3. To evaluate the quality of the output fused images, the following quality metrics are used:

A. Entropy
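Entropy (EN) is a measure of the information content present in an image. It is described by the equation:

$$EN = -\sum_{l=0}^{L-1} p(l)\,\log_2 p(l)$$

where $p(l)$ is the probability of gray level $l$ in the image and $L$ is the number of gray levels.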
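A minimal sketch of this metric is shown below, assuming the standard Shannon entropy of the gray-level histogram with a base-2 logarithm. The function name `image_entropy` is hypothetical.

```python
import math
from collections import Counter


def image_entropy(image):
    """Shannon entropy (in bits) of the gray-level distribution of a 2-D list."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    counts = Counter(pixels)  # histogram of the gray levels that occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A constant image has zero entropy, while an image whose gray levels are all equally likely reaches the maximum of log2 of the number of levels.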

B. Standard Deviation (STD)
Standard deviation is used to measure the image contrast, where a higher standard deviation value indicates better contrast.
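As a sketch, the standard deviation of an image could be computed as follows. The population form (division by the number of pixels) is assumed here, and the function name `image_std` is hypothetical.

```python
import math


def image_std(image):
    """Population standard deviation of the gray levels of a 2-D list."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return math.sqrt(variance)
```

A flat image gives zero, and the value grows as the gray levels spread further from their mean, which is why it serves as a contrast indicator.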

C. Mutual Information (MI)
Mutual information measures how much information is shared between two images. Given images A and B, the mutual information preserved by the fused image F is computed as the sum of the mutual information between F and A, denoted $MI_{FA}$, and the mutual information between F and B, denoted $MI_{FB}$, as illustrated by Eq. (11):

$$MI = MI_{FA} + MI_{FB} \tag{11}$$

A larger value of MI indicates that the fused image preserves a significant amount of information from both input images.
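The sum described above can be sketched as follows, assuming the standard histogram-based definition of mutual information with a base-2 logarithm. The function names `mutual_information` and `fusion_mi` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equally sized 2-D lists."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    joint = Counter(zip(xs, ys))       # joint gray-level histogram
    px, py = Counter(xs), Counter(ys)  # marginal histograms
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def fusion_mi(image_a, image_b, fused):
    """Information the fused image shares with both source images."""
    return mutual_information(fused, image_a) + mutual_information(fused, image_b)
```

For identical images the mutual information equals the entropy of the image, and for statistically independent images it is zero.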

D. Edge Association ($Q^{AB/F}$)
An objective performance measure for image fusion was proposed by Xydeas and Petrović [28]. It measures how much of the edge information present in the source images is transferred to the fused image.

E. Universal Image Quality Index
This is a universal objective image quality index proposed by Wang and Bovik [29]. It is a combination of three elements: loss of correlation, luminance distortion and contrast distortion. Loss of correlation measures how strongly images A and F are correlated, luminance distortion measures how close the mean luminance of image A is to that of image F, and contrast distortion measures the degree of similarity between the contrast of images A and F. The index is calculated between each source image and the fusion result.

For the three sets of medical source images in Fig. 5, a detailed quantitative evaluation using the previously mentioned quality metrics is presented in Table 1. The best results obtained are formatted in bold in Tables 1 and 2.
Table 2 compares the performance of the proposed technique against other existing MIF techniques using the images of Set3 as the source images to be fused. Fig. 6 shows the visual fusion results produced by the compared MIF methods.

The fused images obtained from the three sets combine the information from both corresponding source images, as shown in Fig. 5(f1), (f2) and (f3). The fused image of Set1 combines the bone structure of the CT image (a1) with the soft tissues of the MR image (b1). In Set2, the lesion that appears as a black hole in the MR image (b2) is apparent in the fused image. Similarly, the fused image of Set3 combines both the bone structure of the CT image (a3) and the anatomical structure of the soft tissues of the MR image (b3).

In Table 1, the quality of each fusion result is compared with the quality of the corresponding pair of source images. Columns 3 and 4 show the entropy and the standard deviation, respectively, for each pair of source images, while the remaining columns show the performance evaluation of the fusion results for each set through the different quality metrics. The higher entropy values of the fusion results indicate better information content than that of the source images that participated in the fusion. Similarly, the higher standard deviation value of the fusion result of Set3 indicates better contrast and clarity.

Table 2 shows that the proposed MIF algorithm has the highest entropy, MI and $Q^{AB/F}$ values, while the NSCT-MSF-PCNN method [16] has the highest STD and universal image quality index values. The higher values of entropy and MI indicate that the fused image produced in this paper preserves more information from the source images and has higher information content. The fused image obtained by the NSCT-MSF-PCNN method [16] is visually very similar to the fused image produced by the proposed approach; however, the quantitative analysis shows that the proposed algorithm provides higher EN, MI and $Q^{AB/F}$ values than the NSCT-MSF-PCNN method [16]. The m-PCNN method [19] in Fig. 6(a) suffers from contrast reduction because it uses the normalized value of the coefficient as the input to the PCNN. A close look at Fig. 6(b) shows that the NSCT-SF-PCNN method [20] lost a large amount of image detail. In Fig. 6(e), where the NSCT was replaced with the DWT, the fused result reveals unwanted image degradation, unlike the fused result of the proposed method. Careful inspection of the result of the proposed approach in Fig. 6(d) reveals very fine details that are not apparent in the visual result of NSCT-MSF-PCNN [16] in Fig. 6(c).

IV. CONCLUSION
Medical images obtained from different modalities are fused to support the radiologist's task in treatment and diagnosis. Since fusing medical images manually is time consuming and subject to human error, this paper presents an MIF approach based on the NSCT and a local average energy-motivated PCNN to fuse medical images. The results show that it overcomes common drawbacks of the conventional methods such as contrast reduction, edge blurring and unwanted degradations. Using the local average energy as the stimulus to the PCNN is a promising choice, since it does not rely only on the single value of one pixel/coefficient but also takes into consideration the values of the neighboring pixels. Local average energy extracts features such as edges, contours and textures, to which the human visual system is more sensitive. Selecting the NSCT to transform the source images into the frequency domain is a good choice because of its shift-invariant characteristic, which overcomes the pseudo-Gibbs phenomena. Although local average energy showed promising results, we cannot claim that it is the best stimulus for the PCNN to measure the contribution or significance of the source images. Other measurements of activity level could be used instead of the local average energy to motivate the PCNN. As future work, the method proposed in this paper can be extended to fuse multi-focus images, infrared and visible images and remote sensing images. Moreover, the behavior of the algorithm will be tested on noisy images to see how it performs in the presence of noise.

Fig. 2. (a) NSCT structure, which consists of a bank of filters that splits the 2-D frequency plane into frequency and directional sub-bands. (b) Approximation of the ideal frequency partitioning obtained by the NSCT.

Fig. 3. PCNN's neuron structure

( ) indicates the low frequency coefficient of image A at location (i, j) in sub-band S, and the same applies for ( ) and ( ).

Fig. 4. Block diagram of the proposed MIF technique

where C is the coefficient value at location (m, n) in any frequency sub-band.

Fig. 5. Three pairs of source medical images (left two images) with the corresponding fusion result of each pair (last column)

Fig. 5 shows three sets of source medical images [30], captured from different modalities, that are used in evaluating the proposed approach. Each set is a pair of source images, and the corresponding visual fusion result is shown beside each pair. In Set1, the CT image in Fig. 5(a1) shows the calcification, while the MR image in Fig. 5(b1) captures several focal lesions. In Set2, the MR images in Fig. 5(a2) and Fig. 5(b2) reveal a lesion in the frontal lobe. In Set3, the CT image in Fig. 5(a3) indicates a medial left occipital infarct involving the left side of the splenium of the corpus callosum, and the MR image in Fig. 5(b3) reveals only mild narrowing of the left posterior cerebral artery.

TABLE I. PERFORMANCE EVALUATION OF THE PROPOSED MIF ALGORITHM USING SET1, SET2 AND SET3