Research on Image Sharpness Enhancement Technology based on Depth Learning

Abstract: Image technology is widely used in security, traffic, monitoring, and other social activities. However, images carrying detailed information suffer feature distortion from various external physical factors during acquisition, transmission, and storage, which degrades image quality and clarity. Resolution determines the definition of an image, and super-resolution reconstruction is the process of transforming a low-resolution image into a high-resolution image. To enhance image clarity, this study reviews the advantages and disadvantages of the Super-Resolution Convolutional Neural Network (SRCNN) and Fast Super-Resolution Convolutional Neural Network (FSRCNN) models and then constructs an image super-resolution method based on a dual super-resolution CNN (DSRCNN). The model consists of two sub-network blocks, an enhancement block, and a feature purification block. The model first uses two Convolutional Neural Networks (CNN) to obtain complementary low-frequency information that improves the model's learning ability; next, it employs the enhancement block to fuse the image features of the two paths via residual operation and sub-pixel convolution to prevent the loss of low-resolution image information; finally, it employs the feature purification block to refine high-frequency information so that it more accurately represents the predicted high-quality image. The PSNR and SSIM of the DSRCNN model reach 33.43 dB and 0.9157, respectively.


I. INTRODUCTION
Pictures captured in daily life are often affected by lighting, occlusion, poor shooting equipment, and other factors. Such images suffer from poor visibility, low contrast, high noise, and other shortcomings [1]. These shortcomings keep image definition from reaching the ideal state, which disrupts the normal operation of some social activities. Image sharpness is directly related to image resolution: high-resolution images contain more detail and are of superior quality [2]. Processing a series of low-resolution images to obtain high-resolution images is called super-resolution reconstruction. Traditional image super-resolution reconstruction is mainly based on interpolation, machine learning, and multiple images. With ongoing advances in computing technology, researchers have come to value super-resolution reconstruction based on deep learning algorithms [3]. A deep learning algorithm learns the high-level semantic features of data through multi-level nonlinear transformations and can achieve reasonable and effective prediction and analysis of data [4]. Classical deep learning algorithms for this task include SRCNN, FSRCNN, and VDSR. These super-resolution reconstruction techniques can improve image clarity, but they still have limitations in operating efficiency and visual quality [5]. Therefore, this study proposes an image definition enhancement technique based on deep learning to improve the quality of images with inadequate definition, aiming to provide a reference for image applications in social activities such as security, traffic, and monitoring.

II. RELATED WORK
At present, deep learning is applied very broadly and has reached many research fields. Wang S et al. used a deep learning algorithm to recognize tumor images and described in detail the whole process of applying deep learning to pathological image segmentation. The research shows that deep learning has clear advantages in recognition accuracy, overall computational efficiency, and generalization performance. In addition, integrating deep learning into pathological image analysis is a far-reaching experiment in medicine [6]. Huang and other researchers found that traditional machine learning methods had limited ability to predict landslide-prone environmental factors, so they proposed an FC-SAE algorithm based on deep learning. To predict the category, the system uses both high-level and low-level information about environmental factors. The simulation results show that the prediction rate and accuracy of FC-SAE reach 85.4% and 85.2%, higher than other algorithms of the same type, so it has good application prospects [7]. Sharma and other scholars explored deep neural networks based on machine learning. To study the use of machine learning and deep learning in different sectors, they examined and improved current machine learning and deep learning algorithms. The final survey results show that the accuracy of deep learning algorithms has improved significantly [8]. Zhang and other researchers studied deep learning algorithms in the field of geological engineering. Their paper mainly examines the application of FNN, RNN, CNN, and GAN deep learning algorithms in geotechnical engineering, compiles a detailed summary of the literature, cases, and algorithms, and discusses the prospects of deep learning in geotechnical engineering [9].
Aslan and others found that malware is constantly evolving and seriously endangers network security. In response, they proposed using a deep learning algorithm to detect malicious software. The suggested algorithm is a novel hybrid architecture that combines two comprehensive pre-trained network models, which distinguishes it from conventional approaches. The experimental findings demonstrate that the model can classify malware with an accuracy of 97.78% [10]. Bamisile and other scholars used machine learning and deep learning algorithms to predict solar radiation. The SVR and RF machine learning models and the ANN, CNN, and RNN deep learning models are each used in the experiment to forecast solar radiation, and their predictions are compared. The final results show that the deep learning models are more accurate and take less time than the machine learning models in solar radiation prediction [11].
www.ijacsa.thesai.org
Many researchers have also contributed to research on image enhancement technology. Zhang and other researchers proposed an underwater image enhancement approach based on MLLE to address the poor color and low visibility of underwater photos. The method first adjusts the color and detail of the image according to the principle of minimum color loss, then uses the integral map to calculate the mean and variance of image blocks, and finally introduces a color balance strategy to balance the color differences of the image. The experimental results demonstrate that this technique is well suited to image enhancement [12]. Ike C and other researchers proposed a single-image super-resolution method based on wavelets. The experiment combines the wavelet transform with a local regularization anchored domain regression model and applies it to image restoration. Experimental results show that the proposed method is effective and superior in denoising, deblurring, and super-resolution reconstruction tasks [13]. Zhang and other scholars proposed an improved RDH-CE model that enhances image contrast while improving data embedding. The model offers an adaptive pixel modification technique and extracts several features to aid K-means clustering. According to the final test results, this technique achieves higher accuracy in complex assessments [14]. Liu and other researchers used the MSRCR algorithm and guided filtering to defog images. This approach provides a better visual effect by addressing poor image lighting, color distortion, and edge loss while preserving the color saturation of the image [15]. Su et al. proposed a band-restricted biphasic technique to enhance the quality of the reconstructed image. This technique removes the precise spatial frequency components affected by the spatial offset of DPHs, which prevents the image quality from being compromised. According to the simulation results, the strategy improves the definition of the reconstructed image by 36.84% [16]. To solve the serious distortion of underwater images, Liu et al. proposed an image enhancement method based on object-guided dual adversarial contrastive learning, which uses contrastive cues in the training phase. The experiment demonstrates that the suggested strategy enhances the visual quality of the image and increases the detector's precision [17].
Deep learning is favored by researchers in various fields for its convenience and powerful performance, and its applications continue to widen. Research on image enhancement technology has also continued in recent years, with many scholars exploring it using different algorithms. In summary, both deep learning and image enhancement technology have made considerable progress. Applying deep learning to image definition enhancement is an innovative attempt, which motivates this study.

A. Research on Image Super-resolution Model based on Deep Learning
The resolution of an image determines its definition. Super-resolution (SR) refers to the technique of processing low-resolution (LR) images to produce high-resolution (HR) images, and it mainly includes the reconstruction of a single image and the reconstruction of multiple images [18][19]. The Super-Resolution Convolutional Neural Network (SRCNN) is a milestone in applying deep learning to super-resolution image reconstruction [20]. SRCNN first preprocesses the low-resolution image by bicubic interpolation and then uses a three-layer convolutional neural network to reconstruct the image. Fig. 1 illustrates the three components of the SRCNN model: patch extraction and representation, nonlinear mapping, and reconstruction. The three modules correspond to three convolution operations in the network.
Patch extraction and representation is the first "convolution + activation" operation, and its functional expression is:
$$f_1(B) = \max(0, W_1 * B + b_1) \quad (1)$$
In Formula (1), $B$ is the interpolated low-resolution input image, $*$ denotes convolution, $W_1$ represents the weight of the convolution kernel of the patch extraction module, and $b_1$ represents the offset of that convolution kernel. $\max(0, \cdot)$ is the ReLU activation function, and $f_1(B)$ represents patch extraction. The nonlinear mapping of the network is also a "convolution + activation" operation, and its functional expression is:
$$f_2(B) = \max(0, W_2 * f_1(B) + b_2) \quad (2)$$
In Formula (2), $W_2$ represents the weight of the convolution kernel of the nonlinear mapping module, $b_2$ represents the offset of that convolution kernel, and $f_2(B)$ represents the nonlinear mapping. The network reconstruction process is the third convolution operation. Unlike the two modules above, it does not use an activation function. Its functional expression is:
$$f(B) = W_3 * f_2(B) + b_3 \quad (3)$$
In Formula (3), $W_3$ represents the weight of the convolution kernel of the reconstruction module, $b_3$ represents the offset of that convolution kernel, and $f(B)$ represents the reconstruction. The advantage of the SRCNN model is that it contains fewer parameters than other deep convolutional neural networks. With fewer parameters to learn, the model requires far fewer calculations, which improves its overall operating efficiency. The loss function of SRCNN is the mean squared error between the super-resolution image $f(B)$ and the original high-resolution image $A$:
$$L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \lVert f(B_i; \theta) - A_i \rVert^2 \quad (4)$$
In Formula (4), $n$ represents the number of samples in the training set, $\theta = \{W_1, W_2, W_3, b_1, b_2, b_3\}$ denotes the network parameters, $f(B_i; \theta)$ represents the reconstructed image for the $i$-th sample, and $A_i$ represents the corresponding high-resolution image. The mean squared error is averaged over a batch and is a loss commonly used by image super-resolution algorithms.
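The three SRCNN stages described above (two "convolution + activation" steps followed by a plain convolution) and the mean squared error loss can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch under stated assumptions: the function names `conv2d`, `srcnn_forward`, and `mse_loss` are hypothetical, the convolution is unpadded with stride 1, the bicubic preprocessing step is omitted, and the toy kernels are far smaller than those of a trained network.

```python
def conv2d(maps, kernels, biases, relu=True):
    """Multi-channel 'valid' convolution on nested lists (no padding, stride 1).

    maps:    list of C_in feature maps, each an H x W list of lists
    kernels: kernels[o][c] is the k x k kernel from input channel c to output o
    biases:  one offset per output channel
    relu:    apply max(0, .) to each output value when True
    As in most CNN libraries, the kernel is not flipped (cross-correlation).
    """
    k = len(kernels[0][0])
    h, w = len(maps[0]), len(maps[0][0])
    out = []
    for o, bias in enumerate(biases):
        fmap = []
        for y in range(h - k + 1):
            row = []
            for x in range(w - k + 1):
                acc = bias
                for c, m in enumerate(maps):
                    for dy in range(k):
                        for dx in range(k):
                            acc += kernels[o][c][dy][dx] * m[y + dy][x + dx]
                row.append(max(0.0, acc) if relu else acc)
            fmap.append(row)
        out.append(fmap)
    return out


def srcnn_forward(image, params):
    """Three-stage SRCNN-style pass on one single-channel image."""
    (w1, b1), (w2, b2), (w3, b3) = params
    f1 = conv2d([image], w1, b1, relu=True)    # patch extraction + ReLU
    f2 = conv2d(f1, w2, b2, relu=True)         # nonlinear mapping + ReLU
    return conv2d(f2, w3, b3, relu=False)[0]   # reconstruction, no activation


def mse_loss(preds, targets):
    """Average over samples of the squared L2 distance between image pairs."""
    total = 0.0
    for pred, target in zip(preds, targets):
        total += sum((p - t) ** 2
                     for prow, trow in zip(pred, target)
                     for p, t in zip(prow, trow))
    return total / len(preds)


# Toy usage: 1x1 identity kernels, so the network reduces to ReLU of the input.
identity = ([[[[1.0]]]], [0.0])
out = srcnn_forward([[1.0, -2.0], [3.0, 4.0]], [identity, identity, identity])
loss = mse_loss([out], [[[1.0, 1.0], [3.0, 2.0]]])
```

Keeping the last stage free of an activation lets the output take any real value, which matches the description of the reconstruction module above.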
Although the classic SRCNN network can restore low-resolution images at different magnification coefficients, it cannot meet real-time requirements. The FSRCNN network was proposed to address this limitation. FSRCNN consists of five parts: feature extraction, compression, mapping, expansion, and transposed convolution [21][22]. The network first extracts and compresses the image features and then performs nonlinear mapping with an activation function. Unlike the activation function of SRCNN, the activation function of FSRCNN also passes a gradient on the negative half axis. It can be written as:
$$f(x_i) = \max(x_i, 0) + a_i \min(0, x_i) \quad (5)$$
In Formula (5), $x_i$ represents the input signal of the activation function in the $i$-th channel, and $a_i$ represents the coefficient of the negative part. This activation function gives more stable performance. The computational complexity of the FSRCNN model is expressed in Formulas (6) and (7). Compared with the SRCNN network, the FSRCNN network is improved mainly in three aspects: the change of feature dimension, the use of smaller convolution kernels with more mapping layers, and the sharing of mapping layers. Although SRCNN and FSRCNN both improve the quality of super-resolution images, they are generally slow and not suitable for real application scenarios. The application of deep learning algorithms to improving super-resolution image quality therefore needs further study.
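The activation described above, which keeps a nonzero slope on the negative half axis, corresponds to the parametric ReLU. A minimal sketch follows; the names `prelu` and `prelu_map` are hypothetical, and the coefficient value 0.25 in the usage lines is an arbitrary illustration rather than a value from the paper.

```python
def prelu(x, a):
    """Parametric ReLU: x for x >= 0, a * x for x < 0."""
    return max(x, 0.0) + a * min(0.0, x)


def prelu_map(fmap, a):
    """Apply PReLU with one shared coefficient `a` to a 2-D feature map."""
    return [[prelu(v, a) for v in row] for row in fmap]


# With a = 0 the function reduces to the ordinary ReLU used by SRCNN.
positive = prelu(2.0, 0.25)
negative = prelu(-2.0, 0.25)
as_relu = prelu(-2.0, 0.0)
activated = prelu_map([[1.0, -4.0]], 0.25)
```

Because the negative branch has slope `a` instead of zero, units with negative inputs still receive a gradient during training.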

B. Improvement of Image Definition Enhancement Model based on Fusion of Deep Learning Algorithms
In this study, a dual super-resolution CNN (DSRCNN) is designed based on the SRCNN and FSRCNN models. The network consists of two sub-network blocks, an enhancement block, and a feature purification block. The two sub-network blocks enhance the super-resolution performance of the network by extracting complementary low-frequency features; the enhancement block gathers several high-frequency features via residual operation and sub-pixel convolution; and the feature purification block uses multiple stacked convolutions to refine the high-frequency features [23][24]. The extraction of complementary low-frequency features by the two sub-network blocks of the DSRCNN model is expressed in Formula (9), where $O_{TSEB}^{k}$ represents the output of the $k$-th sub-network, $O_{L_i}$ represents the output of the $i$-th convolution layer, and $R$ represents the ReLU function. The purpose of the enhancement block is to combine the original image information with deeper information. This module first obtains the original image information and the deeper information, and then obtains the features of the first layer in each sub-network, as expressed in Formula (10). Formula (12) constructs a five-layer feature purification block, which contains four convolution operations with activation functions and is used to build the high-quality predicted image. In Formula (12), $C$ represents the convolution operation and $I_{SR}$ represents the predicted SR image. The execution procedure of DSRCNN can be stated as:
$$I_{SR} = f_{FLB}(f_{EB}(f_{TSEB}(I_{LR}))) \quad (13)$$
In Formula (13), $f_{TSEB}$ represents the function of the two sub-networks, $f_{EB}$ represents the function of the enhancement block, $f_{FLB}$ represents the function of the feature purification block, and $I_{LR}$ represents the given LR image. DSRCNN first obtains complementary low-frequency information through the dual CNN to improve the learning ability of the model. Then, to prevent the loss of low-resolution image information, the enhancement block fuses the image features of the two paths using residual operation and sub-pixel convolution.
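The sub-pixel convolution used by the enhancement block ends with a rearrangement step that turns $C \cdot r^2$ low-resolution feature maps into $C$ maps that are $r$ times larger in each direction. The sketch below illustrates only that rearrangement. It assumes the channel ordering commonly used by deep learning libraries, since the text does not state which ordering DSRCNN uses, and the name `pixel_shuffle` is hypothetical here.

```python
def pixel_shuffle(maps, r):
    """Rearrange C*r*r maps of size H x W into C maps of size (H*r) x (W*r).

    Assumed ordering: out[c][y*r + i][x*r + j] = maps[c*r*r + i*r + j][y][x]
    """
    if len(maps) % (r * r) != 0:
        raise ValueError("number of maps must be divisible by r*r")
    c_out = len(maps) // (r * r)
    h, w = len(maps[0]), len(maps[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = maps[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        # Each source map fills one position of every r x r cell.
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


# Four 1x1 maps become one 2x2 map when the upscaling factor is 2.
upscaled = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2)
```

Because the upscaling happens only in this final rearrangement, all preceding convolutions run at low resolution, which is the usual motivation for sub-pixel convolution.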
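Finally, the feature purification block refines the high-frequency information so that it more accurately represents the predicted high-quality image [25]. The above process is described in detail in Fig. 2.
The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used to evaluate the reconstructed images. PSNR is defined through the mean squared error (MSE) between a reconstructed image $X$ and a reference image $Y$ of size $M \times N$:
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( X(i,j) - Y(i,j) \right)^2 \quad (15)$$
In Formula (15), $i$ indexes the $i$-th row of the image and $j$ indexes the $j$-th column. The formula of PSNR can be expressed as:
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{MAX^2}{\mathrm{MSE}} \right) \quad (16)$$
where $MAX$ is the maximum possible pixel value (255 for 8-bit images). PSNR is measured in dB. The higher the value, the less the image is distorted. The PSNR index bases its judgment of image quality on the pixel-wise error and ignores the visual properties of the human eye. Because the human eye is highly sensitive to spatial structure, brightness, color, and so on, the evaluation results of PSNR are often inconsistent with subjective human perception. Structural similarity compares the target images $X$ and $Y$ in terms of luminance, contrast, and structure. $\mu_X$ and $\mu_Y$ respectively stand for the mean values of images $X$ and $Y$, $\sigma_X^2$ and $\sigma_Y^2$ respectively stand for the variances of images $X$ and $Y$, and $\sigma_{XY}$ stands for the covariance of images $X$ and $Y$. The formula is as follows:
$$l(X,Y) = \frac{2\mu_X \mu_Y + c_1}{\mu_X^2 + \mu_Y^2 + c_1}, \quad c(X,Y) = \frac{2\sigma_X \sigma_Y + c_2}{\sigma_X^2 + \sigma_Y^2 + c_2}, \quad s(X,Y) = \frac{\sigma_{XY} + c_3}{\sigma_X \sigma_Y + c_3} \quad (18)$$
In Formula (18), $c_1$, $c_2$, and $c_3$ are three constants. They do not multiply any variable, but they cannot be removed because they keep the ratios stable when a denominator is close to zero. The formula of SSIM can be expressed as:
$$\mathrm{SSIM}(X,Y) = l(X,Y) \cdot c(X,Y) \cdot s(X,Y)$$
SSIM has a value range of 0 to 1. Image distortion decreases as the value increases, and image quality and clarity increase. In the calculation process, the image is also divided into local windows, and the structural similarity values of the windows are averaged to obtain the global structural similarity.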
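The two evaluation metrics discussed above can be sketched as follows. This is a simplified illustration rather than the evaluation code used in the experiments: the names `mse`, `psnr`, and `ssim_global` are hypothetical, SSIM is computed over one global window instead of averaged local windows, population statistics are used, and the constants $c_1 = (0.01\,MAX)^2$, $c_2 = (0.03\,MAX)^2$, $c_3 = c_2 / 2$ are commonly used defaults that the text does not specify.

```python
import math


def mse(x, y):
    """Mean squared error between two equally sized grayscale images."""
    m, n = len(x), len(x[0])
    return sum((x[i][j] - y[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)


def psnr(x, y, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    err = mse(x, y)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / err)


def ssim_global(x, y, max_val=255.0):
    """SSIM over a single window covering the whole image (simplification)."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    # Assumed default constants; they stabilise the three ratios.
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    c3 = c2 / 2.0
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure


reference = [[0.0, 255.0], [255.0, 0.0]]
slightly_off = [[2.0, 253.0], [255.0, 0.0]]
badly_off = [[60.0, 200.0], [180.0, 90.0]]
scores = (psnr(reference, slightly_off), psnr(reference, badly_off))
```

In this toy comparison the lightly perturbed image scores a higher PSNR than the heavily perturbed one, in line with the interpretation that larger PSNR values indicate less distortion.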

IV. PERFORMANCE VERIFICATION OF IMAGE DEFINITION ENHANCEMENT MODEL BASED ON DSRCNN
To verify the image super-resolution reconstruction performance of the model more accurately, four datasets (Set5, Set14, BSD100, and Urban100) were selected for the experiment, and comparison experiments were carried out at image magnification coefficients of 2, 3, and 4. Set5 consists of five images: children, birds, butterflies, flowers, and women. Set14 consists of 14 images and is often used to evaluate the effectiveness of image super-resolution models. The BSD100 dataset contains 100 images of everyday scenes, which are often used for image denoising and super-resolution testing. Urban100 is an image dataset with a street-view theme, in which the images show complex color characteristics and the objects have clear shapes and corners. The images in Set5 and Set14 were taken under the same conditions. BSD100 and Urban100 both provide 100 images in natural color. The image data is chosen from the DIV2K dataset, and the experimental training set and test set each include 100 natural images. Table I shows the settings of the other experimental parameters.
Table II compares the PSNR and SSIM values of different modules of DSRCNN on the Urban100 dataset at a magnification factor of 2. The results demonstrate that local residual learning is effective in improving model performance, as the PSNR and SSIM values of the DSRCNN network are better than those of the DSRCNN model without residual learning. The full DSRCNN model is also superior to DSRCNN with only one sub-network, so both sub-networks contribute to the performance improvement. The PSNR value of DSRCNN with the enhancement block is 0.454 dB higher than that of DSRCNN without the enhancement block, which verifies that the enhancement block improves network performance. The PSNR and SSIM values of the DSRCNN model with the feature purification block are 0.16 dB and 0.011 higher, respectively, than those of the DSRCNN model without it, which shows that the feature purification block also improves network performance. All three modules of the proposed DSRCNN model therefore contribute to its performance, and the model can improve image quality.
To better assess the performance of the proposed DSRCNN model, the experiment compares it not only with the SRCNN model but also with six other popular deep learning models, a total of 10 models, and analyzes their performance. Fig. 5 displays the changes in PSNR and SSIM values for several models on the Set14 dataset at image magnifications of 2, 3, and 4. When the image is magnified 2 times, the PSNR value of the DSRCNN model is 33.43 dB and its SSIM value is 0.9157, both the highest among all models. When the magnification is 3 times, the PSNR of the DSRCNN model is 30.24 dB and its SSIM value is 0.8402, again the highest among all models. When the magnification is 4 times, the PSNR value of the DSRCNN model is 28.46 dB and the SSIM value is 0.7796. Although both values have decreased, they remain the highest of all models. The DSRCNN model therefore performs well on the Set14 dataset. Fig. 6 displays the variations in PSNR and SSIM values for several models on the BSD100 dataset at image magnifications of 2, 3, and 4. When the image magnification is 2 times, the PSNR value of the DSRCNN model is 32.05 dB and the SSIM value is 0.8978. When the image magnification is 3 times, the PSNR value of the DSRCNN model is 29.01 dB and the SSIM value is 0.8029, ranking first among all models. When the image magnification is 4 times, the PSNR value of the DSRCNN model is 27.50 dB and the SSIM value is 0.7341, which also rank first. This shows that the DSRCNN model still outperforms the other models on the BSD100 dataset. Fig. 7 displays the variations in PSNR and SSIM values between models on the Urban100 dataset at image magnifications of 2, 3, and 4. When the image magnification is 2 times, the PSNR value of the DSRCNN model is 31.83 dB and the SSIM value is 0.9252, the highest among all models. When the image magnification is 3 times, the PSNR value of the DSRCNN model is 27.76 dB and the SSIM value is 0.8483, again the highest among all models. When the image magnification is 4 times, the PSNR value of the DSRCNN model is 25.94 dB and the SSIM value is 0.7815, which still rank first among all models. This shows that the DSRCNN model also performs well on the Urban100 dataset. Fig. 8 shows the visual results of different models on the Urban100 dataset at a magnification factor of 4; the data in the figure show that the PSNR and SSIM values are highest for the DSRCNN model. To test the effectiveness of the DSRCNN model based on the deep learning algorithm, four distinct datasets were chosen. The results show that the PSNR and SSIM values of the DSRCNN model vary across datasets, but among the 10 chosen models its overall image processing performance is the best. This shows that the DSRCNN model can make the final image less distorted, improve its resolution, and thus enhance image clarity.
On the Set14 dataset at a magnification of 2, the DSRCNN model has the highest PSNR and SSIM values, 33.43 dB and 0.9157, respectively. Therefore, the DSRCNN model can effectively process images, reduce image distortion, and improve resolution, enhancing image clarity. This study largely achieved its purpose. Future work can consider effective solutions to the visual shape differentiation problem in the DSRCNN model.