Image Super-Resolution using Generative Adversarial Networks with EfficientNetV2

Abstract: Image super-resolution transforms an image from low resolution to higher resolution to obtain more detailed information for identifying targets. Super-resolution has potential applications in various domains, such as medical image processing, crime investigation, remote sensing, and other image-processing application domains. The goal of super-resolution is to obtain an image with minimal mean square error and improved perceptual quality. Therefore, this study introduces a perceptual loss minimization technique through efficient learning criteria. The proposed image reconstruction technique uses the image super-resolution generative adversarial network (ISRGAN), in which the discriminator of the ISRGAN is learned using the EfficientNet-v2 to obtain a better image quality. The proposed ISRGAN with the EfficientNet-v2 achieved minimal losses of 0.02, 0.1, and 0.015 at the generator, discriminator, and self-supervised learning, respectively, with a batch size of 32. The minimal mean square error and mean absolute error are 0.001025 and 0.00225, and the maximal peak signal-to-noise ratio and structural similarity index measure obtained are 45.56985 and 0.9997, respectively.


I. INTRODUCTION
With the rapid development of information technology (IT) and the boom of Internet technology, image- and signal-based information processing is used by an enormous population, and image processing is a crucial component of it. Here, the role of image super-resolution (SR) is significant for image processing-based applications [1], [2]. Image SR is the transformation of an image from low resolution (LR) to high resolution (HR) to obtain an enhanced-quality image. Medicine, agriculture, industry, and military applications utilize the SR technique due to its high practicability [3], [4]. In artificial intelligence, SR is crucial for various processes [5], such as public security, remote sensing imaging, medical imaging, and image compression, using the single-image SR criteria [6], [7]. Image resolution enhancement using a plain up-sampling process lacks texture details, whereas an image transformed into HR provides far more information with finer details [8]. For example, an HR crime scene image offers more evidence for investigating the crime. Likewise, an image acquired from a satellite undergoes various processing steps, such as resource detection and object detection, using the HR image [9]. In the medical application domain, disease diagnosis is based on Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) scan images with better resolution for providing accurate medication. Thus, the role of SR is crucial in image-processing application domains.
The SR image is obtained from the LR image using three categories of approach: 1) the learning-based approach, 2) the reconstruction approach, and 3) the interpolation approach [10]. Image resolution enhancement using interpolation is the earliest method, was utilized by most researchers, and is easy to implement. Some of the interpolation techniques used to enhance the image's quality are non-uniform sampling interpolation, bicubic interpolation, bilinear interpolation, and nearest neighbor interpolation. In these approaches, the higher frequency details are reconstructed through the linear characteristics of the approaches. Image SR using the interpolation approach provides a good outcome; still, the performance degrades as the scaling factor increases [11]. The reconstruction-based SR approach transforms the LR image by gathering the non-redundant details. Projection onto Convex Sets, with hypothetical constraints based on non-negativity, energy boundedness, support boundedness, and smoothness, is one of the methods utilized for SR reconstruction. The limitations of the reconstruction-based criteria are the slower convergence rate and the existence of many solutions [12], [13]. In addition, the reconstruction solution acquired at the final stage depends on the initial estimate. Also, the performance is limited with respect to robustness and real-time modeling due to the inefficiency in handling noise during reconstruction [14]. Finally, the third approach is the learning-based image transformation approach, which enhances the quality of the image using machine learning and deep learning algorithms [15]. Nowadays, learning-based image quality enhancement is the approach most widely utilized by researchers [16] due to its better image perception.
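To make the interpolation category concrete, the following is a minimal sketch of nearest neighbor and bilinear up-scaling for a 2-D grayscale image. The function names (`upscale_nearest`, `upscale_bilinear`), the align-corners sampling convention, and the toy 2 x 2 input are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def upscale_nearest(img: np.ndarray, factor: int) -> np.ndarray:
    """Nearest neighbor: every LR pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def upscale_bilinear(img: np.ndarray, factor: int) -> np.ndarray:
    """Bilinear up-scaling of a 2-D image (align-corners convention, assumed)."""
    h, w = img.shape
    out_h, out_w = h * factor, w * factor
    # Fractional source coordinates for every output row and column.
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy LR image (hypothetical values) up-scaled by a factor of 2.
lr = np.array([[0.0, 1.0], [2.0, 3.0]])
sr_nearest = upscale_nearest(lr, 2)
sr_bilinear = upscale_bilinear(lr, 2)
```

Both functions only redistribute or linearly blend existing pixel values, which illustrates why interpolation alone cannot add texture detail that the LR image does not already contain.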
In the learning-based approach of image SR, a network is learned to provide a high-quality reconstructed image. For network learning, a highly representative set of samples with variation in the data is needed for generalization [17]. Thus, enormous data is acquired from various sources to obtain the required solution. For images taken from the remote sensing domain, information collection is a challenging task due to the variation of the images with factors like different sensors and locations, along with the difference in the objects [18], [19], [20]. Thus, a network learned with limited samples suffers in performance due to poor generalization capability. Nowadays, deep learning models based on non-linear operations in artificial intelligence draw on enormous samples and can be utilized for several computer vision-based applications [21]. Hence, the generalization capability of the deep learning models is elevated through learning from enormous data samples. The convolutional neural network (CNN) based image resolution enhancement technique was first suggested by Dong et al. [22]. The method acquired a better signal-to-noise ratio (SNR) with a high sampling rate, but information about the image is lost because the reconstructed image is too smooth [23], [24]. In the deep learning approaches, a mean squared error (MSE) reduction of 2.98 and a peak signal-to-noise ratio (PSNR) elevation of 21.15 dB are obtained through the optimization target for acquiring the high-quality image from the ill-posed problem. In some existing approaches, the reconstructed image is not photorealistic when compared perceptually with the original image.

www.ijacsa.thesai.org
Generative adversarial network (GAN) based methods are also designed by several researchers to overcome the issue concerning information loss in image reconstruction, in which the CNN is replaced with the GAN network [3]. The GAN-based image SR techniques gather the non-linear high-dimensional features from enormous data [25], [26] and make the generalization more effective for providing a better outcome.
This study aims to reconstruct the SR image from the LR image by minimizing the perceptual loss to obtain a perceptually good image with detailed information that gratifies the human eye. This study proposes ISRGAN with EfficientNet-v2 to enhance reconstruction accuracy with minimal distortions. The ISRGAN is designed for image reconstruction by minimizing the perceptual loss, which comprises content and adversarial loss, and EfficientNet-v2 is utilized for learning the discriminator of the ISRGAN, in which the perceptual loss concerning the image is minimized to acquire a better-quality image.
The significant contributions of the research are:
• Design of ISRGAN: Image Super-Resolution Generative Adversarial Network (ISRGAN) is designed for image reconstruction by minimizing the perceptual loss, which comprises content and adversarial loss.
• EfficientNet-v2: EfficientNet-v2 is utilized for learning the discriminator of the ISRGAN, in which the perceptual loss concerning the image is minimized to acquire a better-quality image through the fast learning criteria.
The manuscript is organized as follows: Section II reviews the primary methods and defines the problem. Section III describes the ISRGAN methodology, Section IV details the proposed ISRGAN based on EfficientNet-v2, Section V covers the implementation, and Section VI discusses the results and analysis of the study. Finally, Section VII concludes the work.

II. RELATED WORK
The primary methods concerning image SR are detailed in this section. Image SR using a GAN with a quality loss was devised by Zhu et al. [27] to solve the instability of the GAN in image reconstruction. The method addressed the loss issue by optimizing the pre-trained network. The gradient magnitude similarity deviation was utilized for the discriminator training to obtain a better performance. In addition, batch normalization together with the reduction of computation overhead and memory makes learning more efficient and yields better visual appeal. Image SR using a GAN with a fused attentive network to extract global features was designed by Jiang et al. [28] using the scanning time reduction technique. In this study, the attributes at various scales are extracted based on the attention criteria to acquire the most relevant attributes and enhance reconstruction accuracy. In addition, the stability of the learning is accomplished through the spectral normalization approach. The analysis based on the distance measure and the similarity index offered a better outcome.
Shahidi [29] designed the wide attention-based GAN to stabilize the learning process using the Wasserstein loss with a gradient penalty. Two different normalizations, weight and batch normalization, elevate the similarity index of the image by considering the texture and color restoration. Besides, the inclusion of the self-attention layers and the residual blocks assures a high-quality image by learning past information. The self-attention layer is utilized in the discriminator to extract the attributes from the image patches so that the generator is corrected more effectively. The loss function was evaluated using the VGG-19 network. An enlightened GAN was developed by Gong [30] using the self-supervised hierarchical perception loss to acquire enhanced image reconstruction performance through network-induced convergence. The enlighten blocks were introduced to accomplish a better gradient using the improved generalization capability. Besides, the occurrence of seam lines in the reconstructed image was eliminated through the clipping and merging approach-based learning criteria using the batch internal inconsistency loss. The image quality assessment of the method provided a superior outcome by solving the merging issue, which demonstrates the realism achieved by the developed method. A GAN with a cascading residual network was designed by Ahn et al. [31] by hybridizing the neural network (NN) with the GAN to acquire short connections and a multi-level representation of the network for improving the reconstruction performance. The balance between distortion and perception in the designed method is achieved using the multi-scale discriminator and GAN-based learning. Here, the multi-scale discriminator gathers fine information from the image concerning the resolution to reduce the losses in the network. The visual outcome of the method provided a realistic and sharp image.
A Residual-in-Residual Dense Block-based GAN was devised by Song et al. [32], in which the optimal network with transfer learning was employed to acquire a high-quality image with better perception and low distortion. The slow convergence rate and the unstable learning capability were improved through the inclusion of the dense block, with enhanced perception and minimal distortion in the reconstructed image. In addition, the feature reuse and propagation were improved and the vanishing gradient issue was solved through the establishment of dense connections with minimal parameters. A conditional GAN-based image reconstruction was presented by Sun et al. [33], in which the color space transformation was initially employed based on the variance and mean of the channel for the channel normalization. The better visual representation of the image was accomplished through the colorization training criteria, which converge faster than the traditional image reconstruction approach. In addition, curriculum learning was utilized to solve the issue of large resolution differences in the learning phase. Color normalization through self-supervised learning offers better generalization capability and better performance by eliminating color variance. A weighted Multi-Scale Residual Block-based image reconstruction strategy was introduced by Li et al. [34] for image reconstruction through the weighted feature representation approach. In this method, the low-level attributes were first extracted from the image. Then, using the global residual learning, the high-frequency information from the features is mapped using the non-linear mapping criteria. The reconstruction of the image was accomplished by the reconstruction subnet of the network. The benefits and the challenges of the reviewed techniques are depicted in Table I.

A. Research Gap
Image super-resolution is the acquisition of a high-resolution image from a low-resolution input image. Learning-based methods of image super-resolution are widely utilized for reconstructing the image because their non-linear learning capability yields better performance. The goal of image reconstruction is to obtain a perceptually better image for identifying the target in medical imaging and satellite imaging applications. Many prior researchers devised GAN-based image reconstruction techniques for obtaining better performance. Still, estimating the loss function and training the network with minimal loss remain challenging in most GAN-based methods. In addition, unpleasing reconstructions with higher distortions and over-smoothing of the image degrade the reconstruction quality. Hence, to overcome the above-mentioned challenges, an ISRGAN is introduced in which the discriminator is learned at a higher learning speed with minimal information loss, which enhances the generalization capability of the network and yields a better image.

III. METHODOLOGY
The ISRGAN is designed by modifying the loss function of the traditional GAN to produce images that are more pleasing to the human eye and to improve the accuracy of image reconstruction. The GAN for single-image super-resolution, which transforms the LR image into the SR image, is termed ISRGAN. The ISRGAN comprises a discriminator and a generator, in which $A_{\theta_A}$ denotes the discriminator model and $Q_{\theta_Q}$ denotes the generator model. The min-max issue of the ISRGAN is solved by optimizing $Q_{\theta_Q}$ and $A_{\theta_A}$ alternately. The formulation for the min-max issue is expressed as,

$$\min_{\theta_Q}\max_{\theta_A}\; \mathbb{E}_{S^R}\left[\log A_{\theta_A}\left(S^R\right)\right] + \mathbb{E}_{S^L}\left[\log\left(1 - A_{\theta_A}\left(Q_{\theta_Q}\left(S^L\right)\right)\right)\right] \quad (1)$$

where the high-resolution image is notated as $S^R$, the low-resolution image is indicated as $S^L$, the discriminator is notated as $A$, and the generator is indicated as $Q$. Here, the role of the generator is to mislead the discriminator, while the discriminator module of the ISRGAN is learned to identify the difference between the real image and the SR image. Similarly, the generator module is trained to produce an SR image close to the real image, which makes detection by the discriminator a difficult task. Together, the generator and the discriminator processes described above provide a more appropriate perceptual solution. The maximization issue is solved using the discriminator with minimal MSE. The architecture of the ISRGAN is portrayed in Fig. 1.

TABLE I. Benefits and challenges of the reviewed techniques

| Method | Benefits | Challenges |
|---|---|---|
| GAN with quality loss [27] | The enhanced adversarial learning along with the loss function estimation using the image quality assessment provided a high-quality image. | The failure rate of the method was higher and gave unsatisfactory results. Besides, the imperfection of the network produced diagonal lines in the blocks of the outcome. |
| Fused Attentive GAN [28] | The assessment against various GAN-based image super-resolution methods acquired better performance in terms of signal-to-noise ratio using various datasets. Besides, the reconstructed image is near to the real image with high resolution. | The reduction of the loss function with a higher number of iterations elevates the computation overhead. |
| Wide-Attention GAN [29] | Two normalizations, weight and batch normalization, along with the self-attention layer provide the method a small range of gradient clipping that elevates the accuracy of image reconstruction. | The analysis without the preprocessing step limits the performance of the model. |
| Enlighten-GAN [30] | The convergence stabilization along with the network intensification provided a better outcome with an artifact-free reconstructed image by maintaining the hue and shape. | The outcome still has the issue of an unclear outline due to the object fusing technique. Hence, for complex images, the developed method provided poor performance. |
| Cascading Residual Network-GAN [31] | The improved performance is accomplished for the image translation technique by balancing distortion and perception. | The complexity analysis based on the execution time shows minimal time complexity compared to some baseline techniques; still, the method acquired poor performance due to memory fragmentation that limits the parallelism of the approach. |
| Residual-in-Residual Dense Block-GAN [32] | The visual outcome of the method was better, the method was applicable to applications that do not require details concerning the place, and it obtained a better running time. | The signal-to-noise ratio evaluated by the method was not pleasant to the human eye. |
| Weighted Multi-Scale Residual Block [33] | The weight-based image reconstruction obtained better performance using the high-frequency attributes acquired from the low-resolution image. | The method failed to analyze the accuracy. |
| Conditional GAN with color normalization [34] | The augmentation of the features helps to cover various degrees of degradation, which assures the capability of learning from the de-blurred image, enhances the convergence rate, and boosts the performance of the method. | For images with severe information loss, the method offered an unsatisfactory outcome. |

IV. PROPOSED WORK
The proposed ISRGAN with EfficientNet-v2 incorporates the EfficientNet-v2 to evaluate the loss function and trains the discriminator based on that loss to enhance the quality of image reconstruction. The evaluation of the loss function and the architecture of the ISRGAN are detailed in the following subsections.

A. Loss Function

1) Perceptual loss:
The perceptual loss function of the ISRGAN is a crucial factor that elevates the perceptual characteristics of the image [4]. The adversarial and content loss functions of the ISRGAN constitute the perceptual loss function, which is formulated as,

$$h^H = h^H_B + 10^{-3}\, h^H_G \quad (2)$$

where the weighted adversarial loss of the ISRGAN is referred to as $10^{-3} h^H_G$, the content loss of the ISRGAN is indicated as $h^H_B$, and the perceptual loss evaluated by the VGG-based network is indicated as $h^H$.

2) Content loss:
Here, the goal of the ISRGAN is to replace the perceptual loss using the VGG feature map of the network. The loss function of the pixel-wise MSE, evaluated by VGG, is formulated as,

$$h^H_E = \frac{1}{NK}\sum_{m=1}^{N}\sum_{n=1}^{K}\left(S^R_{m,n} - Q_{\theta_Q}\left(S^L\right)_{m,n}\right)^2 \quad (3)$$

where the MSE error estimated by the VGG is indicated as $h^H_E$, and the height and width of the low-resolution image are indicated as $N$ and $K$, respectively. For the image point $(m, n)$, the gray level value is indicated by the subscript, as in $S^L_{m,n}$. Then, the characteristic loss of the VGG network is formulated as,

$$h^H_{VGG/u,v} = \frac{1}{N_{u,v}K_{u,v}}\sum_{m=1}^{N_{u,v}}\sum_{n=1}^{K_{u,v}}\left(\alpha_{u,v}\left(S^R\right)_{m,n} - \alpha_{u,v}\left(Q_{\theta_Q}\left(S^L\right)\right)_{m,n}\right)^2 \quad (4)$$

where $h^H_{VGG/u,v}$ refers to the characteristic loss for the VGG network, the superscript $H$ refers to the image SR, $R$ refers to the high-resolution image, $L$ refers to the low-resolution image, the reference image is indicated as $S^R$, and the reconstructed image is indicated as $Q_{\theta_Q}(S^L)$. The feature map corresponding to the $v$-th convolution and the $u$-th max-pooling layer is indicated as $\alpha_{u,v}$. The dimensions of the feature maps are indicated as $N_{u,v}$ and $K_{u,v}$, respectively. Here, the characteristic loss function is evaluated based on the activations of the VGG network and is the Euclidean distance between the feature representations of the reference and the reconstructed image. The combination of the characteristic loss and the MSE error constitutes the content loss of the ISRGAN.
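As a hedged illustration of how the two parts of the content loss behave, the sketch below computes a pixel-wise MSE and a feature-space squared distance. A pre-trained VGG or EfficientNet-v2 is not loaded here; `toy_feature_map` is a hypothetical stand-in for the feature map (it returns simple image gradients), and combining the two terms by a plain unweighted sum is an assumption, since the text only states that the two are combined.

```python
import numpy as np

def pixel_mse_loss(hr: np.ndarray, sr: np.ndarray) -> float:
    """Pixel-wise MSE between the reference image and the reconstruction."""
    return float(np.mean((hr - sr) ** 2))

def toy_feature_map(img: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a learned feature map: image gradients.

    A real implementation would use activations of a pre-trained network.
    """
    gx = img[:, 1:] - img[:, :-1]  # horizontal differences
    gy = img[1:, :] - img[:-1, :]  # vertical differences
    return np.concatenate([gx.ravel(), gy.ravel()])

def feature_loss(hr: np.ndarray, sr: np.ndarray, feature_fn=toy_feature_map) -> float:
    """Mean squared Euclidean distance between the two feature maps."""
    return float(np.mean((feature_fn(hr) - feature_fn(sr)) ** 2))

def content_loss(hr: np.ndarray, sr: np.ndarray) -> float:
    """Content loss as an unweighted sum of both terms (assumed combination)."""
    return pixel_mse_loss(hr, sr) + feature_loss(hr, sr)

# Toy example: the reconstruction differs from the reference by a uniform offset.
hr = np.arange(16, dtype=float).reshape(4, 4) / 15.0
sr = hr + 0.1
mse_value = pixel_mse_loss(hr, sr)
feat_value = feature_loss(hr, sr)
```

In this toy case the uniform brightness offset is penalized by the pixel-wise term but leaves the gradient features untouched, which shows why a feature-space term measures structural rather than per-pixel agreement.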

3) Adversarial loss:
The loss generated by the generative component of the ISRGAN for providing the most appropriate solutions by tricking the discriminator of the network is termed the adversarial loss, which is formulated as,

$$h^H_G = \sum_{i} -\log A_{\theta_A}\left(Q_{\theta_Q}\left(S^L_i\right)\right) \quad (5)$$

where the sum runs over the training samples $i$, the reconstructed image is indicated as $Q_{\theta_Q}(S^L)$, and the probability that the reconstructed image is a natural high-resolution image is notated as $A_{\theta_A}(Q_{\theta_Q}(S^L))$. Here, the adversarial loss function needs to be minimized to acquire better gradient behavior.
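The following minimal sketch shows how the adversarial term and the weighted perceptual loss could be computed from discriminator outputs. The function names, the clipping constant `eps`, and the example probabilities are assumptions introduced for illustration; only the $10^{-3}$ weight on the adversarial term comes from the text.

```python
import numpy as np

ADVERSARIAL_WEIGHT = 1e-3  # weight on the adversarial term stated in the text

def adversarial_loss(d_fake: np.ndarray, eps: float = 1e-12) -> float:
    """Sum of -log(p) over the batch.

    d_fake holds the discriminator's probability that each reconstructed
    image is a natural HR image; eps (assumed) guards against log(0).
    """
    d_fake = np.clip(d_fake, eps, 1.0)
    return float(np.sum(-np.log(d_fake)))

def perceptual_loss(content: float, d_fake: np.ndarray) -> float:
    """Content loss plus the weighted adversarial loss."""
    return content + ADVERSARIAL_WEIGHT * adversarial_loss(d_fake)

# Hypothetical discriminator outputs for two reconstructed images.
fooled = adversarial_loss(np.array([0.9, 0.9]))      # discriminator mostly fooled
not_fooled = adversarial_loss(np.array([0.1, 0.1]))  # discriminator not fooled
total = perceptual_loss(0.01, np.array([0.5, 0.5]))
```

The adversarial term shrinks as the discriminator assigns a higher probability to the reconstructions, so minimizing it pushes the generator toward outputs the discriminator accepts as natural.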

B. EfficientNet-v2 based Perceptual Loss Function
The proposed improved ISRGAN utilizes EfficientNet-v2 to estimate the loss function so that the quality of the reconstructed image is enhanced perceptually. The ISRGAN suffers from vanishing gradients, which cause information loss due to poor generalization capability. Thus, the EfficientNet-v2 is utilized for learning the discriminator based on the perceptual loss function, exploiting its higher learning rate and parameter efficiency. Here, the role of the EfficientNet-v2 is to detect the difference between the ground truth and the reconstructed image and thereby measure the loss function used to train the discriminator, which makes the reconstruction more effective. The loss functions that constitute the perceptual loss, namely the content loss and the adversarial loss, are evaluated using the EfficientNet-v2 to enhance the efficiency of the image SR.
For estimating the loss function, EfficientNet-v2 utilizes smaller kernel sizes with minimal memory access overhead. Besides, the last stride-1 stage is eliminated to reduce the memory access overhead and the parameter size. The runtime overhead is minimized through network capacity elevation, and the training overhead, along with the higher memory usage, is minimized by restricting the inference size of the image. The learning capability of the EfficientNet-v2 is improved by retaining the original inference size for learning, which reflects the scaling characteristics of the loss function estimator. The building blocks of the EfficientNet-v2 are the Fused-MBConv along with the mobile inverted bottleneck (MBConv) and are portrayed in Fig. 2.
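To illustrate the structural difference between the two building blocks, the sketch below counts convolution weights for an MBConv block (1x1 expansion, depthwise 3x3, 1x1 projection) and a Fused-MBConv block (a single regular 3x3 convolution in place of the expansion and depthwise pair, then 1x1 projection). Biases, batch normalization, and squeeze-and-excitation are ignored, and the channel count of 24 with an expansion ratio of 4 is a hypothetical configuration, not one reported in this study.

```python
def mbconv_params(c_in: int, c_out: int, expand: int = 4, k: int = 3) -> int:
    """Convolution weights in an MBConv block (no bias, BN, or SE)."""
    c_mid = c_in * expand
    expand_1x1 = c_in * c_mid    # pointwise expansion
    depthwise = k * k * c_mid    # one k x k filter per expanded channel
    project_1x1 = c_mid * c_out  # pointwise projection
    return expand_1x1 + depthwise + project_1x1

def fused_mbconv_params(c_in: int, c_out: int, expand: int = 4, k: int = 3) -> int:
    """Convolution weights in a Fused-MBConv block (no bias, BN, or SE)."""
    c_mid = c_in * expand
    fused_kxk = k * k * c_in * c_mid  # regular k x k conv replaces expand + depthwise
    project_1x1 = c_mid * c_out
    return fused_kxk + project_1x1

# Hypothetical early-stage configuration.
mb = mbconv_params(24, 24)
fused = fused_mbconv_params(24, 24)
```

The fused block carries more weights for the same channel widths; its appeal lies in replacing the depthwise convolution with a dense one that maps better onto accelerators, which is why the choice between the two blocks is a speed and size trade-off rather than a pure parameter saving.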

C. ISRGAN Architecture
The ISRGAN comprises two different modules: the discriminator and the generator for producing the perceptually quality image.

1) Generator:
The transformation of the input image into the SR image is performed in the generator module of the proposed ISRGAN. Initially, the input image is encoded to acquire the high-dimensional information. Then, decoding is performed in the hidden layers of the generator to provide high-resolution images. Here, the image dimensions are scaled using the scaling algorithm to acquire the feature resolution, which is performed as convolutional and deconvolutional operations in the generator. The CNN utilized in the generator module of the ISRGAN extracts the relevant features along a single dimension, which leads to a failure in acquiring the granular patterns of the image. Thus, the compound scaling factor is introduced in the ISRGAN to obtain the image's finer granular patterns. Hence, the generator module is designed with 42 convolutional and deconvolutional layers, along with the Leaky rectified linear unit (Leaky ReLU) activation function and batch normalization. Besides, training-aware Neural Architecture Search (NAS) and scaling are utilized for designing the generator module.
2) Discriminator: The difference between the high-resolution image generated by the generator module and the real image is identified by the discriminator module using the EfficientNet-v2. Here also, finer granular patterns are extracted from the image. Seven Fused-MBConv blocks, one convolutional layer, and eight hidden layers constitute the discriminator module along with the EfficientNet-v2. The sigmoid function is utilized to determine the likelihood that an SR image is real by considering the fake and real images. The perceptual loss function training is performed alternately in the discriminator and the generator to obtain perceptually better image SR.
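The role of the sigmoid output can be sketched as follows: raw discriminator scores are mapped to probabilities, and the discriminator is penalized when real images receive low probability or reconstructed images receive high probability. The binary cross-entropy form, the function names, and the example scores are assumptions for illustration; the study's exact discriminator objective is not reproduced here.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    """Map raw discriminator scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(real_logits: np.ndarray, fake_logits: np.ndarray,
                       eps: float = 1e-12) -> float:
    """Binary cross-entropy over real and reconstructed images, batch-averaged.

    Real images should score close to 1 and reconstructed images close to 0;
    eps (assumed) keeps the logarithms finite.
    """
    p_real = np.clip(sigmoid(real_logits), eps, 1.0 - eps)
    p_fake = np.clip(sigmoid(fake_logits), eps, 1.0 - eps)
    return float(np.mean(-np.log(p_real) - np.log(1.0 - p_fake)))

# Hypothetical scores: an undecided discriminator versus a confident one.
undecided = discriminator_loss(np.zeros(4), np.zeros(4))
confident = discriminator_loss(np.full(4, 5.0), np.full(4, -5.0))
```

In alternating training, one step lowers this loss with the generator held fixed, and the next step updates the generator against the refreshed discriminator.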

V. IMPLEMENTATION
The experimentation setup along with the interpretation of the introduced ISRGAN is detailed in this section.

A. Experimental Setup
The proposed method is implemented using the Google Colaboratory (Colab) Notebooks tool. The total number of iterations performed is 100, with batch sizes of 8 and 32.
1) Image dataset: The dataset utilized for the analysis of the proposed ISRGAN is the CIFAR-10 dataset [35], which comprises 10,000 test images and 50,000 training images in 10 different classes. Thus, a total of 60,000 images is utilized. The type of image utilized for the evaluation is synthesis.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 14, No. 2, 2023

2) Network learning: The generator module of the ISRGAN is learned with the loss function $Loss_Q$ given in (6), which combines the perceptual loss $h^H$, comprising the adversarial loss and the content loss, with the texture loss $Loss_{texture}$. Here, the Adam optimizer is utilized to solve the optimization issue, with an initial learning rate of $1e^{-4}$. EfficientNet-v2 is utilized for extracting the finer granular patterns from the image.
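For readers unfamiliar with the optimizer, a single Adam update can be sketched as below. The learning rate of 1e-4 is the one stated in the text; the decay rates `beta1 = 0.9` and `beta2 = 0.999`, the constant `eps = 1e-8`, and the toy quadratic objective are commonly used defaults assumed here, since the study does not report them.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2  # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3), starting at w = 0.
w, m, v = 0.0, 0.0, 0.0
grad = 2.0 * (w - 3.0)
w, m, v = adam_step(w, grad, m, v, t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.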
3) Loss function: The empirical risk of the proposed ISRGAN in reconstructing the HR image is evaluated based on the loss function given in (7), where the total number of reconstructed samples is indicated as $W$, the HR image reconstructed by the proposed ISRGAN is indicated as $\hat{O}_i$, and the targeted outcome is indicated as $O_i$.

4) Network parameter setting:
The settings for the network parameters of the generator and discriminator in the ISRGAN architecture are displayed in Table II.

VI. RESULTS AND ANALYSIS

A. Qualitative Analysis
The experimental outcome concerning the input LR image and the output HR image is shown in Fig. 3.
The outcome acquired by the introduced ISRGAN, shown in Fig. 3(b) and reconstructed from the LR image in Fig. 3(a), is visually pleasing to the human eye. The discriminator learning based on the perception loss estimated by the generator using the EfficientNet-v2 elevates the accuracy of image reconstruction by minimizing the MSE.

B. Quantitative Analysis
The quantitative analysis based on the loss function evaluated by the proposed method and the error analysis is detailed in this section.

Table III presents the batch sizes, the number of iterations, and the outcomes of the generator, the discriminator, and the self-supervised learning. From the table, we can observe that an increase in batch size and iteration minimizes the loss. Here, an increase in the batch size reduces the noise in the gradients, which assures enhanced accuracy in image reconstruction and reduces the loss. Similarly, an increase in iteration helps the algorithm obtain a solution closer to the target solution, which reduces the loss and enhances image reconstruction accuracy. The ISRGAN with the EfficientNet-v2 acquires minimal loss and can be applied in the media, medical, and surveillance-related application domains by detecting various objects in the image. Besides, the loss function evaluation based on the perceptual similarity loss assists in capturing the high-level information that provides a more accurate solution.

2) Error and similarity analysis: The analysis is executed based on MSE, PSNR, structural similarity index measure (SSIM), and mean absolute error (MAE).

MSE:
The risk function concerning the target solution in reconstructing the image is measured through the MSE and is formulated as,

$$ISRGAN_{MSE} = \frac{1}{T_{samples}}\sum_{i=1}^{T_{samples}}\left(I_{Target}^{(i)} - I_{outcome}^{(i)}\right)^2 \quad (8)$$

Here, the targeted outcome is indicated as $I_{Target}$, the outcome acquired by the proposed ISRGAN is indicated as $I_{outcome}$, the total number of samples is indicated as $T_{samples}$, and the MSE is notated as $ISRGAN_{MSE}$. The MSE evaluated by the proposed ISRGAN and the comparative methods, namely Wide-Attention GAN [29], Enlighten GAN [30], and Fused Attentive GAN [28], is portrayed in Fig. 5. The MSE evaluated by the introduced ISRGAN with the EfficientNet-v2 is minimal compared to the other conventional methods.

MAE: The mean absolute deviation between the targeted outcome and the outcome acquired by the proposed ISRGAN is measured through the MAE and is formulated as,

$$ISRGAN_{MAE} = \frac{1}{T_{samples}}\sum_{i=1}^{T_{samples}}\left|I_{Target}^{(i)} - I_{outcome}^{(i)}\right| \quad (9)$$

where the MAE measure is indicated as $ISRGAN_{MAE}$. The outcome of the proposed ISRGAN, along with the comparative methods, is portrayed in Fig. 6. The MAE evaluated by the proposed method is minimal compared to the other state-of-the-art techniques.

PSNR: The quality of the image reconstructed from the low-resolution image is measured through the PSNR. A greater value of PSNR indicates better reconstruction. The formulation of the PSNR of the reconstructed image is expressed as,

$$ISRGAN_{PSNR} = 10\log_{10}\left(\frac{R^2}{ISRGAN_{MSE}}\right) \quad (10)$$

where the PSNR measure is indicated as $ISRGAN_{PSNR}$, and the fluctuation of the reconstructed image compared to the original image is denoted as $R$. The PSNR estimated by the proposed and the comparative methods is portrayed in Fig. 7. The PSNR acquired by the introduced method is higher and indicates a better image reconstruction quality.

SSIM: The structural similarity between the reconstructed image and the original image is measured through the SSIM and is formulated as,

$$ISRGAN_{SSIM} = \frac{\left(2\mu_p\mu_q + s_1\right)\left(2\sigma_{pq} + s_2\right)}{\left(\mu_p^2 + \mu_q^2 + s_1\right)\left(\sigma_p^2 + \sigma_q^2 + s_2\right)} \quad (11)$$

where the SSIM is indicated as $ISRGAN_{SSIM}$. The mean and the variance of the reconstructed image are indicated as $\mu_p$ and $\sigma_p^2$, and the mean and variance of the original image are indicated as $\mu_q$ and $\sigma_q^2$. The covariance between the original and reconstructed image is notated as $\sigma_{pq}$, and the variables utilized for the stabilization are indicated as $s_1$ and $s_2$, respectively. The outcome of the proposed method, along with its comparative methods, is portrayed in Fig. 8. The higher SSIM evaluated by the proposed method indicates the similarity of the reconstructed super-resolution image with the original image.
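The four measures can be sketched in a few lines of NumPy. This is a minimal illustration, not the evaluation code of the study: the SSIM here is computed once from whole-image statistics rather than over sliding windows, and the peak value of 1.0 and the stabilization constants derived from `k1 = 0.01` and `k2 = 0.03` are commonly used defaults assumed for the example.

```python
import numpy as np

def mse(target: np.ndarray, outcome: np.ndarray) -> float:
    """Mean squared error between target and reconstructed image."""
    return float(np.mean((target - outcome) ** 2))

def mae(target: np.ndarray, outcome: np.ndarray) -> float:
    """Mean absolute error between target and reconstructed image."""
    return float(np.mean(np.abs(target - outcome)))

def psnr(target: np.ndarray, outcome: np.ndarray, peak: float = 1.0) -> float:
    """PSNR in dB; peak is the assumed maximum signal range."""
    err = mse(target, outcome)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

def ssim_global(p: np.ndarray, q: np.ndarray, peak: float = 1.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """Single-window SSIM from whole-image statistics (no sliding window)."""
    s1 = (k1 * peak) ** 2  # stabilization constants (assumed defaults)
    s2 = (k2 * peak) ** 2
    mu_p, mu_q = p.mean(), q.mean()
    var_p, var_q = p.var(), q.var()
    cov_pq = np.mean((p - mu_p) * (q - mu_q))
    num = (2.0 * mu_p * mu_q + s1) * (2.0 * cov_pq + s2)
    den = (mu_p ** 2 + mu_q ** 2 + s1) * (var_p + var_q + s2)
    return float(num / den)

# Toy images (hypothetical): the outcome is the target plus a uniform offset.
target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
outcome = target + 0.1
```

For this toy pair the MSE is 0.01, the MAE is 0.1, and the PSNR is 20 dB, while the SSIM of an image with itself is 1 and drops below 1 once the offset is introduced.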
3) Comparative discussion: The proposed method is compared here with the conventional image reconstruction techniques. The proposed method incorporates the EfficientNet-v2 into the traditional GAN to enhance the image super-resolution quality by minimizing the perceptual loss. For this, the EfficientNet-v2 is utilized at the discriminator, which minimizes the information loss with a fast learning rate. The Fused GAN was devised by integrating the attention module into the traditional GAN to enhance the performance of the image super-resolution technique. Its quality assessment based on the similarity measure was close to that of the proposed method; still, the higher time complexity and the failure to maintain system parallelism limit the method. Thus, the existing Fused GAN acquired slightly lower performance than the proposed GAN with EfficientNet-v2. Next, the Wide Attention GAN was devised by integrating the self-attention module. Due to the inclusion of the attention mechanism, the Wide Attention GAN also suffered from higher time complexity. Finally, the Enlighten-GAN was designed by incorporating enlighten blocks into the traditional GAN. This method accomplished a better-quality image by solving the unstable convergence issue through the enlighten block. However, the method is inefficient in processing images with complex backgrounds. Thus, the analysis based on the error estimation and the similarity analysis shows that the proposed method accomplished better performance than the other image super-resolution methods. The training of the discriminator using the EfficientNet-v2 provides fast learning and the least perception-based loss, enhancing the image's visual quality.

VII. CONCLUSION
In this paper, SR image reconstruction from the LR image, which acquires fine-grained information from the image along with visual quality, is attained using ISRGAN with EfficientNet-v2. The proposed image reconstruction technique elevates the quality of the image by minimizing the perception loss present in the GAN network. The information loss is minimized through the efficient learning of the discriminator using the EfficientNet-v2, which offers fast and more accurate learning based on the perception loss generated by the generator module of the ISRGAN. The learning capability of the EfficientNet-v2 is improved by retaining the original inference size for learning, which reflects the scaling characteristics of the loss function estimator. Thus, the proposed method provides a promising outcome based on the error analysis and acquires minimal loss values of 0.02 at the generator, 0.1 at the discriminator, and 0.015 at the self-supervised learning, with a batch size of 32. Besides, the minimal MSE and MAE accomplished by the proposed method are 0.001025 and 0.00225, respectively. Likewise, the maximal PSNR and SSIM acquired by the proposed method are 45.56985 and 0.9997, respectively. However, the loss of the method needs to be reduced further for real-time processing.
In the future, we plan to introduce a novel hybrid metaheuristic optimization-based deep learning approach to overcome the challenges the proposed model encountered in this study.