Application of Image Style Transfer Based on Normalized Residual Network in Art Design

Abstract: With the development of computer vision technology, image style transfer based on deep learning has advanced rapidly and is now widely applied in fields such as art design, painting creation, and film and television effects production. However, existing image style transfer methods still suffer from low efficiency and weak transfer quality, so they cannot fully meet the practical needs of art and design activities. Therefore, a residual network structure is introduced to construct an image style transfer model based on convolutional neural networks, and a normalization layer is added to the residual network structure to optimize the transfer. The result is an image style transfer model based on a normalized residual network. The experimental results show that the accuracy, recall, and F1 values of the proposed model are 97.35%, 96.49%, and 97.52%, respectively, and that the model can complete high-quality image style transfer. This indicates that the proposed model performs well and can effectively improve the efficiency and quality of image style transfer, providing effective support for various art and design activities.


I. INTRODUCTION
With the development of technology and social progress, digital media have become an indispensable part of daily life. As images are a medium through which the public expresses individuality, the demand for artistry in images is increasing. Many scholars have begun to explore how to automatically synthesize images with artistic style by computer [1], and style transfer emerged in response. Image style transfer is often studied as a universal texture generation method and has become a research highlight. Image style transfer (IST) refers to the process of converting an image into one that is similar in style or content to a target image through machine learning and related techniques [2,3]. With the progress of deep learning (DL), some scholars build image style transfer methods on the feature extraction strengths of the convolutional neural network (CNN), while others propose methods built on encoder and decoder models [4]. However, existing image style transfer methods are slow during the transfer process and cannot meet the application requirements of real scenarios. Meanwhile, the overall quality of the stylized images they produce is not high. In deep learning, CNNs are feedforward neural networks with deep structures based on convolutional operations. They are widely used in fields such as image perception, image analysis, image classification, and natural language processing. Therefore, an improved image style transfer method is constructed on the basis of the existing CNN structure [5]. The structure of the CNN is optimized by introducing a residual network. In addition to the residual blocks, a normalization layer is added to the transfer model to address image quality problems. An image style transfer generative model based on the deep residual network with instance normalization (ResNet-IN) is constructed. It is expected to achieve higher-quality image style transfer and to improve the speed and efficiency of IST.
The paper is organized into five sections. Section I is the introduction, which briefly presents the research background. Section II reviews research on residual networks and on image generation methods based on deep learning. Section III constructs an image style transfer method based on a normalized residual network. Section IV validates the performance of the constructed method. Section V summarizes the research results and points out future research directions.

II. RELATED WORKS
A residual network is composed of multiple stacked residual blocks and has the advantages of easy optimization and improved accuracy. The residual blocks use skip connections, which relieve the vanishing gradient problem caused by increasing network depth. Therefore, residual networks are widely used in various fields. Liu et al. studied the recognition of Boletus images using residual networks. Fourier transform near-infrared spectral information of the uneven stipe of Boletus edulis was collected and combined into four data matrices to accurately evaluate and identify different Boletus species. According to the findings, the residual neural network (ResNet) performs better [6]. A generative adversarial network and a multi-style ancient book background model were used to train the model. In the experiment, ancient materials such as Yi characters, ancient Chinese, Jurchen characters, and ancient paintings were selected as test samples. According to the findings, the approach improves the performance of the generative model [14]. Chen et al. constructed a conversion method consisting of multiple convolutional filter banks to better achieve image style conversion. Each filter bank (FB) explicitly represents one style. To convert an image into a specific style, the corresponding FB operates on the intermediate features generated by a single autoencoder. The style bank and the autoencoder are learned jointly. The results indicate that this method can effectively achieve image style conversion [15].

In summary, relatively rich research results have been achieved on residual networks and image style transfer techniques. However, the image style transfer techniques proposed in existing research are mainly designed for specific kinds of images, such as clothing or ancient Chinese characters, so their practicality cannot meet the needs of varied artistic creation activities. Existing image style transfer technologies also have shortcomings in efficiency and quality and cannot achieve high-quality style transfer. Furthermore, with the development of computer vision technology, the demand for image style transfer in art design is becoming increasingly widespread. Therefore, based on existing research results and the advantages of the residual network structure in image feature processing, an image style transfer model based on a normalized residual network is constructed. This method is expected to expand the application scope of image style transfer technology, improve the quality and efficiency of image style transfer, and meet a wider range of artistic design needs.

III. IST GENERATION NETWORK DESIGN BASED ON NORMALIZED RESIDUAL NETWORK
IST is crucial in various art and design activities. Therefore, this section designs the corresponding IST model based on CNN. The normalized ResNet is then applied to optimize the image style generative model so that IST is realized more effectively.

A. IST Based on CNN
In computer vision, IST is a common texture generation method. It extracts texture feature information from one image and applies that information to the target image to obtain a stylized image. CNN is the most effective machine learning network for image transfer processing. It is a feedforward neural network based on the convolution operation and is extensively applied in image perception, image analysis, classification, and other fields [16]. A CNN is composed of convolution layers, pooling layers, and fully connected layers. Among CNNs, the VGG-Net structure shows better performance in digital image processing, so the research takes VGG-19 as the basic network structure for image style transfer. When the CNN performs forward propagation, the convolution kernel slides over the image matrix. A convolution operation with the pixel values at the positions covered by the kernel yields the corresponding feature map, as shown in Formula (1).
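The sliding-window operation described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the paper's implementation: it assumes the usual deep learning form $S(i,j) = \sum_m \sum_n I(i+m, j+n)\,K(m,n)$ with stride 1 and no padding, and the function name `conv2d_valid` and the sample kernel are hypothetical.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    m, n = kernel.shape
    out_h = image.shape[0] - m + 1
    out_w = image.shape[1] - n + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the patch it currently covers.
            feature_map[i, j] = np.sum(image[i:i + m, j:j + n] * kernel)
    return feature_map

# Toy 4x4 "image" with pixel value 4*row + col, and a 2x2 diagonal-difference kernel.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

Each output entry is the top-left pixel of a patch minus its bottom-right pixel, so on this linear ramp every entry equals -5.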
In Formula (1), $I$ stands for the input image of the CNN, $S(i,j)$ stands for the coordinate position in the feature image, and $K$ represents a convolutional kernel of size $m \times n$. The convolution operation on the input image is shown in Fig. 2. First, the input image is defined as $x$. The output image is the expected image after transfer, represented as $y$. The convolutional layer used in the style transfer process is $l$. The convolutional features obtained from the input image in layer $l$ are represented as $M^l_{ij}$, where $i$ stands for the $i$-th channel in the convolutional layer and $j$ stands for the $j$-th position in the convolutional layer. The convolutional features of the image $y$ after style transfer in the $l$-th layer are represented as $F^l_{ij}$. Therefore, the image content loss function (LF) is shown in Formula (2).

$$L_{content}(x, y, l) = \frac{1}{2}\sum_{i,j}\left(F^l_{ij} - M^l_{ij}\right)^2 \tag{2}$$

The content loss function describes the difference in content between the input image and the output image. The smaller the value, the smaller the difference between the image before and after the style transfer. Then the style features of the image are defined. The image style is represented by the Gram matrix (GM), a symmetric matrix composed of vector inner products, as shown in Formula (3) [17].
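As a small numerical illustration of the content loss, the sketch below assumes the squared-error form with a factor of 1/2 that is common in the neural style transfer literature. The feature maps are flattened to shape (channels, positions), and the toy values are invented for the example.

```python
import numpy as np

def content_loss(F: np.ndarray, M: np.ndarray) -> float:
    """Half the sum of squared differences between two (channels, positions) feature maps."""
    return 0.5 * float(np.sum((F - M) ** 2))

# Hypothetical layer-l features of the input image (M) and the transferred image (F).
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
F = np.array([[1.0, 2.0],
              [3.0, 6.0]])

loss = content_loss(F, M)        # only one entry differs, by 2 -> 0.5 * 4
identical = content_loss(M, M)   # identical features give zero loss
print(loss, identical)
```

The loss is zero only when the two feature maps coincide, which matches the statement that a smaller value means a smaller difference before and after transfer.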
In Formula (3), $G^l_{ij}$ represents the Gram matrix of the white noise image at layer $l$, and $k$ represents the $k$-th position in the convolutional layer. The style error after image transfer is shown in Formula (4).
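The Gram matrix can be computed directly from a flattened feature map. The sketch below assumes the standard definition $G^l_{ij} = \sum_k F^l_{ik} F^l_{jk}$, with features stored as a (channels, positions) array. The helper name `gram_matrix` and the sample values are illustrative only.

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Inner products between every pair of channels; input shape (channels, positions)."""
    return features @ features.T

# Two channels observed at three spatial positions (toy values).
F = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
G = gram_matrix(F)
print(G)
```

Because entry $(i, j)$ and entry $(j, i)$ are the same inner product, the result is symmetric, and spatial layout is discarded: only channel co-activation statistics remain, which is why the matrix serves as a style descriptor.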
In Formula (4), $M_l$ and $N_l$ represent the height and width of the $l$-th layer. $G^l$ represents the $l$-th layer GM of the input image $x$. The GM of the $l$-th layer of the transferred output image $y$ is $A^l$. $E_l$ stands for the style error of the $l$-th layer. Formula (4) can be simplified as shown in Formula (5).
In Formula (5), $w_l$ stands for the weight of the $l$-th layer. The content LF and style LF of IST are thus established through the CNN. IST can be seen as the unity of content and style. After the target image is copied for style transfer, whether the generated image is closer in content or more similar in style depends on the proportions of content loss and style loss in the total loss. Thus, the total LF is shown in Formula (6).
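The per-layer style error and its weighted sum over layers can be sketched as follows. The normalization $1/(4 N_l^2 M_l^2)$ and the weighted sum $\sum_l w_l E_l$ are assumptions taken from the standard neural style formulation, with $N_l$ and $M_l$ read as the two dimensions of the flattened feature map. All function names and values are hypothetical.

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel-by-channel inner products of a (channels, positions) feature map."""
    return features @ features.T

def layer_style_error(f_out: np.ndarray, f_style: np.ndarray) -> float:
    """Assumed form: E_l = sum((G - A)^2) / (4 * N^2 * M^2)."""
    n, m = f_out.shape
    G = gram_matrix(f_style)   # Gram matrix of the style reference
    A = gram_matrix(f_out)     # Gram matrix of the transferred output
    return float(np.sum((G - A) ** 2)) / (4.0 * n**2 * m**2)

def style_loss(outs, styles, weights) -> float:
    """Weighted sum of per-layer style errors."""
    return sum(w * layer_style_error(o, s) for o, s, w in zip(outs, styles, weights))

f_style = np.array([[1.0, 0.0], [0.0, 1.0]])
f_out = np.array([[1.0, 0.0], [0.0, 0.0]])

e_l = layer_style_error(f_out, f_style)                          # 1 / 64
total = style_loss([f_out, f_out], [f_style, f_style], [0.5, 0.5])
print(e_l, total)
```

With equal weights summing to one, the total equals the single-layer error, which makes the role of the layer weights easy to check by hand.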
In Formula (6), $O_{content}$ stands for the expected content image, $O_{style}$ stands for the expected style image, and $R$ stands for the image that needs to be generated. $\alpha$ and $\beta$ are hyperparameters that set the proportions of style loss and content loss in the total loss. The results of IST based on CNN usually cannot preserve the local texture features of the original image. Therefore, this method has problems with the representation of deep features and with the accuracy of the image transfer model.
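The trade-off between content and style can be made concrete with a short sketch. It assumes the total loss is the weighted sum $\alpha L_{content} + \beta L_{style}$, with $\alpha$ on the content term and $\beta$ on the style term. That assignment and the numeric values are assumptions for illustration.

```python
def total_loss(content: float, style: float, alpha: float = 1.0, beta: float = 100.0) -> float:
    """Assumed form of the total loss: alpha * content + beta * style."""
    return alpha * content + beta * style

content, style = 2.0, 0.015625  # toy loss values

balanced = total_loss(content, style, alpha=1.0, beta=100.0)
content_only = total_loss(content, style, alpha=1.0, beta=0.0)
print(balanced, content_only)
```

Raising `beta` relative to `alpha` pushes the optimizer toward matching style statistics, while setting `beta` to zero reduces the objective to content reconstruction alone.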

B. Image Style Transfer Model Construction Based on Improved Normalized Residual Network
The IST network model, as a type of generative network, should have strong feature learning ability and be able to process various image features accurately and reasonably. In response to the shortcomings of the above image style transfer method, a residual network structure is introduced to construct an improved normalized residual network (ResNet-IN) image style transfer model. The residual module is added to the basic CNN structure. The deep residual neural network is an important branch of DL. By introducing residual blocks, ResNet can effectively alleviate the vanishing gradient problem in CNN training. It is widely used in fields such as image classification, object detection, and face recognition [18]. The core of ResNet is the residual block. Between convolution layers at different depths of the CNN, ResNet introduces a special connection for information transmission, the skip connection, as shown in Fig. 3. Skip connections can cross a multi-layer network structure to connect two non-adjacent convolutional layers, thereby achieving information transmission. In Fig. 3, $x$ represents the feature image output from the previous network layer. Then, $x$ is input into the residual block for the convolution operation. Finally, the features containing the residual information, $F(x)$, are output. The expression of $F(x)$ is shown in Formula (7).

$$F(x) = W_2\,\sigma(W_1 x) \tag{7}$$

In Formula (7), $W_1$ and $W_2$ stand for the weights of the two weight layers, and $\sigma$ represents the activation function. The final output of the residual block is represented as $H(x)$, as illustrated in Formula (8).

$$H(x) = F(x) + x \tag{8}$$

The image style transfer model includes a residual network module and a sampling network module. The image is first down-sampled by convolutions with a stride of 2 and then sent to the residual network module for processing. Convolutions with a stride of 1/2 then up-sample the result and restore it to the original size at the network output. The basic process is shown in Fig. 4. In this process, the output part of the network uses the Tanh function to limit the image pixel values to [0, 255]. The image style transfer generation network constructed in the study is regarded as the objective function $y = f(x)$, where $x$ represents the initial image, $y$ stands for the image after style transfer, and $\hat{y}$ stands for the output value.
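A minimal sketch of the residual block and of the output squashing may help here. It makes several simplifying assumptions that are not from the paper: dense matrices stand in for the convolutional weight layers, ReLU is used as the activation, and the Tanh output is rescaled as $127.5\,(\tanh(z) + 1)$, since Tanh alone only spans $[-1, 1]$.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def residual_block(x: np.ndarray, W1: np.ndarray, W2: np.ndarray) -> np.ndarray:
    """H(x) = F(x) + x, with the residual branch F(x) = W2 @ relu(W1 @ x)."""
    residual = W2 @ relu(W1 @ x)
    return residual + x  # the skip connection adds the input back unchanged

def to_pixel_range(z: np.ndarray) -> np.ndarray:
    """Scaled tanh (an assumption): maps any real value into [0, 255]."""
    return 127.5 * (np.tanh(z) + 1.0)

x = np.array([1.0, -2.0, 3.0])
zeros = np.zeros((3, 3))
eye = np.eye(3)

identity_out = residual_block(x, zeros, zeros)  # zero weights: block passes x through
mixed_out = residual_block(x, eye, eye)         # relu(x) + x
pixels = to_pixel_range(np.array([-1e3, 0.0, 1e3]))
print(identity_out, mixed_out, pixels)
```

With zero weights the block reduces to the identity mapping, which is the property that lets gradients flow through deep stacks of such blocks.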
$\phi_z(x)$ refers to the value of the activation function of layer $z$. In layer $z$, the output of the activation function is a feature map of size $C_z \times H_z \times W_z$, where $C_z$ represents the number of feature maps, and $H_z$ and $W_z$ represent the length and width of the convolutional layer. The Euclidean distance between the feature maps of the objective function to be optimized is shown in Formula (9).
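The distance between feature maps can be illustrated with a short sketch. It assumes the normalized squared Euclidean distance used in feed-forward style transfer networks, $\frac{1}{C_z H_z W_z}\lVert \phi_z(\hat{y}) - \phi_z(y) \rVert_2^2$. The function name and the toy tensors are hypothetical.

```python
import numpy as np

def feature_distance(phi_out: np.ndarray, phi_target: np.ndarray) -> float:
    """Squared Euclidean distance between two (C, H, W) feature maps, divided by C*H*W."""
    c, h, w = phi_out.shape
    return float(np.sum((phi_out - phi_target) ** 2)) / (c * h * w)

phi_out = np.zeros((2, 2, 2))     # activations of the network output at layer z
phi_target = np.ones((2, 2, 2))   # activations of the target image at layer z

distance = feature_distance(phi_out, phi_target)
print(distance)
```

Dividing by the feature map size keeps the value comparable across layers with different resolutions and channel counts.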
Currently, the commonly used residual generation network can transfer image styles rapidly, but the final transfer effect falls short of expectations. Therefore, the idea of normalization is adopted to improve the residual generation network. In the image transfer model, in addition to the residual blocks, a normalization layer is added. This structure addresses the contrast problem that arises in the initial image during style transfer. The contrast of the transferred image generated by the IST generation network is then less affected by the contrast of the content image, resulting in a higher-quality IST effect. Specifically, the IST network includes basic operations such as convolution, residual connection, sampling, and normalization.
However, the basic batch normalization layer computes its normalization statistics over all images in a batch. This normalization method is strongly affected by noise and consumes a lot of storage, resulting in low applicability. Therefore, the Instance Normalization (IN) method is introduced to normalize each sample image individually. The mathematical definition is shown in Formula (11) [19].
The specific process of instance normalization operation is shown in Fig. 5.
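Instance normalization can be sketched in NumPy as follows. The sketch assumes the usual definition, in which each channel of each sample is normalized by its own spatial mean and variance. The learnable scale and shift are omitted, and `eps` is a small stabilizing constant chosen for the example.

```python
import numpy as np

def instance_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize every (sample, channel) plane of a (batch, channels, H, W) tensor."""
    mean = x.mean(axis=(2, 3), keepdims=True)  # one mean per sample and channel
    var = x.var(axis=(2, 3), keepdims=True)    # one variance per sample and channel
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(2, 3, 8, 8))

y = instance_norm(x)
y_high_contrast = instance_norm(2.0 * x)  # same images with doubled contrast
print(y.shape, float(np.abs(y - y_high_contrast).max()))
```

Because the statistics are computed per image rather than per batch, doubling the contrast of the input leaves the normalized output almost unchanged, which is the contrast independence the text attributes to this layer.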
Instance normalization replaces batch normalization in the model. It can effectively enhance the learning efficiency of the model and avoid shifts in the statistical characteristics of the data. The effect of IST is evaluated here with Mean Intersection over Union (MIoU) [20]. MIoU is the standard metric for semantic segmentation and measures the ratio of the intersection to the union of two sets. In the measurement of IST, the two sets are the style images and the style transfer images. The calculation method is shown in Formula (12). In Formula (12), $k + 1$ is the number of classes, and $p_{ij}$ is the number of pixels whose true value is $i$ and whose prediction is $j$. The larger the MIoU, the smaller the difference between the prediction and the true value. In IST, a larger MIoU means the style transfer image is closer to the initial style image.
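The MIoU computation can be sketched from a confusion matrix. The sketch assumes the standard segmentation definition, $\mathrm{MIoU} = \frac{1}{k+1}\sum_i \frac{p_{ii}}{\sum_j p_{ij} + \sum_j p_{ji} - p_{ii}}$, applied to integer label maps. How the style images are turned into label maps is not specified in the text, so the toy labels below are purely illustrative.

```python
import numpy as np

def mean_iou(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int) -> float:
    """Mean per-class intersection over union for two integer label maps."""
    # Confusion matrix: p[i, j] counts pixels with true class i predicted as class j.
    p = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(p, (y_true.ravel(), y_pred.ravel()), 1)
    intersection = np.diag(p)
    union = p.sum(axis=1) + p.sum(axis=0) - intersection
    present = union > 0  # skip classes absent from both maps
    return float(np.mean(intersection[present] / union[present]))

y_true = np.array([[0, 0],
                   [1, 1]])
y_pred = np.array([[0, 1],
                   [1, 1]])

miou = mean_iou(y_true, y_pred, num_classes=2)  # IoUs are 1/2 and 2/3
perfect = mean_iou(y_true, y_true, num_classes=2)
print(miou, perfect)
```

A single mislabeled pixel lowers both class scores, which is why the metric is sensitive to small regional disagreements between the two images.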

IV. PERFORMANCE ANALYSIS OF IST GENERATION NETWORK BASED ON NORMALIZED RESNET
A generative model of IST based on the normalized residual network has been constructed on the basis of CNN. This section tests and verifies the performance of the image style generative model. The model is then applied to actual IST, and the transfer effect is analyzed.

A. Performance Analysis of Improved Normalized Residual Network
To validate the performance of the image style transfer model based on the improved normalized ResNet, experiments are designed to verify the usability of the method. The experimental environment is listed in Table Ⅰ. First, the usability of IST based on the residual network is verified. The dataset used for model training is Microsoft COCO, which contains over 80,000 images. A total of 5,000 images are randomly selected for model training, and an additional 1,500 images are selected as the test set. The image feature data is input into the residual CNN for iterative training. The LF is used at the output end of the model to measure the prediction error. The LF and training accuracy of the proposed method under different learning rates are illustrated in Fig. 6. From Fig. 6, when the learning rate is 0.0001, the model is prone to overfitting. When the learning rate is 0.01, the model does not reach the optimal accuracy. When the learning rate is 0.001, the loss value of the model is 0.2, the accuracy fluctuates the least, and the accuracy is the best. Therefore, 0.001 is taken as the learning rate for model training. To compare the performance of different methods, the proposed IST method based on the improved normalized residual network is compared with commonly used methods: the CNN algorithm, the CNN image style transfer model with batch normalization (CNN-BN), Fast Neural Style Transfer (FNST), and A Neural Algorithm of Artistic Style (NAAS) [21]. The AUC values of the models are compared on the training and testing sets, as displayed in Fig. 7.

B. Application Effect Analysis of IST Generation Network based on Normalized ResNet
To explore the transfer effect of the proposed model on images of different styles, eight types of images with different styles are selected for five tests. Table Ⅱ lists the experimental results. From Table Ⅱ, on this testing platform the time required for each style transfer test differs only slightly across the eight styles. Overall, the actual time required for image transfer with different styles is less than 3.5 s, indicating good efficiency in image style transfer. The MIoU values of the proposed method and the commonly used methods are compared in Table Ⅲ. From Table Ⅲ, the MIoU value of the CNN-based image style transfer method is 0.728, and its transfer effect is the weakest. The MIoU value of CNN-BN is 0.82, that of FNST is 0.864, and that of NAAS is 0.858. The difference among these three methods is relatively small. The MIoU value of the ResNet-IN method is the highest, at 0.9, indicating that the style transfer image obtained with this method is closest to the original image, with the smallest difference between the two. The image style restoration obtained using this method is the best. In summary, the IST model proposed in the study can significantly improve the efficiency and quality of image style transfer, provide support for art and design activities represented by painting, and meet the needs of image style transfer in more practical scenarios.

However, there are still shortcomings in the research. Existing image style transfer methods are mainly influenced by factors such as algorithm execution speed, flexibility, and image quality. Although the proposed method has optimized execution speed and the quality of image extraction, it still has shortcomings in the continuity of image style features. In addition, once training is completed, the model can only transfer similar image styles, which limits its flexibility. Therefore, future research will further optimize the image style transfer ability with respect to the continuity of image feature extraction and the flexibility of image conversion.


Fig. 4. The basic process of residual block sampling.

Fig. 5. Image style transfer process based on instance normalization operation.

Fig. 6. Loss value and accuracy rate under different learning rates.

Fig. 7. AUC value of three models.

The accuracy of several common methods is verified. The changes in the loss function and accuracy of the different models are shown in Fig. 8. In Fig. 8, among the common image style transfer methods, the CNN-BN method has the lowest accuracy, only 82.51%. NAAS has an accuracy of 85.69%, and FNST has an accuracy of 91.26%. The ResNet-IN method used in the study has an accuracy of 95.73%. In terms of loss, the ResNet-IN method also has the best loss value. This indicates that the ResNet-IN image style transfer method has the best performance and can obtain transfer images with higher accuracy.

Fig. 10. Accuracy, recall rate and F-measure value of the three models.

V. CONCLUSION
Image style transfer can transfer the style features of one painting to another portrait. With the development of artificial intelligence technology, IST technology based on DL has been widely developed. However, existing image style transfer technologies still have issues of low efficiency and low image quality. Therefore, the ResNet structure is introduced to optimize the existing CNN. At the same time, the idea of normalization is used to address the contrast problem that occurs during image transfer. An IST generative model based on the normalized ResNet is constructed. According to the findings, the AUC values of the proposed ResNet-IN model are 0.985 and 0.993 on the training and testing sets, significantly higher than those of commonly used methods. Among several common image style transfer methods, the CNN-BN method has the lowest accuracy, only 82.51%. The accuracy of NAAS is 85.69%, and the accuracy of FNST is 91.26%. The accuracy of the ResNet-IN method used in the study is 95.73%. The proposed ResNet-IN image style transfer method has the best performance and can obtain transfer images with higher accuracy. In terms of the image restoration obtained by the different style transfer methods, the proposed method has the highest MIoU value, 0.9.
Liu et al. proposed a data augmentation method to address the issue of limited and imbalanced sample data for transformer faults. A deep residual network with identity paths was introduced to construct a fault diagnosis model, enabling effective transmission and updating of weight parameters. According to the findings, the method can effectively expand the data samples while keeping high similarity to the original data. The ResNet has strong feature extraction ability and enhances the accuracy of fault diagnosis [7]. Existing automatic sleep staging algorithms have too many parameters and long training times, which leads to poor sleep staging efficiency. Therefore, Yun L et al. proposed an automatic sleep staging algorithm based on transfer learning for random deep residual networks (TL-SDResNet) using single-channel EEG signals. Experiments have shown that this model can be trained rapidly. Its overall performance is superior to other classic algorithms, and it has practical value [8]. Yousefzadeh et al. introduced an image retrieval system based on a dilated residual CNN with triplet loss. This system has fewer parameters and provides acceptable accuracy for the top retrieved images.

TABLE II. EFFICIENCY COMPARISON OF DIFFERENT STYLES OF IMAGES

TABLE III. MIOU COMPARISON OF IMAGES WITH DIFFERENT STYLES

The results are illustrated in Fig. 10. The accuracy, recall, and F1 values of the ResNet-IN model are 97.35%, 96.49%, and 97.52%, respectively. The three values of NAAS are 95.21%, 94.67%, and 95.23%, respectively. The three values of CNN-BN are 96.38%, 90.77%, and 93.50%, respectively. The three values of FNST are 95.21%, 91.42%, and 93.76%, respectively. The three values of CNN are 90.76%, 91.29%, and 94.08%, respectively. Overall, the ResNet-IN image style transfer method constructed in the study has more advantages in application: it achieves IST more accurately and ensures the quality of the stylized images.