An Automated Convolutional Neural Network Based Approach for Paddy Leaf Disease Detection

Bangladesh and India are among the world's major paddy-cultivating countries, and paddy is the principal crop of Bangladesh. Over the last 11 years, agriculture contributed about 15.08 percent of Bangladesh's Gross Domestic Product (GDP). Farmers who work hard to grow this crop nevertheless face heavy losses from crop damage caused by paddy diseases. More than 30 paddy leaf diseases are known, of which about 7-8 are common in Bangladesh. Brown spot, blast, and bacterial leaf blight are among the best known and most damaging. These diseases hamper the growth and productivity of paddy plants and can lead to large ecological and economic losses. If they can be detected early, accurately, and quickly, crop damage and farmers' losses can be greatly reduced. This paper considers four disease classes and one healthy leaf class. Its main goal is to replace the traditional manual detection process, which is slow and of questionable accuracy, with an automated detection approach based on deep CNN models. Four models were analyzed (VGG-19, Inception-ResNet-V2, ResNet-101, and Xception); Inception-ResNet-V2 achieved the best accuracy, 92.68%.

Keywords: Paddy leaf disease; deep convolutional neural network (DNN); transfer learning; VGG-19; ResNet-101; Inception-ResNet-V2; Xception


I. INTRODUCTION
At the start of the 21st century, paddy (Oryza sativa) is still the main cereal in human food systems and the principal source of energy and of a significant share of the protein consumed by almost three billion people [1]. More than 90% of the world's paddy is produced in the Asia-Pacific region [2]. In Bangladesh, paddy is the principal food crop: about 75% of the total cropped area and over 80% of the total irrigated area is planted to rice. As a result, paddy plays an important role in the subsistence of the people of Bangladesh [3].
Farmers face many problems in paddy cultivation, such as damage to arable land, population growth, climate change, pests, and diseases. Because of these problems, farmers are becoming less interested in paddy cultivation. Among these problems, this paper focuses only on pests and diseases. Paddy diseases fall into three main types: bacterial, fungal, and miscellaneous. They include bacterial blight, bacterial leaf streak, brown spot, leaf smut, leaf scald, panicle blight, bronzing, etc. [4]. The incidence of disease has recently become severe because of the adverse effects of climate change, especially rising temperature (IPCC, 2007). It is estimated that 4-14% of the rice yield in Bangladesh is lost each year to pests and diseases. Bacterial leaf blight (BLB) and brown spot are currently serious diseases of rice, but technologies for controlling pests and diseases remain limited [5].
Traditionally, paddy disease is recognized through naked-eye observation by specialists, which is time-consuming and costly on large farms [6]. It is hard to quantify, and it sometimes misidentifies the disease type [7]. Because of a lack of knowledge about the proper management of paddy leaf diseases, paddy production has been decreasing recently [8]. To overcome this, appropriate and fast recognition methods are required for diagnosing paddy leaf diseases. This work focuses on five classes: brown spot, leaf blast, bacterial blight, leaf smut, and healthy leaf.
The revolution in Artificial Intelligence (AI) has made it easier to maintain a standard of living. As in other sectors, AI has contributed widely to agriculture. Technology has made many agricultural problems easier to solve, and plant disease is one of them. Machine learning and deep learning can now perform much of the work of disease detection and, despite some limitations, have largely succeeded. As a result, farmers can detect paddy disease in their own fields without the help of an expert. Technology is likely to bring many more changes to the agriculture sector in the future.
According to a survey conducted in 1979-1981, 20 paddy diseases were reported in Bangladesh [9], 13 of which were identified as important. In 2019, according to the Rice Knowledge Bank of Bangladesh, bacterial leaf blast was one of the most deleterious diseases. In Bangladesh, leaf blast, leaf blight, and brown spot are very common diseases in paddy cultivation.
This paper focuses on four paddy diseases (brown spot, leaf blight, leaf smut, and bacterial leaf blast) and one healthy leaf class. A deep convolutional neural network approach was selected, and the dataset was used to train four pre-trained DNN models: VGG-19, Xception, Inception-ResNet-V2, and ResNet-101.

II. RELATED WORK
Many studies have been carried out on rice diseases, and research on these diseases and their cures is ongoing. Kawcher Ahmed et al. [10] used machine learning techniques to detect three major paddy leaf diseases. To obtain better accuracy, they used four machine learning models and 10-fold cross-validation.
Milon Biswas et al. [11] worked on only three paddy diseases and applied one classifier. Their pipeline takes the images, converts them to grayscale, segments them, applies an SVM classifier, and finally predicts the result.
Wen-Liang Chen et al. [12] addressed rice blast, one of the major paddy diseases. Using Internet of Things and Artificial Intelligence technologies, they focused on agricultural sensors that generate non-image data, which the AI mechanism can train on and analyze automatically in real time. Their system detects plant disease fairly efficiently.
S. Ramesh et al. [13] used a deep neural network optimized with the Jaya algorithm for the recognition and classification of paddy leaf diseases. They worked on four paddy diseases: bacterial blight, brown spot, sheath rot, and blast.
Farmers currently face heavy losses from paddy diseases. Eusebio L. Mique, Jr. et al. [14] focused on how to measure and control different types of paddy diseases easily using a Convolutional Neural Network (CNN) and image processing. Their data were collected from internet sources and captured manually.
David F. Nettleton et al. [15] compared four models: two operational process-based models and two based on machine learning algorithms. They focused on a single plant disease (leaf blast) and described it in detail. Process-based and data-driven models can both be used to give early warnings of rice blast and to estimate its severity, thereby supporting fungicide applications.
A great deal of work has been done, and is still being done, on paddy disease detection using AI. Jay Prakash Singh et al. [16] focused on detecting and classifying paddy disease using modern image processing and machine learning techniques. Their work proceeds in four stages: image preprocessing, segmentation, feature extraction, and classification. It is a review paper that examines which of the various techniques detect rice leaf diseases best.
When farmers apply pesticides to eradicate paddy diseases, they often have trouble judging the severity of the disease, which is very difficult to do manually. As a result, they apply more pesticide than necessary. To solve this problem, Prabira Kumar Sethy et al. [17] developed a prototype that measures the severity of various paddy diseases and indicates how much pesticide is needed. To develop this prototype, they used fuzzy logic from computational intelligence, a subfield of Artificial Intelligence, together with machine learning segmentation techniques. Their aim was to reduce pesticide use and thereby reduce pollution.
S. Ramesh et al. [18] proposed a mechanism for rice blast leaf disease detection using KNN and ANN algorithms. They focused on Indian rice crops, a single rice leaf disease, and detection of the disease in its early stages. Their best accuracy, 99%, was obtained with the ANN.
Early and correct recognition of any plant disease is an essential step in grain protection. Vimal K. Shrivastava et al. [19] focused on overcoming the shortcomings of traditional plant disease detection systems. They worked on four classes: three diseases and one healthy leaf class. They used a pre-trained deep CNN model (AlexNet), an SVM classifier, and transfer learning, achieving an accuracy of 91.37%.
Dengshan Li et al. [20] proposed a mechanism that detects rice leaf disease in real-time video using deep learning techniques. They used Faster R-CNN for image detection in video and also used various deep CNN models such as VGG16, ResNet-50, ResNet-101, and YOLOv3.
Gittaly Dhingra et al. [21] presented a comprehensive study of paddy disease detection and classification using image processing techniques. They discussed two issues in rice disease detection and classification.
Junde Chen et al. [34] studied five paddy leaf diseases using a deep learning approach with transfer learning. They used two deep learning components, DenseNet and the Inception module, and achieved an accuracy of 98.63%.
Table I summarizes the limitations of these studies, which leave considerable scope for further research on paddy disease. This study works on five classes: four diseases and one healthy leaf.

Author Information | Limitations
Kawcher Ahmed et al. [10] | They plan to work with high-quality datasets in the future and to focus on achieving better accuracy with more advanced models.
Milon Biswas et al. [11] | They clarify that their dataset is very small: it covers three paddy diseases with only 30 images, so they depend on assumptions when measuring performance.
Prabira Kumar Sethy et al. [17] | They worked on just four types of paddy diseases. They will work with other paddy diseases in the future.
S. Ramesh et al. [18] | They worked on just one rice leaf disease (leaf blast). In the future, they will work on other rice leaf diseases or other crops.
Wen-Liang Chen et al. [12] | They focused on only one rice leaf disease.
Vimal K. Shrivastava et al. [19] | They declared that their proposed model would give better results if the dataset could be enlarged.
Gittaly Dhingra et al. [21] | The proposed model can be further customized for cases where two diseases need to be identified and classified together. Better accuracy requires more data in the dataset and more advanced algorithms. Mobile-based applications can be built for instant solutions.
Dengshan Li et al. [20] | They declared that their proposed system could be applied to other rice diseases and pests.
S. Ramesh et al. [18] | In the future, any improvement method can be used to enhance the detection and classification of paddy diseases and obtain the best performance by decreasing false predictions.

III. PROPOSED SYSTEM
Much previous work on paddy disease detection has used machine learning and deep learning in a variety of systems. This paper follows a benchmarked approach with a customized deep learning model, shown in the flowchart (Fig. 1). After the infected paddy leaf images are acquired, they are preprocessed. The preprocessed images then pass through the deep convolutional neural network. The convolutional blocks of the models extract the main features from the input images. Based on these features, the DNN model sets the weights of each node. The final dense layer of the model contains five nodes, and a softmax activation function predicts the class of the given data.
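The role of the five-node softmax output can be illustrated with a minimal NumPy sketch. The class order and the raw scores below are assumptions chosen for illustration; the paper does not specify either.

```python
import numpy as np

# Class order is an assumption for illustration; the paper does not fix one.
CLASSES = ["brown_spot", "healthy", "leaf_blast", "leaf_blight", "leaf_smut"]


def softmax(logits):
    """Convert raw scores from the final dense layer into class probabilities."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical raw outputs of the five output nodes for one image.
logits = np.array([1.2, 0.3, 3.1, -0.5, 0.0])
probs = softmax(logits)

# The predicted class is the one with the highest probability.
predicted = CLASSES[int(np.argmax(probs))]
print(predicted, probs.round(3))
```

The probabilities sum to one, so the largest entry can be read directly as the model's confidence in the predicted class.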
The system reads the images from the dataset and preprocesses them by rotating, zooming, flipping, shuffling, and resizing. Four deep CNN models are then applied (VGG-19, ResNet-101, Inception-ResNet-V2, and Xception), with the main focus on feature extraction and classification. Finally, the result is predicted by the best model.

IV. SYSTEM OPERATION
This is quantitative applied research based on deep learning. This section discusses the methods used in the research.

A. Feature Extraction and Segmentation using CNN
A convolutional neural network (CNN) is a feed-forward artificial neural network [22]. Its convolutional layers perform feature extraction [23], as shown in Fig. 2, while fully connected layers based on the artificial neural network (ANN) perform the classification. A fully connected (FC) layer contains multiple nodes, and each node is connected to every node of the next FC layer. Small visual inputs need few neural nodes, so fully connected blocks alone can be sufficient. For a large image, a purely fully connected network needs many more parameters [25]. In a CNN, by contrast, each node is connected only to a small region of neurons in the next convolutional layer. The network compares the given visual data with small patterns, part by part; each such pattern is called a feature of the image [24].
The convolutional layer first lines up a feature (filter) with a region of the input image and multiplies each image pixel by the corresponding feature pixel. It then sums the products and divides by the total number of pixels in the feature. The result is stored in the feature map, and the filter is moved across the entire image until the map is complete. Every feature goes through this process, producing one feature map per feature. The output of the convolutional layer is given by Eq. (1):

$$u_{ijm} = \sum_{k=0}^{K-1} \sum_{p=0}^{H-1} \sum_{q=0}^{H-1} z_{i+p,\,j+q,\,k}\, h_{pqkm} + b_{ijm} \tag{1}$$

where $z$ is the input with $K$ channels, $h_{pqkm}$ is the $H \times H$ weight of filter $m$, used as an identical value at every position, and the bias is commonly set as $b_{ijm} = b_m$, which does not depend on the position of the pixel in the image.

The Rectified Linear Unit (ReLU) activation function is then applied. It replaces every negative value in the feature map with zero, as shown in Eq. (2):

$$f(x) = \max(0, x) \tag{2}$$

In the pooling layer, max pooling shrinks the input by keeping the maximum value within each region of the feature map generated by the convolutional layer, as in Eq. (3):

$$\tilde{u}_{ijk} = \max_{(p,q) \in P_{ij}} z_{pqk} \tag{3}$$

Here, $P_{ij}$ is the set of pixels in the pooling area. The pixel value $\tilde{u}_{ijk}$ is obtained from the pixel values in $P_{ij}$, separately for every channel $k$.
Finally, the fully connected layer takes the shrunken feature maps from the last pooling layer, flattens them into a single vector, and performs the classification.
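The sequence of convolution, ReLU, max pooling, and flattening described in this subsection can be sketched in plain NumPy on a single-channel image. This is an illustrative sketch only, not the implementation used in the paper: the 6×6 input, the 3×3 filter, and the 2×2 pooling window are assumptions, and the convolution uses the usual sum-plus-bias form rather than dividing by the number of filter pixels.

```python
import numpy as np


def conv2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    fmap = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Multiply pixel by pixel, sum the products, and add the shared bias.
            fmap[i, j] = np.sum(patch * kernel) + bias
    return fmap


def relu(x):
    """Replace every negative value with zero."""
    return np.maximum(x, 0.0)


def max_pool(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = fmap.shape[0] // size * size
    w = fmap.shape[1] // size * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


# Hypothetical 6x6 single-channel image and 3x3 edge-like filter.
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d_valid(image, kernel)  # shape (4, 4)
activated = relu(feature_map)
pooled = max_pool(activated)               # shape (2, 2)
flat = pooled.reshape(-1)                  # vector passed to the dense layers
```

A real CNN layer applies many such filters across all input channels at once; the loop form here is meant only to make each step visible.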

B. Classification based on Transfer Learning
Transfer learning is a machine learning concept in which the knowledge gained on one problem is transferred to another model to solve a related problem [27]. The deep CNN models available as Keras applications are trained on the ImageNet dataset. ImageNet is a large visual database designed for visual object recognition research. The pre-trained models have therefore seen more than a million images spanning a thousand classes [26].
Keras deep learning applications contain multiple convolutional, pooling, and dense layers. The architectures can be separated into blocks of layers, as shown in Figs. 3, 4, and 5. The convolutional blocks of the network contain multiple convolutional layers that extract features from the input data. Fig. 4 shows a residual inception block [32], a convenient arrangement of convolutional layers. Each inception block is followed by a filter expansion layer (1x1 Conv Linear), which scales up the dimensionality of the filter before the concatenation, as shown in Fig. 4.
The feature parameters learned by a model are transferable. Using pre-trained weights in a new model can solve related problems more effectively than training a general model from scratch [27]. The final block of a Keras application contains dense layers for the classification task.
The Keras network architectures are trained on 1,000 image categories, so the final dense layer contains 1,000 nodes, one per category. When the dataset is small, a useful approach is to cut off the top layer of the model and add a customized fully connected layer that classifies only the desired classes [28].
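Replacing the top layer of a pre-trained Keras application with a five-class head might look like the sketch below. It is an assumption-laden illustration rather than the authors' training code: freezing the base, using global average pooling, the dropout rate of 0.5, and the Adam optimizer are all choices made here for the example, and the function name is hypothetical.

```python
NUM_CLASSES = 5              # four diseases plus one healthy class
INPUT_SHAPE = (256, 256, 3)  # image size stated in the preprocessing section


def build_transfer_model(weights="imagenet"):
    """Build Inception-ResNet-V2 without its top layer and add a five-class head."""
    # Imported lazily so that defining this sketch does not require TensorFlow.
    import tensorflow as tf

    # include_top=False drops the original 1,000-node ImageNet classifier.
    # weights="imagenet" downloads the pre-trained weights on first use.
    base = tf.keras.applications.InceptionResNetV2(
        include_top=False, weights=weights, input_shape=INPUT_SHAPE
    )
    base.trainable = False  # assumption: keep the pre-trained features frozen

    # Assumed custom head: pooling, dropout, and a softmax layer of five nodes.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dropout(0.5)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",                 # assumption: optimizer is not given in the paper
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` returns a compiled model whose output shape is `(None, 5)`. The other three architectures can be swapped in by changing the `tf.keras.applications` constructor (`VGG19`, `ResNet101`, `Xception`).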

A. Disease Types
Large quantities of food grain are lost to insects and diseases alone, and research is being done all over the world to eradicate these diseases of rice. More than 30 rice diseases are known in total [9], and according to the 1979-81 survey [9], 20 of them were reported in Bangladesh. Thirteen diseases are common in all three seasons (Boro, Aus, and Aman). Of these, bacterial blight, bacterial leaf streak, sheath blight, sheath rot, leaf blast, brown spot, grain spot, stem rot, and leaf scald were classed as major, while Tungro, Bakanae, Cercospora leaf spot, and zinc deficiency were classed as minor. Only the diseases addressed in this study are briefly discussed in section B.

B. Dataset Descriptions
For this work, four paddy leaf diseases (brown spot, leaf blast, leaf blight, and leaf smut) and one healthy leaf class were chosen. The dataset contains 984 images, collected from internet sources: the UCI Machine Learning Repository [29] and Kaggle [30]. Table II gives the number of images in the dataset and its split into training, validation, and test sets. Fig. 6 and Fig. 7 describe the classes of the dataset in detail.

1) Brown spot:
Brown spot is one of the most common paddy leaf diseases and is caused by a fungus. In the early stage, small, round, dark brown to purple-brown marks can be seen, as in Fig. 6(a). As time passes, the spots on the leaf grow larger and can kill the whole leaf.

2) Leaf blast:
This paddy leaf disease is caused by Magnaporthe oryzae, a kind of fungus. Its primary symptoms are spindle-shaped spots of white to grey-green colour with dark red to brownish borders, as in Fig. 6(b).

3) Leaf blight:
Leaf blight is caused by Xanthomonas oryzae, a kind of bacterium. Infected leaves turn greyish green, then yellow, then straw-coloured, and finally the leaf dies, as shown in Fig. 6(c).

4) Leaf smut:
Leaf smut is a widely distributed paddy leaf disease caused by the fungus Entyloma oryzae. The infected leaf has angular black spots (sori) on both sides, as seen in Fig. 6(d). The black spots are about 0.5 to 5.0 millimetres long and 0.5 to 1.5 millimetres wide.

5) Healthy leaf:
A healthy paddy leaf is free from every kind of disease. It shows no sign of disease, and its colour is green.

C. Data Preprocessing
For data preprocessing, the Keras ImageDataGenerator class was used. The dataset contains 984 images with three colour channels and differing pixel dimensions, so all images were resized to 256×256 pixels. Randomly rotating the training images within a range of 15 degrees provides different viewpoints of the visual object. The width shift, height shift, zoom, and shear ranges were fixed at 0.1. Rescaling is the only preprocessing step applied to both the training and the testing data. The batch size is eight for the training set and one for the test set.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 1, 2021, www.ijacsa.thesai.org

VI. MODEL DESCRIPTIONS
Deep Convolutional Neural Network: Deep CNN models are feed-forward neural networks whose parameters are adjusted during training to reduce the value of the cost function. These models have performed well in analyzing visual imagery, image and video classification, object detection, and natural language processing. A deep CNN model is based on convolutional neural networks but contains a larger number of layers than a general CNN model. The model generally consists of convolutional layers, activation layers, pooling layers, a flatten layer, dropout, batch normalization layers, and dense layers. Dropout is used to avoid overfitting. Activation functions such as ReLU and softmax are added to the network layers to help the model learn the complex patterns that exist in the data. Table V shows that the different deep CNN models perform well for classification and detection. The four models were selected for the variety of their architectural designs and depths. In Table III, the Top 5 accuracy refers to the validation accuracy on the ImageNet validation dataset, and depth refers to the topological depth of the network.
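Returning to the preprocessing settings of the Data Preprocessing subsection, the stated ranges map onto Keras `ImageDataGenerator` arguments roughly as in the sketch below. The rescale factor of 1/255, the horizontal flip, the directory layout, and the helper name `make_generators` are assumptions made for illustration; the paper states only that rescaling is applied and that images are flipped. Note that `ImageDataGenerator` is marked as deprecated in recent TensorFlow releases.

```python
IMAGE_SIZE = (256, 256)
RESCALE = 1.0 / 255  # assumption: the paper does not state the rescale factor

# Augmentation applied to the training images only.
TRAIN_AUGMENTATION = dict(
    rescale=RESCALE,
    rotation_range=15,       # random rotation within 15 degrees
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,    # assumption: flip direction is not specified
)


def make_generators(train_dir, test_dir):
    """Create batch iterators; the directories are hypothetical, one subfolder per class."""
    # Imported lazily so that defining this sketch does not require TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_gen = ImageDataGenerator(**TRAIN_AUGMENTATION).flow_from_directory(
        train_dir, target_size=IMAGE_SIZE, batch_size=8,
        class_mode="categorical", shuffle=True,
    )
    # The test set is only rescaled, with a batch size of one.
    test_gen = ImageDataGenerator(rescale=RESCALE).flow_from_directory(
        test_dir, target_size=IMAGE_SIZE, batch_size=1,
        class_mode="categorical", shuffle=False,
    )
    return train_gen, test_gen
```

Keeping `shuffle=False` for the test iterator preserves the file order, which makes it straightforward to align predictions with true labels when building a confusion matrix.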

A. Result
This study used four Keras pre-trained deep CNN models, listed in Table IV, to classify and detect the leaf diseases. Among them, Inception-ResNet-V2 obtained the highest accuracy, 0.9286, and it also led in precision, recall, and F1 score. ResNet-101 followed with an accuracy of 0.9152. The Xception model achieved an accuracy of 0.8942, and VGG-19 had the lowest accuracy, 0.8143, together with the lowest precision, recall, and F1 score. All models were trained for 100 epochs. This information is presented in Table III, which contains a statistical analysis of the models. The evaluation metrics (accuracy, precision, recall, F1 score) are defined as follows [18].
1) Accuracy: Accuracy is calculated from the confusion matrix. It is the most intuitive performance measure: the ratio of correctly predicted samples to all samples in the dataset [33]. The formula for accuracy is shown in Eq. (4). Accuracy is a dependable measure only when false positives and false negatives occur at similar rates, that is, when the classes are roughly balanced.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$
2) Precision: Precision is the ratio of correctly predicted positive values to the total number of predicted positive values [33]. The formula for precision is shown in Eq. (5).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$
3) Recall: Recall is the proportion of correctly predicted positive values among all actual positives in the confusion matrix. The formula for recall is given in Eq. (6).

$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$
4) F1 Score: The F1 score is the harmonic mean of precision and recall, so it takes both false positives and false negatives into account [33]. Its formula is shown in Eq. (7). The F1 score is less intuitive than accuracy, but it is generally more informative.

$$\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$

Here, TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative.
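The four metrics can be computed directly from a multi-class confusion matrix, treating each class in turn as the positive class. The sketch below uses a small made-up three-class matrix purely for illustration; it is not the paper's data, and the function name is hypothetical.

```python
import numpy as np


def metrics_from_confusion(cm):
    """Return accuracy and per-class precision, recall, F1.

    cm[i, j] counts samples whose true class is i and predicted class is j.
    Assumes every class is predicted and present at least once (no zero division).
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)              # correctly predicted samples of each class
    fp = cm.sum(axis=0) - tp      # predicted as the class but actually another
    fn = cm.sum(axis=1) - tp      # actually the class but predicted as another
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1


# Made-up confusion matrix with three classes (rows: true, columns: predicted).
cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
accuracy, precision, recall, f1 = metrics_from_confusion(cm)
print(accuracy, precision.round(3), recall.round(3), f1.round(3))
```

In the multi-class case, overall accuracy is the sum of the diagonal divided by the total count, while precision, recall, and F1 are reported per class, as in the per-class table for Inception-ResNet-V2.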
Because Inception-ResNet-V2 achieved the highest accuracy of the four models, a per-class performance table was created for it, recording the precision, recall, and F1 score of each class. As can be seen in Table IV, precision is above 0.90 for Brown Spot, Leaf Blast, Leaf Blight, and Leaf Smut, but falls to 0.78 for healthy leaf images. Recall is 0.71 for Leaf Blast and above 0.90 for the other classes. Leaf Blast also has the lowest F1 score, 0.82, while for Leaf Smut the precision, recall, and F1 score are all 1.0. This information and some further details are presented in Table V.
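As a consistency check on the Leaf Blast figures, the F1 definition can be solved for precision $P$ given recall $R$ and $F_1$. Using the reported (rounded) values $R = 0.71$ and $F_1 = 0.82$:

$$F_1 = \frac{2PR}{P + R} \;\Rightarrow\; P = \frac{F_1 R}{2R - F_1} = \frac{0.82 \times 0.71}{2(0.71) - 0.82} = \frac{0.5822}{0.60} \approx 0.97$$

This implied precision is only approximate because the inputs are rounded, but it agrees with the statement that Leaf Blast precision is above 0.90.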

B. Error Analysis
Although detecting disease manually is difficult, technology has made the task much easier. Even so, technology, like humans, cannot always give perfect results, and some limitations remain. The model sometimes confuses one class with another, so even the best model makes some errors. In Fig. 9, there are 8 confusions between Leaf Blast and Healthy Leaf, and 2 confusions of Brown Spot with Leaf Blast and Leaf Blight. The number of errors is nevertheless low.
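Confused class pairs of this kind can be read off a confusion matrix programmatically. In the sketch below, the matrix is made up for illustration: it loosely mirrors the pattern described above, but the diagonal counts, the direction of each confusion, and the class order are assumptions, not the values of Fig. 9.

```python
import numpy as np

# Class order is an assumption for illustration.
CLASSES = ["brown_spot", "healthy", "leaf_blast", "leaf_blight", "leaf_smut"]


def top_confusions(cm, labels, k=3):
    """Return up to k (count, true_label, predicted_label) tuples, largest first."""
    cm = np.asarray(cm)
    pairs = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if i != j and cm[i, j] > 0:  # off-diagonal cells are misclassifications
                pairs.append((int(cm[i, j]), labels[i], labels[j]))
    pairs.sort(key=lambda t: t[0], reverse=True)
    return pairs[:k]


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [
    [20, 0, 1, 1, 0],   # brown_spot
    [0, 25, 0, 0, 0],   # healthy
    [0, 8, 20, 0, 0],   # leaf_blast
    [0, 0, 0, 22, 0],   # leaf_blight
    [0, 0, 0, 0, 15],   # leaf_smut
]
worst = top_confusions(cm, CLASSES)
print(worst)
```

Listing the largest off-diagonal cells first makes it easy to see which pair of classes would benefit most from additional training images or targeted augmentation.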

VIII. COMPARATIVE ANALYSIS
This section discusses different methods of identifying and classifying diseases of paddy and its leaves, which diagnose rice diseases in different ways using various machine learning and deep learning tools and technologies. Much research has been done on this topic and is still ongoing. Some comparisons with previous research are shown in Table VI.
In previous work, many researchers addressed one to four paddy diseases using machine learning algorithms, deep learning models, or computational intelligence concepts. This study worked on four diseases (leaf smut, leaf blast, bacterial leaf blight, and brown spot) and one healthy leaf class using advanced transfer learning-based deep CNN models.

IX. ADVANTAGES
Much previous work has addressed paddy leaf disease detection and classification using machine learning and deep learning approaches. Since rice is the staple food of many countries, these studies can support the development of the agricultural sector in different countries. Because this study focused on four paddy leaf diseases from the perspective of Bangladesh, used deep learning models with transfer learning, and achieved better accuracy, it is particularly helpful for Bangladeshi farmers who need to detect rice leaf diseases easily.

X. CONCLUSION
This study evaluated the performance of four benchmark deep learning network architectures and analyzed them with different statistical measures. Judged by accuracy, precision, recall, and F1 score, the highest test accuracy, 92.68%, was achieved by the Inception-ResNet-V2 network architecture. The data used for model training and testing were collected from different internet sources and local paddy farms. The dataset consists of five classes: four classes of widespread paddy leaf diseases and one class of healthy leaf images. The unique architecture of Inception-ResNet-V2, which consists of stem, reduction, and inception-resnet blocks with a depth of 571, appears to adapt to the dataset better than the other networks. The ResNet-101 network achieved the second-highest testing accuracy, 91.52%. To achieve more accurate prediction of paddy leaf diseases, transfer learning approaches were used. This adoption of transfer learning increased the accuracy and reduced the model training time.

XI. FUTURE WORK
This research can be carried forward with more varieties of paddy leaf diseases and more finely tuned CNN models, with the expectation of better accuracy and faster detection. A detailed, comprehensive study is needed to understand the factors that affect the detection of plant diseases, such as the classes and size of the dataset, the learning rate, and illumination. The appearance of paddy plant diseases changes over time and with the image background, and some images have colour issues. The convolutional neural network models should therefore be modified so that they can detect and classify diseases in these complex or problematic situations. This study can be extended to other types of paddy leaf diseases with larger datasets, and other CNN models can be analyzed as well. In the near future, work will be done to address these limitations, using this research as a base for detecting other plant leaf diseases with greater accuracy. The highest accuracy, achieved by Inception-ResNet-V2, is also a strong motivation to explore this model further and compare it with other CNN models.