Hybrid Decision Support System Framework for Leaf Image Analysis to Improve Crop Productivity

Crop disease is one of the major problems facing agriculture in India. Identifying a disease and classifying its type is essential, and deep learning makes this possible. Doing so requires a verified dataset of healthy and diseased leaf images for the crops of interest. The proposed model uses a hybrid approach that integrates a VGG16 classifier with an attention mechanism, transfer learning, and dropout. Applied to a rice disease dataset, it achieves a training accuracy of 96.45 percent, a training loss of 0.09, and a validation loss of 0.44. The dataset, collected from the PlantVillage project, consists of 4955 rice leaf images in four classes: Brown Spot, Healthy, Hispa, and Leaf Blast. The attention mechanism focuses on a part of the image rather than the whole image, using a glimpse ratio of 3:1. Traditional methods of detecting crop diseases rely on experienced experts in the field, which makes them time consuming, ineffective, and costly. In this study, Deep Convolutional Neural Networks (DCNN) and transfer learning with attention models are used to detect diseases of rice plants without overfitting the model.

Keywords: Deep learning; activation function; attention mechanism; dropout operation; transfer learning; VGG16


I. INTRODUCTION
A country's economy depends on agriculture. One of the factors that affects the agro-economy is plant disease, which can harm an entire crop by spreading throughout the field, so detecting disease is very important. Leaf disease can be identified from an image dataset that comprises both diseased and healthy leaf images, which are the primary source of data for locating the diseased portion of the leaf. Early detection requires monitoring the disease, treating it in time by applying pesticide, minimizing its spread, and so reducing the loss. Testing a CNN model with pre-trained weights gives better disease classification [1]. The proposed approach uses a convolutional neural network (CNN), which is well suited to image analysis tasks such as segmentation. A DCNN-based technique is used to train the VGG16 classifier to distinguish healthy from infected rice leaves using the multiple layers of the CNN model. The CNN-based VGG classifier uses dense layers together with convolution and max pool layers. The dataset consists of four types of leaf images: three diseased classes (Brown Spot, Hispa and Leaf Blast) and one healthy class. The model is initialized with pre-trained ImageNet weights, which were learned from millions of images in 1000 different categories. Using this pre-trained model in the proposed approach reduces the time needed to train on new images. All images have three dimensions: height, width and channels (RGB). Deep learning models require high GPU capacity. Every convolution layer is followed by a ReLU activation function, which increases nonlinearity [2]. The main idea of this work is to design a rice leaf disease detection model using deep learning algorithms, which is achieved with a hybrid approach that combines VGG16 with an attention model.
Vijay Kumar V [3] designed a robot for monitoring agricultural field conditions such as soil moisture, crop quality, and pesticide use, and for supplying the required amount of water. The robot is built on a LattePanda, a Chinese single-board computer with an Intel processor, integrated with a machine learning model, and an Android application controls the robot. The machine learning model performs clustering using the mean shift vector, with a density estimation window that converges to the center of the densest area. For disease classification an SVM classifier is used, which takes labeled input images and outputs the optimal solution. The robot has two motors, one for driving the robot and the other for operating the attached soil moisture and humidity sensors.
Plant viruses are a threat to agricultural productivity [4]. Disease control rests on two aspects: genetic resistance (immunization) and prophylaxis, which restrains virus dispersion by removing infected plants. That paper discusses the effect of plant viruses and their genetic diversity. Plant disease identification methods based on plant DNA are classified as polymerase chain reaction (PCR) and isothermal amplification. In PCR, a specific viral region in the plant is identified and visualized by electrophoresis.
Plant disease identification can be made easier by adopting advanced decision support systems and digitized, data-driven technology [5]. That paper proposes a mathematical model based on deep learning, in which a region proposal network (RPN) is used to segment the images. The segmented images are fed into a transfer learning model trained on various diseases, which obtained an accuracy of 83%.
The remaining sections of the paper discuss existing work in this domain, the methodology adopted for implementation, the proposed VGG16 architecture, and the results obtained with the proposed hybrid approach.

II. LITERATURE SURVEY
Plant disease recognition and classification can be done using image processing techniques. The paper in [6] discusses noise reduction and image segmentation techniques for the diseased part of a leaf, using banana leaves for disease identification. Noise is introduced into an image by various factors, chiefly environmental conditions, variation in light, sensor temperature, and the transmission channel. The paper discusses filtering techniques such as linear, adaptive, and nonlinear filters for removing noise [6].
Plant disease has a serious effect on a country's economy [7]. That paper discusses an image segmentation algorithm that detects and classifies the type of disease appearing on a plant leaf. A digital camera captures the images, which are used to identify the infected part of the leaf, and various image processing techniques are applied to analyze the features. The workflow is as follows. First, the image is acquired and preprocessed to remove noise and improve its quality. Next, a threshold value is computed for the green pixels. Each pixel is compared with the threshold: if the green component is less than the threshold, zero is assigned to the RGB components of that pixel. Finally, the image segments are obtained to classify the leaf disease. The authors performed the experiments in MATLAB on rose, banana, bean, and lemon leaf images with bacterial disease. Image classification using k-means clustering achieved 86.5% accuracy, and detection accuracy improved to 93.3%.
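The green-pixel masking step described for [7] can be illustrated with a minimal sketch. The function name, the nested-list image representation, and the example pixel values are assumptions made for illustration; the cited work used MATLAB and its exact implementation is not given here.

```python
def mask_green_pixels(image, threshold):
    """Zero out every pixel whose green component is below `threshold`.

    `image` is a list of rows; each row is a list of (r, g, b) tuples.
    Pixels at or above the threshold are kept unchanged.
    """
    masked = []
    for row in image:
        masked.append([
            (0, 0, 0) if g < threshold else (r, g, b)
            for (r, g, b) in row
        ])
    return masked


# Tiny 1x3 example image; the values are made up for illustration.
example = [[(200, 40, 30), (20, 180, 25), (90, 120, 60)]]
masked_example = mask_green_pixels(example, threshold=100)
```

Only the first pixel has a green value below 100, so it is the only one set to zero in all three channels.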
In [8] the proposed model uses a wavelet tool for image analysis, with maize leaf images considered for noise removal. The noise is introduced by variation in light, environmental conditions, and the image acquisition equipment. The paper proposes an adaptive local smoothing method.
In [9] image enhancement is done using filtering methods such as the Gaussian, mean, median, and Wiener filters. In a comparative study of these filters, the Wiener filter gave the best result, with a high signal-to-noise ratio. The model is used to identify brown spot disease in rice plants. The paper in [10] uses a random forest to distinguish healthy from diseased leaves. The model creates a dataset, preprocesses the images to bring them to a uniform dimension, and extracts features using the histogram of oriented gradients. A random forest classifier is then trained to classify healthy and diseased leaves. In a comparison with Gaussian naïve Bayes, logistic regression, linear discriminant analysis, and SVM, the random forest showed better accuracy on a smaller dataset. The main objectives of the model proposed here are to identify leaf disease using a hybrid approach, to classify the disease correctly, and to evaluate the performance. Plants are also susceptible to disease because of soil quality and soil health [11]. Frequent changes in climate, the extensive use of pesticides and fertilizers that is common in agricultural ecosystems, and abiotic stresses have degraded crop quality and reduced production [12]. That paper uses arbuscular mycorrhizal fungi (AMF) to enhance crop productivity. AMF are bio-fertilizers that provide plants with tolerance to stressful conditions such as drought, heat, salinity, and extreme temperature and weather. Nitrogen is an essential nutrient for plant productivity, but applying it in excess has a negative impact on plants and the environment [13]. Nutritional exchanges between the plant, arbuscular mycorrhizal fungi, and bacteria help improve plant nutrition, including nitrogen (N) acquisition.
Plant N acquisition can be improved in the presence of N2-fixing symbiotic and associative symbiotic bacteria and arbuscular mycorrhizal fungi (AMF). ML-based techniques have attracted great attention in digital image processing and prediction [14]. Though there are various challenging technologies, that paper presents a hybrid approach that enhances the image, converts it, removes noise, and applies the GLCM technique; finally, a neuro-fuzzy logic classifier is used to extract features and train the model. The model is implemented in MATLAB, and the average test accuracy obtained is 90%. The algorithm proposed in [15] uses a priori information about the shape of plant leaves. The model is compared with state-of-the-art segmentation techniques. It detects leaf tips to improve the segmentation accuracy of the leaf, reducing the error in locating the leaf tips and increasing the detection accuracy. Image processing and soft computing techniques are combined to improve detection accuracy in [16], where plant species are identified automatically from image data. SVM and ANN techniques are applied to the dataset for plant classification. With 32 different types of leaves, the model achieved an accuracy of 94% [17]. In [18] a hybrid approach based on a CNN and a convolutional autoencoder (CAE) is proposed for automatic detection of leaf disease. This hybrid model obtained an accuracy of 99%. The CNN and CAE are integrated because together they efficiently reduce the image dimensionality and extract spatial and temporal features from image data. The CAE network reduces the dimensionality of the image, and its output is given as input to the CNN model, which classifies diseased and healthy plants. The normalized root mean square error is used to measure the reconstruction loss between the original and reconstructed leaf images. The loss of this integrated model is lower than that of other models.

III. PROPOSED SYSTEM ARCHITECTURE
As shown in Fig. 1, the workflow is as follows. First, load the image dataset collected from the PlantVillage project, convert the images to grayscale, bring all images to a uniform dimension, and create a balanced dataset across all image types. Generate the pixel array for each image, normalize the values to the range 0 to 1, and dump the arrays to a pickle file. Shuffle the dataset and visualize it using OpenCV, a computer vision library. Then split the dataset into train, test, and validation sets of 60, 25, and 15 percent respectively. Next, train the VGG16 model starting from pre-trained ImageNet weights rather than from scratch, with a ReLU activation function after every convolution layer and a softmax function at the fully connected layer. Apply the attention glimpse with a ratio of 3:1 to focus on a sub-part of the image rather than the whole image. Then validate the model using the validation set and, once the model is finalized, test it on the test dataset and make predictions. The accuracy of the model is analyzed using the ROC curve. The proposed model achieves an accuracy of 96 percent with the hybrid approach, without overfitting.
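The normalization and the 60/25/15 split in this workflow can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the fixed seed, and the use of integer indices in place of real images are assumptions, not the implementation used in this work.

```python
import random


def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0 to 255) into the range 0 to 1."""
    return [p / 255.0 for p in pixels]


def split_dataset(samples, train_pct=60, test_pct=25, seed=0):
    """Shuffle the samples and split them into train, test and validation.

    The validation set receives whatever remains after the train and
    test portions are taken (15 percent with the default arguments).
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * train_pct // 100
    n_test = n * test_pct // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


# Integer indices stand in for the 4955 images of the dataset.
train, test, val = split_dataset(range(4955))
```

With 4955 samples this yields 2973 training, 1238 test, and 744 validation samples; integer arithmetic is used for the cut points so that the counts do not depend on floating-point rounding.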

IV. PROPOSED VGG16 CLASSIFIER
First, the images are fed into the VGG16 neural network architecture, which contains five convolution blocks and three dense layers. The first convolution block has two convolution layers followed by a max pool layer. The feature map in this block is 180x180x64: the height and width are 180x180, with 64 channels. The max pool layer halves the height and width, and the number of channels doubles in the following block so that more features can be extracted, giving 90x90x128. The second convolution block has two convolution layers followed by a max pool layer, after which the size is 45x45x256. The third convolution block consists of three convolution layers followed by a max pool layer, after which the size is 22x22x512. The fourth convolution block consists of three convolution layers with a max pool layer, which reduces the dimension to 11x11x512. Finally, the last block has three convolution layers with a max pool layer, and the dimension becomes 6x6x512. The dropout operation is applied at the dense layer, where it randomly drops some of the units to avoid overfitting. The attention mechanism is applied with a glimpse ratio of 3:1, so that the model focuses on part of the image at a time rather than the whole image. Finally, the fully connected layer uses the softmax activation function, which selects the most probable class among n classes. After every convolution layer, a ReLU activation is applied, which passes only positive values to the next layer, as shown in Fig. 2. The dataset considered for classification is shown in Table I. The main reason for using a CNN is its advantage in image segmentation. Images can be of any size, a CNN performs better than traditional algorithms, and each image is converted to an array of pixels so that pixel-wise predictions can be made. Fig. 3 shows grayscale images of the leaf classes used in the proposed model: Brown Spot, Leaf Blast, Healthy and Hispa.
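The two activation functions named above can be written out in a few lines. The sketch below is illustrative only: the class order and the score values are made up, and a real network would apply these functions to tensors inside the framework rather than to Python lists.

```python
import math


def relu(values):
    """Pass positive values through unchanged and replace the rest with 0."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class order and made-up scores for one image.
CLASS_NAMES = ["Brown Spot", "Healthy", "Hispa", "Leaf Blast"]
logits = [1.2, -0.3, 0.4, 2.5]
probs = softmax(logits)
predicted = CLASS_NAMES[probs.index(max(probs))]
```

Here the largest score belongs to the last class, so `predicted` is "Leaf Blast"; softmax preserves the ordering of the scores and only rescales them into probabilities.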
The loss function is categorical cross-entropy.
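For a one-hot target y and predicted class probabilities p, categorical cross-entropy is the negative sum over classes of y_i log(p_i), which reduces to the negative log of the probability assigned to the true class. A minimal sketch, in which the function name, the clipping constant `eps`, and the example probabilities are assumptions for illustration:

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities.

    Probabilities are clipped below at `eps` so that log(0) never occurs.
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Four classes, true class is the last one; probabilities are made up.
target = [0, 0, 0, 1]
loss_confident = categorical_cross_entropy(target, [0.05, 0.05, 0.10, 0.80])
loss_uncertain = categorical_cross_entropy(target, [0.25, 0.25, 0.25, 0.25])
```

The confident prediction gives a loss of -log(0.8), while the uniform prediction gives log(4), so the loss falls as more probability is placed on the correct class.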

V. RESULT AND DISCUSSION
Transfer learning with pre-trained ImageNet weights is used in this model together with a Gaussian attention mechanism. The input layer takes a rice leaf image of size 180x180; the early convolution layers with ReLU activation extract low-level features of the image, and the convolution layers at the end of the VGG16 network extract high-level features. The proposed model achieves good accuracy even with a small dataset. This hybrid approach was not found among the state-of-the-art systems covered in the literature study. In this work the hybrid model is applied to classify rice leaves as Hispa, Brown Spot, Leaf Blast, or healthy.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 9, 2021, www.ijacsa.thesai.org
The proposed approach uses transfer learning with ImageNet weights, which were trained on millions of images in 1000 categories. The proposed model considers four classes: three disease classes and one healthy class. The last layer is a fully connected layer with a softmax function, which identifies the most probable class. The learning rate of the VGG network is 0.0001. The weights are updated using the Adam optimizer, the batch size is set to 64, and the model is run for 100 epochs. For the 4955 images, the model obtains a training accuracy of 0.9654, a training loss of 0.09, a validation loss of 0.44, and a validation accuracy of 0.81, as shown in Fig. 4 and 5. The validation loss is higher than the training loss but remains below 1, which is taken here as an indication that the model is not overfit. The proposed model correctly classifies the images. Fig. 6 presents the ROC curve, with (1 - specificity) on the x-axis and sensitivity on the y-axis; the ROC is calculated on the predicted scores. Fig. 7 shows the training loss, training accuracy, validation loss, and validation accuracy at the 50th epoch. In Fig. 7 the training accuracy is greater than the validation accuracy and the loss is less than 1, so the model is considered to classify the disease accurately without overfitting.
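The ROC axes described above can be computed from predicted scores as in the following sketch. It treats a single class in a one-vs-rest fashion and assumes that both positive and negative samples are present; the function names and the six example scores are made up for illustration and are not results from this work.

```python
def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) pairs for a binary problem.

    `labels` holds 1 for the positive class and 0 otherwise; `scores`
    holds the predicted probability of the positive class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Lower the decision threshold step by step, highest score first.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples with the same score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
points = roc_points(labels, scores)
area = auc(points)
```

For a multi-class model such as the one here, the same computation would be repeated once per class, with that class as the positive label.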
The accuracy of the model is cross-validated using the confusion matrix shown in Table II. The values of precision, recall, and F-measure indicate the accuracy of the proposed model. Fig. 8 shows the ROC curve for all classes, with the false positive rate on the x-axis and the true positive rate on the y-axis. The ROC curves indicate good accuracy for the proposed model.
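Precision, recall, and F-measure follow directly from a confusion matrix. The sketch below shows the per-class computation; the 2x2 matrix is a made-up example and does not reproduce Table II, and the function name is an assumption for illustration.

```python
def per_class_metrics(confusion):
    """Compute (precision, recall, F-measure) for each class.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other classes predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f_measure))
    return metrics


# Made-up two-class confusion matrix for illustration only.
confusion = [[8, 2],
             [1, 9]]
metrics = per_class_metrics(confusion)
```

The same function applies unchanged to the four-class case, since each class is scored against all the others.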

VI. PERFORMANCE ANALYSIS
The VGG16 model consists of 16 layers, with one input layer and an output dense layer. The dropout operation randomly drops units at a rate of 0.4 to avoid overfitting. Fig. 9 shows the total number of trainable and non-trainable parameters. Omitting units at the dense layer leads to good accuracy without overfitting over 150 epochs. The performance analysis is done by varying parameters such as the number of epochs, batch size, and dropout rate, as shown in Table III. The performance of the model is evaluated first with transfer learning using pre-trained ImageNet weights, next with the dropout technique added, and finally with dropout and the attention model together. The proposed hybrid model, which uses five convolution blocks with transfer learning, dropout, and the attention mechanism, shows good accuracy, as shown in Table III. Table IV gives a comparative study of various ML models, against which the proposed model shows good accuracy. SVM does not work well with high-dimensional data, k-means is sensitive to outliers, and the decision tree model can be inaccurate because a small change in the data leads to a large change in the tree structure. The proposed VGG16 network is a DCNN classifier that works for data of any size and is fast in computation compared with the other ML models; hence, its accuracy is high. Fig. 10 compares the precision, recall, F-measure, and ROC results of the various ML models and the proposed model.
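The dropout operation at rate 0.4 can be sketched as below. This uses the common "inverted dropout" formulation, in which surviving units are scaled by 1/(1 - rate) during training so that no rescaling is needed at inference; that choice, the function name, and the seeded generator are assumptions for illustration, since the exact variant used is not stated here.

```python
import random


def dropout(activations, rate=0.4, training=True, rng=None):
    """Inverted dropout over a list of activations.

    During training each unit is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate). At inference the input is
    returned unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Apply dropout to 10000 units of value 1.0 with a fixed seed.
rng = random.Random(42)
units = [1.0] * 10000
dropped = dropout(units, rate=0.4, rng=rng)
fraction_zeroed = dropped.count(0.0) / len(dropped)
```

Over many units the fraction set to zero is close to the dropout rate, and the scaling keeps the expected sum of the activations unchanged.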

VII. CONCLUSION
In the proposed model, disease classification is performed on a rice leaf image dataset using pre-trained ImageNet weights. The proposed model achieves high accuracy. Its performance is evaluated by varying the number of epochs, the weights at every layer, and the batch size. The hybrid approach, which integrates transfer learning, dropout, and an attention model, helps in achieving this accuracy. The model uses preprocessing techniques such as binarization of images and segmentation, and then uses a CNN and transfer learning with attention to classify images into their class labels. Compared with existing models, the hybrid DCNN model provides a higher rate of accuracy and correct predictions. Plant disease classification can help farmers detect disease in their crop before the harvesting period, which gives them time to treat the crop before the disease affects the majority of it. The model gives the farmer an accurate classification of the disease so that appropriate countermeasures can be taken. Because farmers are able to detect the disease and prevent its spread, crop yield is boosted, which in turn helps improve the economy.
In future, more real-time data can be collected using drones and robots, and classification techniques can be applied to it to obtain more accurate results. Deploying the model in real time in agricultural fields will help farmers avoid heavy losses due to disease.