Carrot Disease Recognition using Deep Learning Approach for Sustainable Agriculture

Carrot is a fast-growing and nutritious vegetable cultivated throughout the world for its edible roots. Farmers worldwide are still learning scientific methods of carrot production, and modern technology is not yet used to its full potential to detect carrot diseases on farms. As a result, farmers have difficulty continuously monitoring carrot crops and detecting defects in them. Hence, this paper proposes an efficient carrot disease identification and classification method using a deep learning approach, specifically a Convolutional Neural Network (CNN). In this research, six classes (five carrot diseases and healthy carrots) have been examined using four pretrained CNN architectures, i.e., VGG16, VGG19, MobileNet, and Inception v3. Among the four models, Inception v3 is selected as the pretrained CNN architecture on which to build an effective and robust system. The proposed Inception v3 based system takes carrot images as input, determines whether they are healthy or infected, and provides output accordingly. To train and evaluate the system, a dataset consisting of original and synthetic data is used. In the Fully Connected Neural Network (FCNN), dropout is used to reduce overfitting and to improve the accuracy of the system. The Inception v3 based method achieves an accuracy of 97.4%, which can help farmers identify carrot diseases and maximize their benefits in support of sustainable agriculture.
Keywords: Deep learning; convolutional neural network; Inception v3; carrot disease recognition


I. INTRODUCTION
Development in the agricultural sector is a powerful means of alleviating extreme poverty. Many countries depend on agriculture for their economic advancement. In 2018, agriculture accounted for 4% of global GDP, and in some developing countries its share is over 25% [1]. It is a livelihood for the vast majority of the population and directly affects the overall economy of many developing countries such as Bangladesh, India, and Pakistan. For example, in 2019 the share of GDP from agriculture in Bangladesh, Pakistan, India, and Morocco was 12.68% [2], 22.04% [3], 15.96% [4], and 11.38% [5], respectively. In agriculture, especially for growing vegetables, early detection and classification of diseases allow farmers to take preventive measures and reduce both production loss and economic loss, which supports sustainability.
Experts have identified vegetable and fruit diseases by naked-eye observation for decades. However, this approach is ineffective in many cases because it is time-consuming and experts are often unavailable in rural areas. Since diseases leave visible symptoms on the body of vegetables or fruits, those visible defects can be analyzed from images. Hence, deep learning methods provide a solution to this issue.
Carrot (Daucus carota) is a vegetable that is rich in beta carotene, fiber, potassium, antioxidants, minerals, and vitamin K1. Frequent consumption of carrots decreases the risk of breast cancer [6]. It helps to prevent vitamin A deficiency and reduce heart diseases. Carrot is described as a functional food that provides benefits beyond essential nutrition, and it is even good for the eyes [7]. People living in rural areas of least developed countries face health problems due to nutritional deficiency, including vitamin A deficiency. Consumption of carrots can greatly contribute to nutritional security in countries where the standard of diets is relatively poor. Currently, carrot farmers depend mainly on their own experience and visual judgment to identify diseases, which may produce inaccurate results. Hence, it is indispensable to develop a fully automated carrot disease identification system.
In this work, we present a solution based on a deep learning approach that can correctly recognize carrot diseases. We have experimented with five common carrot diseases, namely Black rot, Aster yellow, Sclerotinia rot, Root knot, and Growth crack. We have experimented with several pretrained models, including the MobileNet CNN model [8], VGG16 [9], VGG19 [9], and Inception v3 [10], using the original and synthetic dataset. Finally, we have used Inception v3 in the proposed system because transfer learning is effective for generating an accurate model when the dataset is limited. Our proposed system takes an image of a carrot as input, the CNN layers automatically extract features from that image, and the Fully Connected Neural Network (FCNN) layers predict the class it belongs to. The whole system was implemented using Python, TensorFlow [11], and Keras [12]. The main contributions of this research work are summarized as follows:
• We have proposed an efficient method to identify carrot diseases after analyzing images of infected carrots.
• We have expanded our dataset by generating synthetic data and applying some image processing techniques.
• We have experimented with four different CNN architectures and investigated the classification and detection results of carrot diseases.
• The accuracy rate achieved in this study outperforms the existing works in this domain.
• This work will provide assistance to detect carrot diseases earlier and solve the problems of farmers by taking measures to cure the diseases timely.
The organization of the rest of the paper is as follows: Section II contains a literature review of previous works and explains the major carrot diseases for recognition in this work. The system architecture is illustrated in Section III. In Section IV the research methodology is described, Section V shows the overall result, and Section VI concludes the work including future works.

II. BACKGROUND AND LITERATURE REVIEW
In this section, we describe the five prominent carrot diseases considered in this work and then review the related literature. The dataset is one of the main factors in any deep learning method. As we are working with carrot disease, we have selected six classes of data: a healthy carrot class and five carrot diseases, namely black rot, root knot, sclerotinia rot, growth crack, and aster yellow, as shown in Fig. 1. A short description of these diseases is given below:

A. Carrot Diseases Considered for Recognition
(i) Aster Yellow: Aster yellow is a common disease in carrots caused by Mycoplasma. Affected carrots show symptoms such as yellowing leaves; the tap root produces many fibrous side roots and becomes excessively hairy, tapered, and pale in color.
(ii) Black Rot: The fungus Alternaria radicina causes black rot in carrots [13]. This disease is mainly carried by seed and soil and is characterized by a shiny black decay at the crown area.
(iii) Growth Crack: Growth crack in carrots is caused by soil moisture. The carrot roots split lengthwise.
(iv) Sclerotinia Rot: Sclerotinia rot, also known as white mold, is caused by the fungus Sclerotinia sclerotiorum. The symptoms include fluffy white mycelial growth on the watery rot and black sclerotia on the crown of the infected carrots.
(v) Root Knot: Root knot is generally caused by the nematode Meloidogyne hapla. Forking, galls, hairiness, and stubby roots can be observed in the infected carrots. Very often, multiple tap roots form or roots are malformed.

B. Related Work
Several machine learning and deep learning approaches for object detection have been developed over many years of research. However, only a few of them have been applied to the detection of diseases in fruits and vegetables, and fewer still focus on carrots.
A. Majumder et al. [14] developed a carrot disease recognition system using machine learning methods. In their work, they segmented the disease-affected region using the k-means clustering algorithm. Then, they performed classification using a Support Vector Machine (SVM) classifier. They used a total of 202 images and 11 feature sets in their work. This approach returned 96% accuracy. However, they stated that poor image quality and varying background color can prevent the system from producing accurate results.
S. Sasirekha et al. [15] described image processing techniques to identify and classify carrot vegetable disease. First, they converted the images from RGB to L*a*b color space. Then, they performed k-means clustering technique for segmentation purpose. A total of 13 features were extracted using texture and classification techniques. Then classification was done by multiclass SVM to identify the diseases of carrots.
G. C. Khadabadi et al. [16] used Probabilistic Neural Network (PNN) based classifier to recognize and identify disease in carrot vegetable. They applied Discrete Wavelet Transform (DWT) to extract features. Their proposed system gave an accuracy of 88.0% to classify carrot diseases.
H. Zhu et al. [17] proposed a carrot appearance quality identification based on deep learning. They have utilized AlexNet to extract features from carrot images and to identify carrot quality. The accuracy rate achieved in this work for binary classification recognition was 98.70%.
Rupali Saha et al. [18] proposed a method for the recognition of orange fruit disease involving a deep learning technique. They performed classification using a CNN with a set of 8 features and a dataset of 68 images. They claimed an accuracy of 93.21% for their proposed approach.
M. T. Habib et al. [19] applied a machine learning method for detecting disease on papaya. First, they segmented the disease-attacked region using k-means clustering and performed classification using SVM. They used a set of 10 features and a dataset of 126 images. This approach returned an accuracy of 90.15%.
L. J. Rozario et al. [20] presented a method that involves identification of defective parts of fruits and vegetables. They experimented with four types of fruits and vegetables. They used modified k-means clustering and the Otsu method for color-based segmentation of images. In this work, they used a total of 63 images. No classification was performed in this work.
S. A. Gaikwad et al. [21] presented a fruit disease detection and classification method based on image processing techniques. First, the images were segmented using the k-means clustering algorithm. In the next step, some features were extracted from the segmented images. Finally, an SVM classifier was used for classification.
B. J. Samajpati et al. [22] experimented on three types of common apple diseases. They performed feature-level fusion. In total, 13 features were extracted from every image. k-means clustering was applied for image segmentation, and a random forest classifier was used for disease classification. They used a total of 80 images in their work. Their accuracy ranged from 60% to 100% depending on which features were fused. They did not carry out any performance analysis of the obtained results; rather, they reported only the differing accuracies.
M. Islam et al. [23] proposed an integrated method combining image processing and machine learning techniques for potato plant disease detection. Regions of interest containing disease symptoms were extracted from the images in the L*a*b* color space. In this work, 10 features were extracted. A multiclass SVM classifier was used to classify diseases over 300 images, which gave an accuracy of 95%.
T. T. Mim et al. [24] worked on sponge gourd leaf and flower disease detection using CNN and image processing techniques. They used a very popular pretrained model called AlexNet for detecting diseases. Their proposed model architecture consisted of multiple 2D convolutional layers with ReLU activation functions. They achieved 81.52% training accuracy and a loss of 0.5715 after thirty epochs.
M. T. Habib et al. [25] aimed to recognize jackfruit disease applying a machine vision based agromedical expert system. A total of 480 images were used in their work. First, they converted RGB images into L*a*b color space. Then k-means clustering algorithm was applied for image segmentation. They extracted 10 features from the images. Then they experimented with nine different classifiers for comparison purpose. Among all other classifiers, the random forest classifier produced 89.52% accuracy, which was the highest.
S. K. Behara et al. [26] presented a machine vision-based system to identify and measure the severity of disease in orange fruits. First, they separated the background, foreground, and defective region from the sample images using the k-means clustering method. Then a total of 13 texture features were extracted from the defective region of the images. After that, the extracted features were fed as input into a multiclass SVM classifier. Finally, a fuzzy logic model was used to compute the severity of disease. This work gave an accuracy of 90%.
Most of the existing works in the field of agriculture have been done with traditional machine learning. In most cases, the authors have used separate algorithms for segmentation, feature extraction, and classification. By comparison, little work has applied deep learning to carrot disease recognition. In the proposed system, VGG16, VGG19, MobileNet, and Inception v3 models were used to train and test on the infected carrot images, among which the Inception v3 model outperformed the rest of the models.

III. SYSTEM ARCHITECTURE
The structural design of the proposed deep neural network based system for detecting and classifying carrot disease is shown in Fig. 2. First, the farmer takes an image of the carrot using a mobile phone or any other camera device. The image is then sent to the expert system via the internet. The expert system analyzes the image with the previously trained model to check whether the carrot in the image is infected or not. If the carrot sample is infected, the system identifies the disease. Finally, the system shows the predicted disease name as output and provides expert solutions from its database.
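The final step of this pipeline, turning the model's class probabilities into a disease name and a stored recommendation, can be sketched as follows. This is only an illustration under stated assumptions: the function `diagnose`, the ordering of `CLASS_NAMES`, and the contents of the advice database are hypothetical and are not specified in the paper.

```python
# Hypothetical sketch: map a probability vector to a label and stored advice.
# The class order and the advice texts are placeholders, not values from the paper.
CLASS_NAMES = [
    "aster yellow", "black rot", "growth crack",
    "healthy", "root knot", "sclerotinia rot",
]


def diagnose(probabilities, advice_db):
    """Return the predicted class, its probability, and any stored advice."""
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("expected one probability per class")
    # Index of the largest probability (argmax).
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    label = CLASS_NAMES[best]
    return {
        "label": label,
        "confidence": probabilities[best],
        "infected": label != "healthy",
        "advice": advice_db.get(label, "no advice stored"),
    }


if __name__ == "__main__":
    placeholder_db = {"black rot": "placeholder advice text"}
    print(diagnose([0.01, 0.90, 0.02, 0.03, 0.02, 0.02], placeholder_db))
```

Keeping the advice lookup separate from the classifier means the expert recommendations can be updated without retraining the model.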

IV. RESEARCH METHODOLOGY
In order to implement a carrot disease recognition system, a deep learning model has been built which is described in this section. Fig. 3 shows the detailed steps of building the framework for the system.

A. Image Acquisition and Preprocessing
Deep learning approaches need a large dataset to achieve high accuracy. In this research, a total of 10,655 images, including original, synthesized, and augmented data, are used for training. A dataset of 480 original images has been created, with 80 images per class (five disease classes and one healthy class). Preprocessing techniques have been applied to increase the size of the training dataset. A further 1651 synthesized images have been generated for the six classes by changing the background of the original images. Synthesized data can be used and has a positive effect on training [28]. Data augmentation using TensorFlow and Keras is also applied in five different ways. The overall distribution of the dataset is shown in Table I. Both synthesized data and augmented data helped us to avoid overfitting. Before the training images are fed into the neural network, all input images are resized to 300x300 pixels and rescaled by a factor of 1/255. Data augmentation is then applied using ImageDataGenerator [27] from TensorFlow. Finally, the dataset is ready for the training session.
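The preprocessing described above can be illustrated with a small NumPy sketch. It is not the authors' pipeline, which uses ImageDataGenerator: the nearest-neighbour resize is a stand-in for a proper image-library resize, and the five transforms in `augment_five_ways` are hypothetical, since the paper does not list which five augmentations were used. The arithmetic at the end shows that the stated total is consistent with five augmented variants of each original and synthesized image, which is an inference rather than something the paper states.

```python
import numpy as np


def resize_nearest(image, size=300):
    """Nearest-neighbour resize of an H x W x C array to size x size x C."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols[None, :]]


def preprocess(image, size=300):
    """Resize to size x size and rescale pixel values by 1/255."""
    return resize_nearest(image, size).astype(np.float32) / 255.0


def augment_five_ways(image):
    """Five illustrative augmentations (assumed, not taken from the paper)."""
    return [
        image[:, ::-1],                                      # horizontal flip
        image[::-1, :],                                      # vertical flip
        np.rot90(image),                                     # 90-degree rotation
        np.roll(image, shift=image.shape[1] // 10, axis=1),  # wrap-around shift
        np.clip(image * 1.2, 0.0, 1.0),                      # brightness increase
    ]


if __name__ == "__main__":
    raw = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
    x = preprocess(raw)
    print(x.shape, len(augment_five_ways(x)))
    # Dataset-size check: (original + synthesized) * 5 variants
    print((480 + 1651) * 5)  # 10655
```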

B. Training of Images
We have experimented with several pretrained models, including the MobileNet CNN model [8], VGG16 [9], VGG19 [9], and Inception v3 [10]. As the Inception v3 model provides better performance than the other models, the rest of this paper focuses on this model. The detailed comparison results are discussed in Section V. The FCNN layers have been configured as shown in Fig. 4 to adapt the pretrained model to the proposed system.
In the fully connected layers of the FCNN, dropout [28] has been used to avoid overfitting. The last layer (the output layer) consists of six units, one per class, with the Softmax activation function. This function is used for multiclass classification problems. It takes a vector of real-valued scores as input and transforms it into a probability distribution whose elements sum to 1.
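To make the roles of dropout and the six-unit Softmax output concrete, the following NumPy sketch runs a forward pass through a small fully connected head. It is a simplified illustration, not the configuration in Fig. 4: the single hidden layer, its size, the dropout rate of 0.5, and the names `head_forward`, `w1`, `b1`, `w2`, and `b2` are all assumptions.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(0.0, x)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def dropout(x, rate, rng, training):
    """Inverted dropout: zero a fraction `rate` of units during training only."""
    if not training or rate == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)


def head_forward(features, w1, b1, w2, b2, rate=0.5, rng=None, training=False):
    """Dense + ReLU, dropout, then a softmax output layer (one unit per class)."""
    if rng is None:
        rng = np.random.default_rng()
    hidden = relu(features @ w1 + b1)
    hidden = dropout(hidden, rate, rng, training)
    return softmax(hidden @ w2 + b2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 32))        # stand-in for CNN features
    w1, b1 = rng.normal(size=(32, 16)) * 0.1, np.zeros(16)
    w2, b2 = rng.normal(size=(16, 6)) * 0.1, np.zeros(6)
    probs = head_forward(features, w1, b1, w2, b2, rng=rng, training=True)
    print(probs.shape, probs.sum(axis=1))
```

Dropout is active only when `training=True`; at prediction time the hidden activations pass through unchanged.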
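The Softmax activation function is expressed as:
$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \quad i = 1, \ldots, K$$
where $z$ is the vector of scores produced by the output layer and $K = 6$ is the number of classes. We used the ReLU activation function in the other layers. The ReLU function can be expressed as:
$$\mathrm{ReLU}(x) = \max(0, x)$$
In this proposed system, the RMSprop optimizer has been used as the optimization function and categorical cross-entropy has been used as the loss function. The learning rate has been set to 0.0001.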
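The loss and optimizer named above can be sketched in NumPy as follows. This is a minimal illustration rather than the TensorFlow implementation used in the paper: only the learning rate of 0.0001 comes from the text, while the decay factor `rho`, the stabilizer `eps`, and the function names are assumptions.

```python
import numpy as np


def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=-1)))


def rmsprop_step(param, grad, cache, lr=1e-4, rho=0.9, eps=1e-7):
    """One RMSprop update; `cache` is the running average of squared gradients."""
    cache = rho * cache + (1.0 - rho) * grad ** 2
    param = param - lr * grad / (np.sqrt(cache) + eps)
    return param, cache


if __name__ == "__main__":
    labels = np.eye(6)[[0, 3]]                  # two one-hot labels, six classes
    uniform = np.full((2, 6), 1.0 / 6.0)        # an uninformative prediction
    print(categorical_cross_entropy(labels, uniform))  # equals ln(6)
    param, cache = rmsprop_step(np.array([1.0]), np.array([2.0]), np.zeros(1))
    print(param, cache)
```

A prediction that assigns equal probability to all six classes gives a loss of ln 6, which is a useful reference point when reading the loss curves.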

C. Output Generation
In this research, a total of 233 test images covering the six classes have been used. The test images have been converted to the same form as the training images by resizing and rescaling, and then fed into the proposed model. Next, the outputs for the test images have been collected. To measure the performance of the classification system, we need to compute the True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). These four cases of classification results can be represented by a confusion matrix. After training the model and testing it on the test images, a 6 × 6 confusion matrix has been generated, which represents the relationship between the true labels and the model's predictions, that is, how successfully the classification model can predict.
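As an illustration of how the four counts are obtained from a multiclass confusion matrix, the sketch below builds the matrix from label lists and extracts one-vs-rest counts for a chosen class. The convention that rows hold true labels and columns hold predictions is an assumption, as are the function names; the paper does not state the orientation of its matrices.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes=6):
    """Count matrix with true labels on rows and predicted labels on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def one_vs_rest_counts(cm, k):
    """Return (TP, FP, TN, FN) for class k, treating all other classes as negative."""
    tp = int(cm[k, k])
    fp = int(cm[:, k].sum()) - tp   # predicted k, but true class differs
    fn = int(cm[k, :].sum()) - tp   # true class k, but predicted otherwise
    tn = int(cm.sum()) - tp - fp - fn
    return tp, fp, tn, fn


if __name__ == "__main__":
    y_true = [0, 0, 1, 2, 2, 2]
    y_pred = [0, 1, 1, 2, 2, 0]
    cm = confusion_matrix(y_true, y_pred, n_classes=3)
    print(cm)
    print(one_vs_rest_counts(cm, 0))  # (1, 1, 3, 1)
```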
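Next, we have computed important performance metrics, i.e., accuracy, precision, recall, specificity, false positive rate (FPR), false negative rate (FNR), and F1 score, from the confusion matrix so that the overall performance of the proposed system can be evaluated. The formulas to calculate the performance metrics are given below:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{FPR} = \frac{FP}{FP + TN}$$
$$\text{FNR} = \frac{FN}{FN + TP}$$
$$\text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

V. RESULTS AND DISCUSSION
The experimental analysis of the proposed system, the model-wise analysis, and the comparative analysis with other related works are discussed in this section.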

A. Experimental Analysis of System
In this paper, we have worked with carrot disease. Therefore, six classes of data have been selected for the proposed system: five carrot diseases and healthy carrot. After separate training and validation datasets have been collected, the training data is fed into the final model to train the system, and the validation data is used to check its accuracy. A callback function is applied to save the best model, i.e., the one with the highest accuracy. We have experimented with the VGG16, VGG19, MobileNet, and Inception v3 models. The whole process has been implemented in Python, TensorFlow, and Keras. The epoch-wise accuracy and loss for each of the models can be visualized in Fig. 5, Fig. 6, Fig. 7, and Fig. 8, respectively. It can be observed from Fig. 5, Fig. 6, Fig. 7 [...] matrix has been formed for each of the models. The confusion matrices generated from the models are shown in Fig. 9, Fig. 10, Fig. 11, and Fig. 12, respectively. The accuracy calculated from the confusion matrices in Fig. 9, Fig. 10, and Fig. 11 for the models VGG16, VGG19, and MobileNet is found to be 86.6%, 86.2%, and 89.2%, respectively. From the confusion matrix in Fig. 12, the accuracy of the proposed system based on the Inception v3 model is found to be as high as 97.4%.
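The best-model callback mentioned in this subsection amounts to remembering the epoch with the highest validation accuracy. The paper does not name the callback it uses; in Keras, one common way to get this behaviour is `ModelCheckpoint` with `save_best_only=True`. The plain-Python sketch below shows only the selection logic, with a hypothetical function name and made-up accuracy values.

```python
def best_epoch(val_accuracies):
    """Return (epoch, accuracy) for the highest validation accuracy seen.

    Epochs are numbered from 1. Ties keep the earliest epoch, because a later
    epoch replaces the stored one only when it is strictly better.
    """
    best_idx, best_acc = None, float("-inf")
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_idx, best_acc = epoch, acc
    return best_idx, best_acc


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    history = [0.71, 0.88, 0.86, 0.93, 0.91]
    print(best_epoch(history))  # (4, 0.93)
```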

B. Model-wise Result Analysis
Before selecting Inception v3 as the model for transfer learning, we experimented with several other state-of-the-art pretrained models. The VGG16 and VGG19 pretrained models provide 86.6% and 86.2% accuracy, respectively. Nevertheless, we expected a more accurate system. Therefore, we experimented with the MobileNet CNN model, which provides an accuracy of 89.2%. Finally, the Inception v3 pretrained model has been applied, which provides the highest accuracy of 97.4%, as mentioned previously. Table II, Table III, Table IV, and Table V show the performance for each of the diseases for the models VGG16, VGG19, MobileNet, and Inception v3, respectively. All the performance metrics in these tables are calculated using Equations 3, 4, 5, 6, 7, and 8.
From the disease-wise performance of the VGG16 model in Table II, it can be observed that the healthy class gives the highest accuracy of 97.4% and the black rot class gives the lowest accuracy of 93.9%. The highest FPR is 3.5% for the black rot class, and the lowest FPR is 1.5% for the healthy class. Table III summarizes the disease-wise performance of the VGG19 model, and it can be seen that the healthy, growth crack, and sclerotinia rot classes give the highest accuracy of 96.1% and the black rot class gives the lowest accuracy of 93.5%. In this case, the highest FPR is 5% for the root knot class and the lowest FPR is 1% for the healthy class. Table IV outlines the disease-wise performance of the MobileNet model, in which the highest accuracy of 99.1% is found for the black rot class and the lowest accuracy of 94.4% for the growth crack class. The highest FPR is 3% for the root knot class and the lowest FPR is 0.5% for the black rot class.
It can be seen from the disease-wise performance of the Inception v3 model in Table V that the sclerotinia rot class achieves the highest accuracy of 99.5% while the root knot class gives the lowest accuracy of 98.7%. The highest FPR is 1% for the growth crack and root knot classes, and the lowest FPR is 0% for the healthy and sclerotinia rot classes, which supports the robustness of the proposed methodology.
From the comparison of these tables, it can be noted that the Inception v3 model gives the highest overall system accuracy of 97.4% and the VGG19 model gives the lowest accuracy of 86.2%. Fig. 13 shows the overall comparison of accuracy, precision, recall, and F1 score among the four models on the dataset. The accuracy achieved is consistent with the validation accuracy graphs of the respective models in Fig. 5, Fig. 6, Fig. 7, and Fig. 8.
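The disease-wise metrics discussed in this subsection can be computed from a confusion matrix as in the sketch below. It is an illustration under assumptions: rows are taken to be true labels and columns predictions, the function names are hypothetical, and the 3 × 3 matrix is made up for the example. It also shows why per-class accuracies can all be higher than the overall accuracy, as in Tables II to V: the per-class figure counts true negatives from every other class, whereas the overall figure is the fraction of images on the diagonal.

```python
import numpy as np


def per_class_metrics(cm):
    """One-vs-rest metrics for every class of a square confusion matrix.

    Assumes every class has at least one true and one predicted sample;
    otherwise some ratios are undefined (division by zero).
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = total - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "f1": 2 * precision * recall / (precision + recall),
    }


def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    cm = np.asarray(cm, dtype=float)
    return float(np.trace(cm) / cm.sum())


if __name__ == "__main__":
    # Made-up matrix for illustration; not taken from the paper's figures.
    cm = [[8, 2, 0],
          [1, 9, 0],
          [0, 1, 9]]
    metrics = per_class_metrics(cm)
    print(np.round(metrics["accuracy"], 3))
    print(round(overall_accuracy(cm), 3))
```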

C. Comparative Analysis
The study of literature demonstrates that most of the papers have applied traditional machine learning methods. Very few

VI. CONCLUSION AND FUTURE WORK
In this research, we have compared four different CNN architectures, i.e., VGG16, VGG19, MobileNet, and Inception v3, to identify and classify five major carrot diseases, namely Aster yellow, Black rot, Growth crack, Root knot, and Sclerotinia rot. Across the experimental results of all four models, the Inception v3 model performed best in detecting and classifying carrot diseases, achieving 97.4% accuracy. The methodology stated in this paper will help farmers to identify carrot disease and take timely action. We hope that it will encourage farmers and agricultural experts worldwide to adopt smart farming, which will save their time, reduce economic loss, and support sustainable agriculture.
In the future, we can extend this research by increasing the number of identified diseases with a larger dataset and by experimenting with other classification methods. We wish to work on identifying disease severity as well. As IoT and mobile technology are gaining popularity among the general public, we plan to create a web-based application as well as a multilingual mobile application implementing the proposed method, so that rural farmers can benefit from real-time solutions.