An Intelligent Approach for Detecting Palm Trees Diseases using Image Processing and Machine Learning

Abstract: Palm tree diseases cause heavy production losses yet are extremely hard to detect, either because they are hidden inside the palm itself and cannot be seen by the naked eye, or because they appear on leaves that are rarely examined owing to their height above the ground. This paper addresses three of the most common threats to palms today: Leaf Spots, Blight Spots, and Red Palm Weevil. These diseases are diagnosed by capturing normal and thermal images of palm trees and applying image processing techniques to the acquired images. Two classifiers were used: a CNN to differentiate between Leaf Spots and Blight Spots, and an SVM to detect the Red Palm Weevil pest. The CNN and SVM achieved accuracies of 97.9% and 92.8%, respectively; to the best of our knowledge, these are the best results reported in this domain. The paper also presents the first collected dataset of thermal images of healthy palms and palms infected with Red Palm Weevil.


I. INTRODUCTION
There are more than 2,500 palm tree species, which can yield over 1,000 products [1], making palms among the most important trees worldwide. Palm trees play a crucial role in the agricultural economy of most Middle Eastern countries, which are among the largest date producers in the world [2]. Table I lists the ten largest date producers in the world. However, the quality and quantity of palm trees are always at risk from different types of diseases. Two widespread diseases are Leaf Spots and Leaf Blights; they vary in shape and size and are common in palm farms, which makes investigating them essential [3].
Leaf spot symptoms appear as small, scattered, irregular brown-to-black spots, about 3-7 mm in size, on the upper and lower surfaces of the rachis and fronds [4]. Leaf blight symptoms appear as elongated brown-to-black spots that enlarge over a wide area, causing cankers on the midrib [3]. Because the symptoms of the two diseases are nearly identical and both appear on the leaves, telling them apart is very challenging. One of the most important approaches to detecting palm leaf diseases is naked-eye observation by experts, which is expensive, requires continuous monitoring, and is time-consuming, especially on large farms.
Another serious risk is a lethal pest, the Red Palm Weevil (RPW). RPW has been found in more than fifty countries and is reported to be the most harmful and destructive pest of palms [5], causing significant economic loss [6]. The yearly damage caused by RPW alone in Middle Eastern date palm plantations has been estimated at approximately US $26 million [7]. The pest develops deep inside the palm, where it is hidden and cannot be seen by the naked eye, destroying the vascular system of the palm and eventually leading to its death [8]. In 2018, a new approach was introduced that detects RPW with sensitive sensors. However, according to Dr. Kareem Shaarawy, an agricultural engineer at the Palm Research Center in Egypt, these sensors are hard to obtain because they are expensive (about 1841 USD), and they are cumbersome to use: for each palm tree, the user must hold the sensor for 1 minute while it takes vibration readings inside the palm tree. The approach also damages the palm, leaving a hole that can later attract more insects and pests.
In this paper, we provide an Android mobile application for real-time detection of the common diseases mentioned above using mobile cameras, and of Red Palm Weevil using thermal images of palm trees acquired with a thermal camera connected to a smartphone. The images are enhanced, and machine learning techniques are then applied to them in order to detect these diseases early, before the palm reaches an untreatable state and without damaging the tree.
In this work, a novel intelligent method is proposed to detect three of the most common diseases threatening palms today, namely leaf spots, blight spots, and red palm weevil (RPW). These diseases are diagnosed by capturing normal and thermal images of palm trees and applying image processing techniques to the acquired images. A CNN classifier is then used to differentiate between leaf spots and blight spots, and an SVM is used to detect the RPW pest. The CNN and SVM achieved accuracies of 97.9% and 92.8%, respectively.
To the best of our knowledge, few papers tackle this problem using machine learning and deep learning techniques, and these results are the best reported in this field so far.
The rest of this paper is organized as follows: Section II presents related work in this domain. Section III describes the CNN and SVM algorithms used in the proposed system. Section IV introduces the proposed model. Section V presents the experiments and their results. Finally, Section VI concludes the paper.

II. RELATED WORK
This section reviews the literature in the same domain. Related work is divided into two subsections: one on the detection of leaf spots and blight spots diseases, and the other on the detection of RPW.

A. Detection of Plant Leaf Diseases
In [9], the authors aimed to detect leaf blight disease in tomato leaves using a CNN together with the Learning Vector Quantization (LVQ) classification algorithm. The classes used in this experiment are bacterial spot, late blight, septoria leaf spot, and yellow curved leaf. The dataset consists of 400 training and 100 test tomato leaf images taken from the PlantVillage dataset, cropped to 512x512. To achieve better results, different color components were used instead of a single one. Twenty images per class, including a healthy class, were used to test the model; a few images were misclassified, four of them from the leaf disease classes and one from the healthy class. The model achieved an accuracy of 86%. One of the main challenges in disease detection and classification in this study is that the leaves are infected with different diseases that look very similar to one another.
In [10], real-time detection of apple leaf diseases such as brown spot using a deep learning approach is presented. Five common types of apple leaf disease are used as classes in this study: Alternaria leaf spot, Brown spot, Mosaic, Grey spot, and Rust. A new apple leaf disease detection model was created using a deep CNN. Moreover, a new deep-learning-based approach, namely INAR-SSD, was implemented in the Caffe framework on a GPU platform using a dataset of 26,377 images of diseased leaves. For image annotation, an algorithm that provides a frame selection function was used along with the knowledge of experts in the field of agriculture, so that the diseased areas of each image were selected and labeled with the corresponding classes. The detection performance reaches 78.80% mAP at a speed of 23.13 FPS.
In [11], Chimaera and Anthracnose diseases in oil palm trees were detected using image processing techniques. The symptoms of Chimaera are confined to the leaves, which show white or yellowish-white stripes and lack chlorophyll. Anthracnose is severe because it can affect oil palm trees at any growth stage. Images were acquired with a digital camera and then processed in MATLAB. In the image enhancement phase, the intensity values or colour map of the images were adjusted. Image segmentation followed, using colour-based k-means clustering after converting the image from RGB to the L*a*b* colour space. Finally, in the feature extraction phase, graycomatrix was used to create the gray-level co-occurrence matrix (GLCM) of the image, so that both texture and colour were considered in the features representing the image. Through these steps, the presence of disease on oil palm leaves was successfully identified, demonstrating the suitability of the support vector machine (SVM) classifier for such cases, with an accuracy of 97% for Chimaera and 95% for Anthracnose.

B. Detection of RPW using Thermal Imaging
Thermal imaging is known to be used for reporting the heat exchange of plant leaves, their water stress, and their transpiration rate [12]. Accordingly, it has been recommended for detecting RPW, which destroys the vascular system of palm trees, creates water stress, and affects canopy temperature [13,14].
In [15], an uncooled infrared thermal camera with a microbolometer sensor was used to capture the images. Six experiments were carried out, including Canary and date palm trees. Each experiment contained 4-5 replicates of control trees and 8-10 replicates of infested trees. ThermaCAM Researcher software was used to process the collected images. The software was provided with the local environmental conditions and the leaf emissivity in order to produce reliable leaf temperature maps. Furthermore, the Crop Water Stress Index (CWSI) was calculated and used to assess the water stress induced by RPW inside the palm trees [16]. Infected trees showed higher temperatures and higher CWSI values than control trees. Of eight known infected trees, six were correctly identified as infected (75% accuracy).
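The role of the Crop Water Stress Index can be illustrated with a short sketch. It assumes one widely used formulation, CWSI = (T_canopy - T_wet) / (T_dry - T_wet), where T_wet and T_dry are reference temperatures of a fully transpiring and a non-transpiring surface; the study in [15] may derive its baselines differently. The function name, the clamping to [0, 1], and the sample temperatures are hypothetical and are not taken from the cited work.

```python
def cwsi(t_canopy, t_wet, t_dry):
    """Crop Water Stress Index under the assumed wet/dry-reference formulation.

    The result is clamped to [0, 1] (an illustrative choice):
    0 means no water stress, 1 means the canopy is as warm as a
    non-transpiring reference surface.
    """
    if t_dry <= t_wet:
        raise ValueError("t_dry must be greater than t_wet")
    index = (t_canopy - t_wet) / (t_dry - t_wet)
    return min(1.0, max(0.0, index))


# Hypothetical temperatures in degrees Celsius (not measurements from [15]).
healthy_index = cwsi(t_canopy=29.0, t_wet=27.0, t_dry=35.0)
infested_index = cwsi(t_canopy=33.0, t_wet=27.0, t_dry=35.0)
```

Under this formulation a warmer canopy yields a larger index, which is consistent with the higher CWSI values reported for infested trees.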
In [17], an uncooled infrared thermal camera with a microbolometer was used to capture aerial images from a flight height of 760-770 m above the ground. The goal of the experiment was to distinguish palm trees from soil based on their water stress and temperature. Three date palms, irrigated with known quantities, were included in the experiment. Using image processing, the palm canopy was separated from the soil, and a watershed algorithm was applied to the images to outline the canopy [18]. Since the canopy is cooler than the soil, the Nmax pixels were determined and the canopy was detected according to a temperature threshold obtained with the Otsu method [19]. The study shows that it is possible to differentiate between the water stress of different palm trees, which can be useful for distinguishing healthy palm trees from those infected by RPW.
435 | P a g e www.ijacsa.thesai.org
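The Otsu thresholding step described for [17] can be sketched in a few lines. The following is a generic NumPy implementation of Otsu's method (it picks the threshold that maximises the between-class variance of the histogram), applied to a synthetic temperature sample. The temperatures, the bin count, and the function name are assumptions made for illustration; they do not reproduce the data or code of the cited study.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the histogram bin edge that maximises between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)                  # probability of the lower class
    w1 = 1.0 - w0                            # probability of the upper class
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = cum_mean / w0                  # mean of the lower class
        mu1 = (total_mean - cum_mean) / w1   # mean of the upper class
        between = w0 * w1 * (mu0 - mu1) ** 2
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    # Upper edge of the best bin: values below it form the lower class.
    return edges[np.argmax(between) + 1]


# Synthetic temperatures in degrees Celsius (hypothetical, for illustration).
rng = np.random.default_rng(0)
canopy = rng.normal(30.0, 0.5, size=500)     # cooler canopy pixels
soil = rng.normal(40.0, 0.5, size=1500)      # warmer soil pixels
temps = np.concatenate([canopy, soil])

threshold = otsu_threshold(temps)
canopy_mask = temps < threshold              # True where a pixel is canopy
```

Because the canopy is cooler than the soil, pixels below the threshold are labelled as canopy; on real thermal images the two temperature distributions overlap more than in this synthetic sample.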

A. Convolutional Neural Network (CNN)
A CNN is inspired by the neural networks of the human brain and consists of multiple layers. The first layer is the convolutional layer, also called the input layer, and is described by Eq. (1). It consists of filters that are optimized according to the problem being solved. Each convolutional layer has multiple kernels stacked on top of each other, forming a 3-dimensional matrix, and each kernel has its own bias, described by Eq. (2). The output of the layer, the green matrix in Fig. 1, has dimensions [h * w * d], where h is the height, w is the width, and d is the depth of the matrix.
The filter depth follows the depth of the input. For each position of the kernel on the image, each number in the kernel is multiplied by the corresponding number in the input matrix, and the products are summed to give the value at the corresponding position in the output matrix. When the depth of the input matrix is greater than 1, the same is done for each channel, and the per-channel results are added together with the bias of the respective filter to form the value at the corresponding position of the output matrix. This process is described by Eq. (3). It is repeated for all d2 kernels, which form the d2 channels of the output layer, as described by Eq. (4).
Another layer is the pooling layer, of which there are two types: max pooling and average pooling. The main purpose of a pooling layer is to reduce the size of the input tensor, and thus the number of parameters in subsequent layers, which helps to reduce overfitting [20]. It also extracts representative features from the input tensor and reduces computation, which aids efficiency. In max pooling, the layer moves across the input according to the kernel size and chooses the maximum value in each window, whereas average pooling takes the average value. Finally, a flatten layer connects the convolutional layers to the dense (output) layer. The network usually stops learning when it reaches the optimal weights [21], as outlined by Eq. (5).
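The convolution and max pooling operations described in this subsection can be made concrete with a minimal NumPy sketch. It assumes stride 1, no padding, and non-overlapping pooling windows, none of which are stated in the text; the function names and the toy input are hypothetical, and the loops favour readability over speed.

```python
import numpy as np


def conv2d(x, kernels, biases):
    """Convolutional layer with stride 1 and no padding (assumed settings).

    x:       input of shape (h, w, d1)
    kernels: shape (d2, k, k, d1), one k x k x d1 kernel per output channel
    biases:  shape (d2,), one bias per kernel
    returns: output of shape (h - k + 1, w - k + 1, d2)
    """
    h, w, _ = x.shape
    d2, k, _, _ = kernels.shape
    out = np.zeros((h - k + 1, w - k + 1, d2))
    for c in range(d2):                      # one output channel per kernel
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                patch = x[i:i + k, j:j + k, :]
                # Multiply element-wise, sum over all channels, add the bias.
                out[i, j, c] = np.sum(patch * kernels[c]) + biases[c]
    return out


def max_pool(x, size=2):
    """Max pooling over non-overlapping size x size windows."""
    h, w, d = x.shape
    h2, w2 = h // size, w // size
    x = x[:h2 * size, :w2 * size, :]
    return x.reshape(h2, size, w2, size, d).max(axis=(1, 3))


# Toy example: a 5 x 5 input with 3 channels and two 2 x 2 kernels of ones.
x = np.arange(5 * 5 * 3, dtype=float).reshape(5, 5, 3)
kernels = np.ones((2, 2, 2, 3))
biases = np.array([0.0, 1.0])

features = conv2d(x, kernels, biases)        # shape (4, 4, 2)
pooled = max_pool(features)                  # shape (2, 2, 2)
```

With two kernels the output has depth 2, matching the statement that d2 kernels produce d2 output channels, and pooling halves the height and width while leaving the depth unchanged.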

B. Support Vector Machine (SVM)
SVM is a supervised learning model that can be used for both classification and regression. Classification in an SVM is done by constructing a divider between the classes; this divider is called a hyperplane, also known as the decision boundary, and it is what the SVM uses to make decisions. SVM is popular because it supports kernelization [22] and is versatile: it can handle multiple classes and can also work for multi-linear problems. The hyperplane of Fig. 2 is obtained using Eq. (6): several candidate decision boundaries are computed, the right one is chosen, and classification can then be performed [23].

IV. THE PROPOSED MODEL
Fig. 3 shows the flow chart of the proposed model. The model starts with the image acquisition phase, in which images are acquired either with a normal mobile camera, for leaf diseases such as leaf spots and blight spots, or with a thermal camera, for detecting RPW inside the palm trunk. In the input phase, the acquired images are enhanced and features are extracted so that the model can be trained to classify the disease; in the output phase, the results are shown to the user.

A. Data Acquisition
Since we are dealing with two different types of diseases, one leaf-based and the other pest-based, with different features, separate datasets were constructed as follows:

1) Leaf Spots and Blight Spots dataset (Leaf-based):
The dataset was acquired from a well-known website (Kaggle) and contains a total of 91,360 images of leaf spots and blight spots diseases. Using the dataset as is gave unsatisfactory model performance because of the low variation among the images. Therefore, the augmented dataset was analyzed to recover the original images before augmentation, leading to the selection of 35 leaf spots images and 40 blight spots images. To these we added 50 palm images per disease that we collected using a Samsung Galaxy A50 mobile camera with a 25 MP sensor and a 26 mm-equivalent f/1.7 lens. Moreover, the original Kaggle images were large and contained much unwanted information, which degraded image quality when they were resized to our 224*224 input size. The regions of interest (infected parts) were therefore manually cropped to a size that is a multiple of 224*224 in order to preserve as much information as possible. Afterwards, the images were resized to the input size, and image augmentation techniques, namely rotation and flipping (Fig. 4, Fig. 5) and brightness adjustment (Fig. 6, Fig. 7), were applied, resulting in 5250 images for each disease. A further challenge arose after rotation: the rotated images had a black background that introduced noise and unwanted signals. The black background was therefore removed and made transparent by adding an alpha channel (Fig. 8, Fig. 9).
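The augmentation steps described for the leaf dataset (flipping, brightness adjustment, and rotation with a transparent rather than black background) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: nearest-neighbour rotation about the image centre, 8-bit RGB input, and hypothetical function names. It is not the code used to build the dataset, and the angles and brightness factors actually used are not given in the text.

```python
import numpy as np


def flip_horizontal(img):
    """Mirror an H x W x C image left to right."""
    return img[:, ::-1, :].copy()


def adjust_brightness(img, factor):
    """Scale pixel intensities by `factor`, clipping to the uint8 range."""
    scaled = img.astype(np.float64) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def rotate_rgba(img, angle_deg):
    """Rotate an H x W x 3 uint8 image about its centre (nearest neighbour).

    Returns an H x W x 4 RGBA image. Output pixels that fall outside the
    source image get alpha 0, so the corners are transparent instead of black.
    """
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - cx, ys - cy
    # Inverse mapping: locate the source pixel for every output pixel.
    src_x = np.rint(np.cos(theta) * dx + np.sin(theta) * dy + cx).astype(int)
    src_y = np.rint(-np.sin(theta) * dx + np.cos(theta) * dy + cy).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros((h, w, 4), dtype=np.uint8)
    out[inside, :3] = img[src_y[inside], src_x[inside]]
    out[inside, 3] = 255                     # opaque where source data exists
    return out


# Toy 8 x 8 image whose left half is bright (a stand-in for a leaf crop).
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, :4] = 200

augmented = [
    flip_horizontal(img),
    adjust_brightness(img, 0.5),
    adjust_brightness(img, 1.5),
    rotate_rgba(img, 45),
]
```

Keeping the empty corners in a separate alpha channel lets later stages tell "no data" apart from genuinely dark pixels, which is the effect the transparent background is meant to achieve.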

2) Red Palm Weevil dataset (Pest-based):
A thermal image dataset was built and divided into two classes: healthy palms and palms infected with RPW. The healthy class contains 16 images and the infected class contains 24 images, acquired at different times (12 pm and 10 pm) from palm trees at the Palm Research Center in Giza Governorate, Egypt, with a Testo 890-2 thermal camera. Because of the COVID-19 pandemic, it was hard to gather more images; augmentation techniques were therefore used to increase the number of images to 1200 for each class.
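The scale of this augmentation follows directly from the figures above. The short calculation below assumes that augmented variants are spread evenly over the original images, which the text does not state; the helper name is hypothetical.

```python
import math


def variants_per_image(n_originals, n_target):
    """Images needed per original (the original included) to reach n_target."""
    return math.ceil(n_target / n_originals)


healthy_factor = variants_per_image(16, 1200)    # healthy class
infected_factor = variants_per_image(24, 1200)   # infected class
```

Under that assumption each healthy image accounts for 75 of the 1200 images in its class and each infected image for 50, so the augmented set contains many close variants of a small number of originals.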

B. Feature Extraction
This part is mainly dedicated to the SVM model, which predicts from thermal images whether a palm is infected by Red Palm Weevil. Since infrared thermography provides a digitized thermal distribution called a thermograph, the analysis of thermal images becomes more useful once the signals and information stored in them are extracted. Thermal images contain two types of features: textural and statistical. Texture features measure the relationship among the pixels in a local area, reflecting the changes in image gray levels [24]. For this reason, the images were converted to grey scale so that the properties of their Gray-Level Co-Occurrence Matrix (GLCM) could be used in feature extraction. To mimic the feature extraction of deep learning models such as CNNs, back propagation and a variety of filters were used to explore the possible combinations of features, in order to get the most out of each image and the best out of the model.
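The textural and statistical features mentioned above can be sketched with a small hand-written GLCM. The code below is an illustration only: it assumes 8 grey levels, a single horizontal neighbour offset, and three common GLCM properties (contrast, homogeneity, energy) plus the mean and standard deviation. The exact properties, offsets, and quantisation used in the paper are not stated, and the function names are hypothetical.

```python
import numpy as np


def glcm(gray, levels=8, dx=1, dy=0):
    """Normalised, symmetric GLCM for one non-negative pixel offset (dx, dy)."""
    # Quantise 0..255 intensities into `levels` grey levels.
    q = (gray.astype(np.float64) * levels / 256.0).astype(int)
    q = np.clip(q, 0, levels - 1)
    h, w = q.shape
    a = q[0:h - dy, 0:w - dx]                # reference pixels
    b = q[dy:h, dx:w]                        # their neighbours at the offset
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)
    m = m + m.T                              # count each pair in both orders
    return m / m.sum()


def texture_features(gray, levels=8):
    """Textural (GLCM) and statistical features of a grey-scale image."""
    p = glcm(gray, levels)
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "energy": float(np.sqrt(np.sum(p ** 2))),
        "mean": float(gray.mean()),
        "std": float(gray.std()),
    }


# Two toy images: a uniform patch and a checkerboard of extreme values.
flat = np.full((6, 6), 128, dtype=np.uint8)
board = (np.indices((6, 6)).sum(axis=0) % 2 * 255).astype(np.uint8)

flat_features = texture_features(flat)
board_features = texture_features(board)
```

The uniform patch gives zero contrast and maximal homogeneity, while the checkerboard gives the largest contrast possible for 8 levels, which illustrates how these properties reflect changes in grey level between neighbouring pixels.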

C. Classifications
CNN and SVM classifiers were selected for detecting the common diseases and RPW-infected palms, respectively, as they are considered among the top classifiers for such work [25], [26], [27]. A CNN was preferred for the common diseases because of the size of our dataset, described in Section IV-A. Similarly, an SVM was used to identify RPW infection because that dataset was too small [28]. The proposed CNN model is built on the pre-structured VGG16 network, which is known for its well-tuned layers and hyperparameters and ensures efficient feature extraction and learning. The VGG16 setup was customized for our common diseases classification case by adding two callbacks to the model: early stopping and model checkpoint. The early stopping callback stops training when the model no longer improves, and the model checkpoint callback saves the best model found during training. VGG16 gave better results than our planned CNN model because its structure was more suitable, as shown in Table II. In addition, some features were added, namely the mean of the image, the contrast between image pixels, and the standard deviation within a matrix; these are computationally inexpensive, extract features efficiently, and help predict outcomes with the highest accuracy that the variation and size of the given dataset allow.
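The behaviour of the two callbacks can be illustrated without any deep learning framework. The sketch below is framework-agnostic Python, not the Keras API: it replays a list of per-epoch validation losses, records the best epoch (the role of the model checkpoint), and stops once the loss has failed to improve for `patience` consecutive epochs (the role of early stopping). The patience value, the monitored quantity, and the loss values are assumptions; the paper does not report them.

```python
def train_with_callbacks(val_losses, patience=3):
    """Replay validation losses and apply checkpoint and early-stopping logic.

    Returns (best_epoch, best_loss, stopped_at), where stopped_at is the
    epoch at which training halted, or None if it ran to completion.
    """
    best_loss = float("inf")
    best_epoch = None
    waited = 0
    stopped_at = None
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Checkpoint: remember the best model seen so far.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                # Early stopping: no improvement for `patience` epochs.
                stopped_at = epoch
                break
    return best_epoch, best_loss, stopped_at


# Hypothetical validation losses for seven epochs.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
result_short = train_with_callbacks(losses, patience=3)
result_long = train_with_callbacks(losses, patience=5)
```

With a patience of 3 the run stops at epoch 5 and keeps the weights from epoch 2, whereas a patience of 5 lets training reach the better loss at epoch 6; the choice of patience therefore trades training time against the chance of a late improvement.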

V. RESULTS AND DISCUSSIONS
The Kaggle dataset was first used as is (more than 90k images), split into 70% for training and 30% for testing; the first CNN model achieved an accuracy of 65%. Its results for blight spot disease showed overfitting, which reduced the predictive power of the model, and its results for leaf spots disease were also disappointing (Fig. 10 and Table III). The results improved in the second CNN model after applying the image enhancements and providing a carefully chosen group of images for both diseases. The second model achieved an accuracy of 97.9% and a prediction hit ratio of 9 out of 10 on about 10.5k images (80% training and 20% testing), solving the overfitting problem for blight spot disease and significantly improving leaf spots detection (Fig. 11 and Table IV). Our SVM model, on the other hand, is built upon the scikit-learn library; the LinearSVC algorithm was preferred as the best fit for our case because of its efficient classification [27]. It achieved an accuracy of 93% and a prediction hit ratio of 18 out of 18 healthy images and 4 out of 5 infected images.

a) Experiment (1):
Objective: to demonstrate the CNN model's ability to detect leaf spots and blight spots diseases under different conditions. A normal mobile camera was used to collect images during the daytime with no filter added, and at night with the camera's flash. In addition, the images were collected at different distances from the palm.
Setup: Images were captured with two different mobile cameras: a Samsung Galaxy S9 for leaf spots and a Huawei nova 3e for blight spots. The experiment includes two palms, one infected with leaf spots disease (Fig. 12) and the other with blight spots disease (Fig. 13). A total of 8 images were acquired, four per palm: two in the daytime (1 pm-2 pm) with no filter added and two at night (9 pm-10 pm) using the camera's flash. The two images in each time period differ in distance from the palm, one being closer than the other.
The CNN model identified all 8 images correctly. This experiment shows that the model can detect leaf spots and blight spots diseases not only in the daytime but also at night using the camera's flash, and at different distances from the palm.

b) Experiment (2):
Objective: RPW changes the temperature of the palm tree, which is why thermal cameras were used in the first place. Since solar radiation is one factor affecting a palm's temperature, examining palms at different times of day could mislead the SVM model, which makes it a demanding test of the proposed model. Thus, the first part of this experiment tests only healthy palms at different times of day, in anticipation of any absorption of the sun's heat energy by the palm. The SVM model's detection of palms infected with RPW is evaluated in the second part of the experiment.
Setup: For the first part of the experiment, images of healthy palms were captured using a FLIR ONE Pro USB portable camera attached to a mobile phone through the FLIR ONE application. In Fig. 14, three images of one palm tree were acquired at different times (1 pm, 5 pm, and 9 pm). Table VII presents the SVM model results for the healthy palm. For the second part, images of infected palms were captured using the same camera, this time attached to a tablet through the Vernier application. In Fig. 15, three images of three different palms were acquired between 12 pm and 1 pm.
The SVM model correctly identified the healthy palms at the different times of day, despite the heat energy that may be retained by the palm from the sun's radiation and affect its temperature. The model identified all of the healthy images as well as the palms infected with RPW. This experiment shows that thermal imaging is a reliable method for detecting RPW inside palm trees.
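The SVM evaluation discussed in this section can be sketched with scikit-learn's LinearSVC, the classifier named above. The example is a minimal illustration under stated assumptions: the feature vectors are synthetic stand-ins (not the thermal image features of the paper), the 80/20 split is borrowed from the CNN experiment because the SVM split is not reported, and the scaling step and hyperparameters are illustrative choices.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic 3-dimensional feature vectors (e.g. mean, std, contrast).
# These are NOT the paper's data; the class means are hypothetical.
healthy = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
infected = rng.normal(loc=4.0, scale=1.0, size=(200, 3))
X = np.vstack([healthy, infected])
y = np.array([0] * 200 + [1] * 200)          # 0 = healthy, 1 = infected

# Stratified 80/20 split (assumed; the SVM split is not stated in the text).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardise the features, then fit a linear SVM.
model = make_pipeline(StandardScaler(), LinearSVC(C=1.0, dual=False))
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))

# The learned hyperplane w . x + b = 0 in the standardised feature space.
svm = model.named_steps["linearsvc"]
w, b = svm.coef_[0], svm.intercept_[0]
```

The coefficients `w` and intercept `b` define the separating hyperplane, so a new sample is assigned to the infected class when its standardised features give a positive value of w . x + b.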

VI. CONCLUSION
In this work, image processing and machine learning techniques were applied to develop an application that can detect common palm tree diseases, namely leaf spots and blight spots, as well as the lethal red palm weevil pest. For leaf spots and blight spots, a dataset of 5250 images per disease was used. A VGG convolutional neural network was applied for classification, achieving a success rate of 97.9%. For the red palm weevil, thermal images were used: the dataset comprised 1200 thermal images of healthy palms and another 1200 thermal images of palms infected with RPW. The SVM model was built upon the scikit-learn library using the LinearSVC algorithm, which achieved a success rate of 92.8%. Because RPW is attracted to wounds in palms, and thermal imaging has no side effects on the palm's health, thermal imaging is considered one of the best detection methods used so far.

VII. FUTURE WORK
Thermal imaging can be used continuously to monitor large numbers of palms regularly, which makes the method cost-effective. Aerial thermal imaging can also give satisfactory accuracy and reliability before visual symptoms of RPW are observed on the palm canopy. For the time being, we could not obtain a drone or use satellite imagery for security reasons in our country, but we aim to use one of them for aerial imaging in the future. Another promising method for detecting RPW is hyperspectral imaging. Although it is complex and expensive, some studies have shown that it detects RPW with very high accuracy [29], [30], [31], and it deserves consideration. In addition, swarm intelligence algorithms can be used to enhance and optimize both the SVM and the CNN.
Finally, we aim to collect more images for our dataset, because only a limited number of images are available in this research domain.