Optimal Training Ensemble of Classifiers for Classification of Rice Leaf Disease

Abstract: Rice is one of the most extensively cultivated crops in India. Leaf diseases can significantly reduce the productivity and quality of a rice crop. Because these diseases directly affect the economy and food security, detecting them is essential. The most prevalent diseases affecting rice leaves are leaf blast, brown spot, and hispa. To address this issue, this research builds a new classification model for rice leaf diseases. The model begins with a pre-processing step that applies a Median Filter (MF). Improved BIRCH is then used for image segmentation. Features such as Local Binary Pattern (LBP), Gray Level Co-occurrence Matrix (GLCM), color, shape, and modified Median Binary Pattern (MBP) are extracted from the segmented images. Then, an ensemble of three classifiers, Bi-GRU, Convolutional Neural Network (CNN), and Deep Maxout Network (DMN), is applied. The proposed Opposition Learning Integrated Hybrid Feedback Artificial and Butterfly algorithm (OLIHFA-BA) trains the ensemble by adjusting the model weights to improve classification performance.

Not only are rice leaf diseases prevalent in India, but they are also prevalent in other countries. There are several types of leaf diseases, such as brown spot, tungro, bacterial blight, blast, etc. Farmers have no control over these infections. Consequently, visual examination or laboratory tests are used to detect illnesses on leaves [16] [17] [18]. Visual analysis of this issue is time-consuming for a specialist. In addition, when chemical reagents are necessary, the experimental procedure becomes more difficult.
Certain strategies are employed to simplify these matters. Deep Learning (DL) algorithms are applicable to agricultural problems such as root segmentation, fruit counting, seed selection, and disease classification [19] [20] [21]. DL algorithms are sophisticated versions of machine learning (ML) that can detect crop infections. With this approach, features are learned automatically from the inputs, and the output is generated based on the learned decision criteria. CNNs are used to process the visual images. In addition, the "Rice Doctor" and "Rice Xpert" smartphone applications for farmers were introduced using internet and mobile technology. The "Rice Doctor" app serves as a questionnaire for farmers [22] [23] [24] [25].
This study considers the most prevalent rice leaf diseases, including brown spot, leaf blast, and bacterial blight. The CNN model was tuned to improve its accuracy, and the tuned model is highly accurate. However, it was applied only to these disorders. Advanced DL algorithms [26] [27] [28] [29] are therefore needed to identify different forms of rice leaf disease and to raise the accuracy further.
The contributions are detailed as follows:
• Proposes a new classification model for rice leaf diseases with enhanced BIRCH-based segmentation.
• Utilizes an ensemble model based on OLIHFA-BA with a defined feature set consisting of enhanced MBP features, Local Binary Pattern (LBP), Gray Level Co-occurrence Matrix (GLCM), color, and shape features.
The structure of the paper is as follows: Section II reviews related works. Section III describes the adopted phases of the proposed classification approach, whereas Section IV describes the extracted features. Section V presents the optimized ensemble classifiers, while Section VI presents the OLIHFA-BA optimization algorithm. Results and conclusions are provided in Sections VII and VIII.

A. Related Works
In 2021, Krishnamoorthy N et al. [1] conducted a study on rice leaf disease classification. About fifty percent of the world's population consumes rice, making it a principal source of dietary energy. Rice plant diseases, which are caused by viruses, bacteria, soil fertility, pests, temperature fluctuations, and so on, are the most significant obstacles in rice cultivation. Finding and treating rice plant diseases is a challenging undertaking for farmers. In this investigation, disease identification was performed using a DL approach. CNNs are used in DL for object segmentation, image classification, and image analysis.
Kumar Sethy et al. [2] studied the detection of diseases on rice leaves in 2020. This study addresses four diseases that affect rice leaves: bacterial blight, blast, tungro, and brown spot. DL technology is used to determine whether a leaf is diseased. Transfer learning (TL) with CNN models and deep features with SVM were used for classification. The two approaches gave similar results, but deep features with SVM performed considerably better.
Chen et al. [3] conducted rice plant disease detection research in 2021. DL-based CNN approaches have resolved most of the technical challenges associated with image recognition and classification. In this work, MobileNet-V2 was employed, and several techniques were used to weigh the importance of spatial locations and inter-channel relationships in the input features. Transfer learning was performed twice, together with an enhanced loss function. On the public dataset [48], the method reached an accuracy of 99.67%. In difficult field conditions, the accuracy of rice plant disease diagnosis was 98.48%. Therefore, the suggested method was effective at identifying rice plant diseases.
Madhavi and M.A. Saleem [4] studied the identification of rice leaf diseases in 2021. To preserve the growth of the agricultural sector, the first step of plant disease identification was taken. The comparison between automatic and manual plant monitoring is required and beneficial. CNN is frequently used for this kind of categorization. It does a good job of classifying and diagnosing plant diseases by utilizing highly accurate data collected from a range of sources.
Jiang et al. [5] studied the leaf diseases of wheat and rice plants in 2021. To reduce crop losses, these diseases must be diagnosed quickly and precisely. In this study, an ImageNet pre-trained model, alternating learning, and a VGG16 implementation were used to support multitask learning, TL, and recognition. The model's accuracy was 98% for rice plants and 99% for wheat plants. These results demonstrate that VGG16 combined with multitask TL recognizes plant diseases accurately.
Jiang et al. [6] conducted a study on image-based disease diagnosis in rice leaf images using Support Vector Machine (SVM) and DL in 2020. Combining these two approaches addressed the problem effectively while also improving precision. The authors used a CNN to extract features from the leaf images. In the final round of the evaluation, classic Back Propagation Neural Network (BPNN) models were compared to the more accurate SVM models.
Zhang et al. [7] conducted a study in 2020 that used spectral imaging technology to identify leaf diseases in rice crops. This method was used to determine the severity of rice leaf blast. In this work, hyperspectral imaging was used to distinguish between images of afflicted and healthy leaves. The data was then reconstructed using the SRR method. The model's precision was approximately 98%.
In 2021, Bakade et al. [8] conducted a study on preventing bacterial disease in rice leaves. Xanthomonas oryzae pv. oryzae (Xoo) is the primary cause of this problem. This investigation revealed the interconnected actions of genes and plant immune pathways, which could be leveraged to develop resistant rice cultivars.

B. Review
Formerly, the only means of diagnosing a disease was manual analysis of the leaf, either by examining plant leaves or by consulting a book to identify the disease [5]. This method has three major drawbacks: it is imprecise, it cannot cover every leaf, and it is time-consuming. With the advancement of science and technology, several approaches for identifying these diseases effectively have been created, chiefly image processing and deep learning. Image processing includes a range of techniques, such as filtering, clustering, and histogram analysis, to locate damaged areas and diagnose diseases. In contrast, deep learning neural networks (DLNN) are used to identify diseases. There are two principal causes of plant diseases. The first is a bacterial or fungal attack, and the second is an unanticipated shift in the weather [6].
When addressing rice infections, a few critical elements must be considered. Collecting samples from a damaged rice plant is one of the most critical tasks. To do this, multimedia sensors may be deployed across the farm, which permits routine monitoring of rice plants. Additionally, the effects of climate change on rice plants can be monitored and studied. This approach has several disadvantages, including the need for frequent system maintenance and low precision due to shadows in the gathered photos. It is crucial to identify rice infections accurately in order to avert the disease's devastating effects on crop productivity. However, the present methods for diagnosing rice diseases are neither accurate nor efficient, and they require supplementary equipment.

III. ADOPTED PHASES IN PROPOSED CLASSIFICATION APPROACH
The adopted phases of the proposed rice leaf disease classification are as follows:
• In the first phase, the input image is passed through the MF for pre-processing. Improved BIRCH is then applied to segment the image.
• The LBP, GLCM, colour, shape, and improved MBP-based feature set is extracted from the segmented images. Ensemble-based classification is then performed with three classifiers, Bi-GRU, CNN, and DMN, as shown in Fig. 1.
• The final classification result is obtained by averaging the outputs of the three classifiers.
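The averaging step in the last phase can be sketched as follows. This is only an illustration under stated assumptions: each classifier is assumed to return a class-probability array of shape (n_samples, n_classes), and the function name `ensemble_average` and the example numbers are hypothetical, not taken from the paper.

```python
import numpy as np

def ensemble_average(prob_list):
    """Average the class probabilities of several classifiers.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    Returns the averaged probabilities and the predicted class index.
    """
    probs = np.stack(prob_list, axis=0)   # (n_models, n_samples, n_classes)
    avg = probs.mean(axis=0)              # plain, unweighted average
    return avg, avg.argmax(axis=1)

# Hypothetical outputs of Bi-GRU, CNN, and DMN for one image and three classes
bigru = np.array([[0.7, 0.2, 0.1]])
cnn = np.array([[0.5, 0.4, 0.1]])
dmn = np.array([[0.6, 0.1, 0.3]])
avg, label = ensemble_average([bigru, cnn, dmn])
```

The predicted label is simply the class with the highest averaged probability.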

A. Pre-processing
In this research, median filtering is utilised to pre-process the image input.

MF [30]:
The median filter is a common non-linear digital filtering technique used to remove noise from an image or signal. This type of noise reduction is a typical pre-processing step that improves the results of later processing (for example, edge detection in an image). MF is widely used in digital image processing because, in certain cases, it preserves edges while reducing noise.
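A minimal sketch of median filtering on a 2-D grayscale image is given below. The 3×3 window and the edge-replicating border are assumptions made for illustration; the paper does not state the window size or border handling it uses.

```python
import numpy as np

def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k neighbourhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")   # replicate border pixels
    # windows has shape (H, W, k, k): one k x k patch per output pixel
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return np.median(windows, axis=(-2, -1)).astype(img.dtype)

# A flat image with a single impulse-noise pixel in the centre
noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 255
clean = median_filter(noisy)
```

Because a single outlier never becomes the median of nine values, the impulse is removed while the flat region is left unchanged.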
The MF-based pre-processed image is then passed to improved BIRCH for image segmentation.

B. Modified BIRCH Model
Clustering features [31] store the information necessary for data grouping and provide a concise description of a set of points in feature space. Consider a set X of points in a multi-dimensional feature space [31].
Conventionally, the size, Di, of a set is described as the average distance between two points, as shown in Eq. (1).
As per improved BIRCH, Di is modelled as shown in Eq. (2).
The representation of the clustering feature, cf, of set X is specified in Eq. (3).
Here, M denotes the number of points in X. As per enhanced BIRCH, cs is computed as shown in Eq. (5), in which corr denotes the correlation, modelled as in Eq. (6).
The improved BIRCH image is denoted as IBIRCH ig .
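For orientation, the conventional BIRCH clustering feature can be sketched as below. This follows the standard formulation (point count, linear sum, squared sum), with the diameter taken as the root of the average squared pairwise distance; it does not reproduce the improved distance or the correlation term of Eqs. (2), (5), and (6), and the function names are illustrative.

```python
import numpy as np

def clustering_feature(points):
    """Conventional BIRCH CF: (count, linear sum, sum of squared norms)."""
    pts = np.asarray(points, dtype=float)
    return len(pts), pts.sum(axis=0), float((pts ** 2).sum())

def merge(cf_a, cf_b):
    """CFs are additive, so two clusters merge by component-wise addition."""
    return cf_a[0] + cf_b[0], cf_a[1] + cf_b[1], cf_a[2] + cf_b[2]

def diameter(cf):
    """Root of the average squared pairwise distance, computed from the CF alone."""
    n, ls, ss = cf
    if n < 2:
        return 0.0
    # sum over ordered pairs of ||xi - xj||^2 equals 2*n*ss - 2*||ls||^2
    return float(np.sqrt(max(2 * n * ss - 2 * (ls @ ls), 0.0) / (n * (n - 1))))

cf = clustering_feature([[0.0, 0.0], [2.0, 0.0]])
merged = merge(clustering_feature([[0.0, 0.0]]), clustering_feature([[2.0, 0.0]]))
```

The point of the CF representation is that cluster size and merges can be evaluated without revisiting the individual points.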
IV. EXTRACTING LBP, GLCM, COLOR, SHAPE AND IMPROVED MBP FEATURES
From IBIRCH ig, the feature set including LBP, GLCM, color, shape, and improved MBP features is extracted.

A. Shape Features
Shape is a primary source of information for object identification [32]. Without shape, a visual object cannot be recognised effectively, and the description of an image is incomplete. Although two objects rarely have exactly the same shape, comparable shapes can be identified with a variety of techniques. Triangle, circle, rectangle, square, oval, and diamond are some of the available shapes. The extracted shape features are denoted by Sh fs.

B. Colour Features
A colour space characterizes colour in terms of intensity values [32]. By employing a colour space, we can define, visualise, and produce colour. The colour histogram describes the frequency distribution of colours in an image by counting the pixels that share similar colour values, so every colour frequency in the image is captured. The colour histogram not only covers specific areas of an image but also copes with translation, rotation, and changes in viewing angle. The local colour histogram is simple to calculate and robust to small image variations, making it important for the retrieval and indexing of image databases.
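A colour histogram feature can be sketched as below. The choice of 8 bins per channel, the RGB layout, and the normalisation are assumptions for illustration; the paper does not specify them.

```python
import numpy as np

def colour_histogram(img, bins=8):
    """Concatenate normalised per-channel histograms of an H x W x 3 uint8 image."""
    feats = []
    for ch in range(img.shape[2]):
        hist, _ = np.histogram(img[..., ch], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())   # frequencies of each channel sum to 1
    return np.concatenate(feats)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
features = colour_histogram(image)
# A circular shift moves pixels around but leaves the colour frequencies unchanged
shifted = colour_histogram(np.roll(image, 3, axis=1))
```

The shifted image yields the same feature vector, which illustrates the tolerance to translation mentioned above.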

C. GLCM Features
GLCM is employed to evaluate the spatial relationship among pixels [33]. The GLCM parameters are given in Table I.
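The following sketch shows how a co-occurrence matrix and a few common texture statistics can be computed. The statistics chosen here (contrast, angular second moment, homogeneity) are widely used examples and are not necessarily the ones listed in Table I; the single horizontal offset is also an assumption.

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for a non-negative offset (dy, dx)."""
    h, w = img.shape
    src = img[0:h - dy, 0:w - dx]         # reference pixels
    dst = img[dy:h, dx:w]                 # neighbours at the given offset
    m = np.zeros((levels, levels), dtype=float)
    np.add.at(m, (src.ravel(), dst.ravel()), 1.0)   # count each grey-level pair
    return m / m.sum()

def glcm_stats(p):
    """Contrast, angular second moment, and homogeneity of a normalised GLCM."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float((p * (i - j) ** 2).sum()),
        "asm": float((p ** 2).sum()),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
    }

tiny = np.array([[0, 0, 1],
                 [0, 0, 1]])
p = glcm(tiny, levels=2)
stats = glcm_stats(p)
```

In practice the image is first quantised to a small number of grey levels so that the matrix stays compact.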

D. Modified MBP Features
The MBP [34] derives an LBP-like code by thresholding the pixels of a neighbourhood against their median value. Because the centre pixel is also evaluated in this thresholding, a 3×3 neighbourhood yields $2^9$ potential patterns. MBP is conventionally represented using Eq. (7).
As per modified MBP, it is modelled as in Eq. (8). Here, we denotes the weight function, which is evaluated by means of a cubic chaotic map.
is modeled based upon median as in Eq. (9).
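The conventional MBP code of a single 3×3 patch can be sketched as below. This covers only the standard median thresholding; the weight function based on the cubic chaotic map in the modified version is not reproduced, and the bit ordering is an arbitrary choice made for the example.

```python
import numpy as np

def mbp_code(patch):
    """Conventional MBP: threshold all 9 pixels of a 3x3 patch at their median."""
    flat = np.asarray(patch).ravel()
    bits = (flat >= np.median(flat)).astype(int)    # 1 where pixel >= median
    return int((bits << np.arange(9)).sum())        # pack the 9 bits into one code

ramp = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
code_ramp = mbp_code(ramp)            # median is 5, so the last five bits are set
code_flat = mbp_code(np.full((3, 3), 5))
```

Since nine bits are produced, the codes range from 0 to 511, which matches the $2^9$ possible patterns.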

E. LBP Features
In a variety of comparative analyses, LBP patterns [35] have shown high discriminative power and simplicity. The basic LBP derives the differences between a reference pixel and its neighbours within a given radius. The resultant LBP for a pixel is given by Eq. (12), which is defined in terms of the geometric mean of nearby pixels and the grey values of the centre pixel and its surrounding pixels within the radius.
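As a point of reference, the basic LBP code of a 3×3 patch can be sketched as follows. The sketch thresholds the eight neighbours against the centre pixel, which is the textbook form; the variant of Eq. (12) involving a geometric mean is not reproduced, and the clockwise bit ordering is an assumption.

```python
import numpy as np

def lbp_code(patch):
    """Basic LBP: compare the 8 neighbours of a 3x3 patch with the centre pixel."""
    p = np.asarray(patch)
    centre = p[1, 1]
    # Neighbours read clockwise, starting at the top-left corner
    neigh = np.array([p[0, 0], p[0, 1], p[0, 2], p[1, 2],
                      p[2, 2], p[2, 1], p[2, 0], p[1, 0]])
    bits = (neigh >= centre).astype(int)
    return int((bits << np.arange(8)).sum())

ramp = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
code_ramp = lbp_code(ramp)
code_flat = lbp_code(np.full((3, 3), 7))
```

A histogram of these codes over the segmented image is what is typically used as the LBP feature vector.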

V. OPTIMIZED BI-GRU, CNN AND DMN MODELS
The derived feature set fs is then given as input to three classifiers: Bi-GRU, CNN, and DMN.

A. Bi-GRU
Bi-GRU [36] is a type of Recurrent Neural Network (RNN) that processes data from both subsequent and previous time steps in order to make output predictions based on the present state. Eqs. (13) to (16) give the Bi-GRU computation in terms of the sigmoid function, the hidden vector, and the input vector. The remaining terms are the reset gate, the weights, the time step, and the cell state at the previous time step.
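A bidirectional GRU forward pass can be sketched in NumPy as below. It uses the standard GRU gate equations, which may differ in notation or detail from Eqs. (13) to (16); the parameter layout (three stacked matrices per direction) and all names are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    """One GRU step. W: (3, hidden, input), U: (3, hidden, hidden), b: (3, hidden)."""
    z = sigmoid(W[0] @ x + U[0] @ h_prev + b[0])              # update gate
    r = sigmoid(W[1] @ x + U[1] @ h_prev + b[1])              # reset gate
    h_cand = np.tanh(W[2] @ x + U[2] @ (r * h_prev) + b[2])   # candidate state
    return (1.0 - z) * h_prev + z * h_cand

def bi_gru(seq, params_fwd, params_bwd, hidden):
    """Run one GRU forwards and one backwards, then concatenate their states."""
    def run(xs, params):
        h, out = np.zeros(hidden), []
        for x in xs:
            h = gru_step(x, h, *params)
            out.append(h)
        return out
    fwd = run(seq, params_fwd)
    bwd = run(seq[::-1], params_bwd)[::-1]   # re-align backward states with time
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
n_in, n_hid, n_steps = 4, 3, 5
make = lambda: (rng.normal(size=(3, n_hid, n_in)),
                rng.normal(size=(3, n_hid, n_hid)),
                np.zeros((3, n_hid)))
states = bi_gru(rng.normal(size=(n_steps, n_in)), make(), make(), n_hid)
```

Each output row holds the forward and backward hidden states for one time step, so it reflects both earlier and later inputs.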
The Butterfly Optimization Algorithm (BOA) is inspired by the mating and feeding behaviour of butterflies. Its distinctive characteristic is the fragrance that each butterfly emits, represented by Eq. (24), which depends on a power exponent (between 0 and 1), the sensory modality index, and the stimulus intensity.
Butterflies can accurately locate scent sources and share this knowledge with one another. OLIHFA-BA has three stages, as follows:
1) Initialization: The constraints and objectives are initialized. Additionally, a chaotic opposition-based learning (OBL) step is applied according to the model in Eq. (25), in which the chaotic map function is a tent map.
2) Iteration: Both local and global searches are performed.
The conventional formulation of the butterfly global search movement is given in Eq. (26), where $t$ denotes the iteration, $y_i^t$ denotes the position of the $i$-th butterfly at iteration $t$, $q_i$ denotes the fragrance of the $i$-th butterfly, and $G_{best}$ denotes the best global position.
In the OLIHFA-BA model, this movement is formulated based on the branch update of the Feedback Artificial Tree (FAT) as in Eq. (27), in which $d$ denotes a small constant ($d = 0.382$) and $rand(0, 1)$ denotes a random number. Moreover, the FAT-based crossover operation is used to generate new branches.
Various natural events can hinder butterfly movement and fragrance dispersal. Local search is simulated by updating the butterfly positions according to Eq. (28).
3) Termination: OLIHFA-BA terminates once the maximum number of iterations is reached. The OLIHFA-BA model is shown in Algorithm 1.
Fig. 3 shows a sample image.
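The building blocks named in the three stages can be sketched as below. The sketch covers only the conventional pieces: a tent map, plain opposition-based learning, the standard BOA fragrance, and the standard BOA global and local moves. The FAT-based update of Eq. (27), the crossover, and the exact way the tent map enters Eq. (25) are not reproduced, and the constants `c`, `a`, and `p` are commonly used defaults assumed here.

```python
import numpy as np

def tent_map(x):
    """One step of the tent map on [0, 1]."""
    return 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)

def opposite(pop, lb, ub):
    """Plain opposition-based learning: mirror each position inside [lb, ub]."""
    return lb + ub - pop

def fragrance(intensity, c=0.01, a=0.1):
    """Conventional BOA fragrance: sensory modality c times intensity to the power a."""
    return c * np.power(intensity, a)

def boa_step(pop, q, g_best, rng, p=0.8):
    """One conventional BOA iteration: global move with probability p, else local."""
    n = pop.shape[0]
    new = pop.copy()
    for i in range(n):
        r = rng.random()
        if rng.random() < p:   # move towards the best butterfly
            new[i] = pop[i] + (r ** 2 * g_best - pop[i]) * q[i]
        else:                  # random walk between two other butterflies
            j, k = rng.choice(n, size=2, replace=False)
            new[i] = pop[i] + (r ** 2 * pop[j] - pop[k]) * q[i]
    return new

rng = np.random.default_rng(0)
pop = rng.uniform(-5.0, 5.0, size=(6, 2))
cost = (pop ** 2).sum(axis=1)   # hypothetical objective: sphere function
q = fragrance(cost)
g_best = pop[np.argmin(cost)]
moved = boa_step(pop, q, g_best, rng)
mirrored = opposite(pop, -5.0, 5.0)
```

In a full optimizer these steps would be repeated until the iteration limit, keeping the better of each candidate and its opposite during initialization.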

B. Performance Analysis
The performance of the proposed EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA over existing metaheuristic models is calculated and displayed in Figs. 4 to 6 for various measures. The EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA model is compared to the EC + DHO, EC + SMO, EC + CMBO, EC + BMO, and EC + MFO models for several LPs. The comparison also includes existing classifiers such as [45], TL-DCNN [46], and LMBWO [47]. The proposed EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA model has produced superior results compared to the other optimization and classification approaches. A better prediction model should yield lower values for negative measures and higher values for positive measures. As seen in Fig. 4, the outputs of the EC model increase for all positive metrics, whilst the outputs for negative metrics decrease. For both the existing and the proposed schemes, the 50th LP yielded the best results. The proposed system achieved its highest accuracy (0.96) at the 50th LP, whereas the other schemes achieved lower values. This improvement in classifier analysis and optimization analysis by EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA is mostly due to the enhancements in features, segmentation, and classifiers.
Fig. 7 shows the cost analysis. EC (CNN, Bi-GRU, and DMN) is evaluated against EC + DHO, EC + SMO, EC + CMBO, EC + BMO, and EC + MFO. As Fig. 7 shows, the cost of EC (Bi-GRU, CNN, and DMN) increases only slightly. From iterations 0 through 5, the CMBO model reaches a high cost of 1.095. In addition, between iterations 0 and 5, the cost of EC (Bi-GRU, CNN, and DMN) is approximately 1.084%. Thus, EC (Bi-GRU, CNN, and DMN) with improved BIRCH and the expanded feature set attains a reduced cost.

D. Conventional VS. Proposed Analysis
Table III compares the adopted EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA scheme to EC without OLIHFA-BA, EC with conventional BIRCH, and EC with conventional MBP. As the table shows, the proposed EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA achieved better values than the EC without OLIHFA-BA, the EC with conventional BIRCH, and the EC with conventional MBP. This demonstrates the effect of the BIRCH enhancements and the hybrid optimization.

E. Analysis on DICE and Jaccard Scores
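For reference, the two overlap scores analysed in this subsection can be computed from a predicted and a ground-truth binary mask as sketched below. The function name and the convention of returning 1.0 for two empty masks are assumptions made for the example.

```python
import numpy as np

def dice_jaccard(pred, truth):
    """Dice and Jaccard overlap between two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    union = total - inter
    dice = 2.0 * inter / total if total else 1.0   # both masks empty: perfect match
    jaccard = inter / union if union else 1.0
    return float(dice), float(jaccard)

dice, jaccard = dice_jaccard([1, 1, 0, 0], [1, 0, 0, 0])
```

Both scores lie between 0 and 1, and higher values indicate closer agreement between the segmentation and the reference mask.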
Table IV provides an analysis of the Dice and Jaccard scores. The Dice and Jaccard scores for the proposed BIRCH are greater than those for FCM, K-means, and conventional BIRCH.
We have developed a novel classification technique for rice leaf diseases in which the image is pre-processed using MF. The image is then segmented using BIRCH with enhancements. LBP, GLCM, colour, shape, and enhanced MBP-based features are extracted from the segmented images. The data is then classified using three classifiers: Bi-GRU, CNN, and DMN. The outputs of the Bi-GRU, CNN, and DMN classifiers are averaged to determine the final result. In addition, the weights of the Bi-GRU, CNN, and DMN schemes are determined optimally using the OLIHFA-BA method. The proposed scheme achieved its highest accuracy (0.96) at the 50th LP, whereas the other schemes achieved lower values. This improvement in classifier analysis and optimization analysis employing EC (Bi-GRU, CNN, and DMN) + OLIHFA-BA is mostly attributable to the enhanced features, segmentation, and classifiers. In future work, more disease types should be analysed.
ACKNOWLEDGMENT
I would like to express my deep gratitude to Professor Dr. K Kiran Kumar, my research supervisor, for his patient guidance, enthusiastic encouragement, and useful critiques of this research work. I would also like to thank the technicians at the department's laboratory for their assistance in providing me with the resources I needed to run the application.