Effect of Feature Engineering Technique for Determining Vegetation Density

Abstract: Vegetation density is one type of information derived from vegetation cover. It influences evapotranspiration, which is essential in assessing how vulnerable peatlands are to fire. The Keetch and Byram Drought Index (KBDI) model, which evaluates peatland fire vulnerability, divides vegetation density into heavily grazed, lightly grazed, and ungrazed. Manual approaches for analysing vegetation density in the field, however, require significant resources. Computer vision addresses this problem through image data acquisition, pre-processing, feature extraction, feature selection, classification, and validation. Artificial intelligence and machine learning approaches promise high accuracy in modern computer vision research, but the classification result depends critically on feature extraction. Pattern recognition with a Back Propagation Neural Network (BPNN) is difficult when the extracted feature set has too many dimensions. Feature engineering addresses this problem by selecting the most useful features. This research explores how feature engineering influences classification accuracy. The results show that the proposed strategy increases accuracy by 1% and kappa by 1.5%. This increase in vegetation density classification accuracy might help detect peatland vulnerability sooner. The novel aspect of this paper is that, after feature extraction, a feature engineering strategy is applied in the machine learning classification stage to reduce the number of feature dimensions.


I. INTRODUCTION
The Keetch and Byram Drought Index (KBDI) model uses vegetation cover conditions as one of the parameters for generating the land drought index [1]-[5]. In the peat version of KBDI, the density of vegetation cover is classified into three groups: heavily grazed, lightly grazed, and ungrazed. The KBDI peat drought index is used to quantify peatland fire vulnerability [3]. Peatland fires have occurred in South Kalimantan in recent years and have become more common. Whether intended or not, human activities are responsible for 99.9% of land fires; such human intervention is referred to as anthropogenic [6]. Automation can help predict land fire susceptibility under anthropogenic influences. Land cover analysis can be automated using machine learning combined with computer vision techniques. Previous research indicated that automated land cover categorisation using ordinary cameras could not discern between trees, weeds, and grass, and this difficulty has not yet been resolved [7]-[9]. Machine learning classification is therefore necessary to distinguish between trees, weeds, and grasses. Classification methods include the support vector machine (SVM), naive Bayes, and the artificial neural network (ANN) [10]-[13]. According to some of that research, image classification requires feature extraction, that is, the conversion of an image into numerical values that can be classified using artificial intelligence methods. Features are crucial in image classification and recognition.
Previous researchers have also converted natural colour information (RGB) to a better-suited colour space and used it to discern between vegetation and non-vegetation. Philipp and Rath [14] distinguished between vegetation and non-vegetation using the Lab, Luv, and HSV colour spaces. However, when the extracted feature set has too many dimensions, machine learning techniques perform poorly. Feature engineering can reduce the number of input features by lowering the dimensionality of the extracted feature set [15], [16].
To deal with the challenge of distinguishing between heavily grazed, lightly grazed, and ungrazed vegetation, a novel technique was developed that combines pre-processing, feature extraction and selection, and classification. Segmentation based on a distance threshold is employed as pre-processing. We employed the grey-level co-occurrence matrix (GLCM) to extract features and Backward Elimination to select them. A backpropagation neural network is used as the classification method to distinguish between heavily grazed, lightly grazed, and ungrazed. Based on this approach, a technique for acquiring wetland image data will be developed and tested on pure vegetation stands with a height > 20 m above the canopy [12].
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 7, 2022, www.ijacsa.thesai.org
The contribution of this study was validated by comparing the results of feature selection based on the classification method in order to (1) show the effect of feature selection on classification performance; (2) analyse the relationship between feature selection and image classification performance; (3) determine the parameters needed to achieve the best performance of the classification model; and (4) present the results of the best classification model for determining vegetation density. The rest of this paper is organised as follows. Section II highlights previous work. Section III explains the materials and methods, including the proposed framework and research design. Section IV describes the experimental design. Section V presents the results and discussion. Finally, Section VI concludes.

II. RELATED WORK
Previous land categorisation studies have covered general land regions [17]-[19], wetlands [20]-[24], and peatlands [25], [26]. Herdiyeni et al. [8] identified plant health using features based on the local binary pattern variant (LBPV) approach, morphological features, and colour features, and employed a probabilistic neural network (PNN) with a performance of 72.15%.
The bulk of this research employed artificial intelligence to evaluate land cover using remote sensing data, while field validation was rarely performed. For land cover analysis, the most often used supervised machine learning approaches include the back-propagation neural network (BPNN) with an accuracy of 97.65%, the support vector machine (SVM) with an accuracy of 97.45%, and the artificial neural network (ANN) with an accuracy of 96.95% [17]. Tan et al. [18] compared the performance of random forest (RF), decision tree (DT), SVM, and ANN techniques for mapping three typical landscapes. Their results show that ANN performs relatively poorly compared to the other methods, so further work is needed to improve the ANN method, which is the aim of this research. In addition, Zaldo-Aubanell et al. [19] stated that this kind of research is important and highly preferred, especially when applied to national data, and that many researchers worldwide have carried it out to solve domestic problems. This is why it is important to use data that reflects the problems faced nationally.

A. Tools and Location Research
The study took place in Block I of the Liang Anggang protected forest in Banjarbaru, South Kalimantan. The Kayu Tangi Production Forest Management Unit [27] is in charge of this area. Fig. 2 depicts the research location: Fig. 2(a) shows the data collection site and Fig. 2(b) shows the distance between the site and the Syamsuddin Noor airstrip. The nearby population has primarily used the existing peatland for agriculture and planting. Peatlands ranging from shallow (100-200 cm) to extremely deep (> 300 cm) cover 749.87 hectares (78.43%). According to Zaldo-Aubanell et al. [19], data at a national scale is important and widely used because it adapts better to regional and national uses.

B. Proposed Framework
Land cover images from the research location were used to create the dataset for this investigation. The data comprised 450 images taken by drone from 20 metres above the object. The proposed technique then processes the variables obtained from feature extraction using the Backward Elimination algorithm. Feature selection picks the features that influence the accuracy of the BPNN classification process. The BPNN parameters are the hidden layer, neuron size, momentum, learning rate, and training cycle. Validation measures are used to determine whether the classification results are accurate and optimal (Fig. 1).

C. Research Design
This research used BPNN-based artificial intelligence as a computer vision methodology to categorise vegetation density. The classification design consists of data acquisition, pre-processing, segmentation, feature extraction, feature selection, classification, and validation. Fig. 3 illustrates the steps.

1) Data acquisition:
Data was collected using a Mavic Pro drone with a 90-degree gimbal angle. The drone was positioned 20 metres from the object. The data is separated into three categories: heavily grazed, lightly grazed, and ungrazed. The three groups are separated based on the plant growth at the study location. The heavily grazed class consists of tree vegetation, and the lightly grazed class consists of herbs and shrubs. The ungrazed class, on the other hand, consists of dry (dead) vegetation, rivers (water), land, and towns. Up to 300 images were collected for use as training data.
2) Pre-processing image: This research focuses on vegetation density, so initial image processing is done using a static threshold value. Segmentation is then performed using the green colour of the trees as the threshold. This study used the Euclidean distance (equation 1) for colour-distance-based segmentation.
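The colour-distance segmentation described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the reference green value, the threshold, and the function names are assumptions chosen for the example, and the image is represented as a nested list of RGB tuples.

```python
import math


def colour_distance(pixel, reference):
    """Euclidean distance between two RGB triples."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pixel, reference)))


def segment_by_colour(image, reference=(60, 140, 60), threshold=80.0):
    """Keep pixels close to the reference colour; set the rest to black.

    `reference` and `threshold` are hypothetical values, not taken from the paper.
    """
    background = (0, 0, 0)
    return [
        [px if colour_distance(px, reference) <= threshold else background
         for px in row]
        for row in image
    ]


# Tiny 2x2 example image: two greenish pixels, one blue, one grey.
image = [[(50, 150, 55), (20, 30, 200)],
         [(70, 130, 65), (128, 128, 128)]]
segmented = segment_by_colour(image)
```

In this toy example only the two greenish pixels survive; in practice the reference colour and threshold would need to be tuned on the drone imagery.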

3) Feature extraction:
The chosen image is converted into a set of numerical parameters, which are critical for differentiating objects. The Grey-Level Co-occurrence Matrix (GLCM) is used as the feature extraction approach to obtain these quantitative values. The GLCM stores the likelihood of two grey levels co-occurring [28]. The second-order moment or energy (ene), entropy (ent), contrast (con), homogeneity (hom), and correlation (cor), given in equations (2) to (6), are extracted from the vegetation image to summarise the co-occurrence matrices. The distribution of co-occurrence values at the given offset (1,1) is denoted by p(k,l), where k and l are grey levels, and is computed at four angles: 0°, 45°, 90°, and 135°. Rotation-invariant features, namely the mean and variance of the orientation-dependent features, are computed from the values obtained at the different angles.
Equation (7) is used to compute the correlation.
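The five GLCM features can be sketched in a few lines of Python. The equations themselves are not reproduced in this text, so the sketch below uses common textbook definitions as an assumption (in particular, homogeneity is taken as p(k,l) / (1 + (k - l)^2)); the paper's exact formulas may differ. The function names and the toy grey-level image are hypothetical.

```python
import math

# Pixel offsets (row, column) for the four GLCM directions at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Normalised co-occurrence matrix p(k, l) for one direction."""
    dr, dc = OFFSETS[angle]
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Energy, entropy, contrast, homogeneity and correlation of p(k, l)."""
    cells = [(k, l, v) for k, row in enumerate(p) for l, v in enumerate(row)]
    ene = sum(v * v for _, _, v in cells)
    ent = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    con = sum((k - l) ** 2 * v for k, l, v in cells)
    hom = sum(v / (1 + (k - l) ** 2) for k, l, v in cells)
    mu_k = sum(k * v for k, _, v in cells)
    mu_l = sum(l * v for _, l, v in cells)
    var_k = sum((k - mu_k) ** 2 * v for k, _, v in cells)
    var_l = sum((l - mu_l) ** 2 * v for _, l, v in cells)
    denom = math.sqrt(var_k * var_l)
    cor = (sum((k - mu_k) * (l - mu_l) * v for k, l, v in cells) / denom
           if denom > 0 else 0.0)
    return {"ene": ene, "ent": ent, "con": con, "hom": hom, "cor": cor}


# One colour layer quantised to 4 grey levels (toy data).
layer = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = {angle: glcm_features(glcm(layer, 4, angle)) for angle in OFFSETS}
```

Each layer yields five features at each of the four angles, that is, 20 values per colour layer.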
4) Feature selection: Feature selection is a machine learning method in which a subset of the data features is used to train algorithms. According to Maimon [29], feature selection has become a hot topic in pattern recognition, statistics, and data mining.
Feature selection is one of the essential factors affecting classification accuracy: if the dataset contains many features, its dimensionality is high, which lowers classification accuracy. The main concern of feature selection is dimensionality reduction, since initially all features are assumed to be required to achieve maximum accuracy.
According to Maimon [29], there are four fundamental reasons for dimension reduction, including:
• Decreasing the learning cost.
• Increasing the learning performance.
Because not all features/attributes are relevant to the problem, the fundamental idea of feature selection is to choose a subset of the existing features without transforming them. Some features are even disruptive and reduce accuracy, so noisy or irrelevant features must be removed to increase accuracy. Having many features also slows down computation. Backward elimination starts with the full set of features and, in each round, tries removing each remaining feature from the given ExampleSet. For each removed feature, performance is estimated using the inner operator, such as cross-validation. Only the feature whose removal causes the smallest decrease in performance is finally eliminated. Then a new round begins with the modified selection. This method avoids any additional memory consumption beyond the memory used to hold the data and any memory needed to run the inner operator. The speculative rounds parameter defines how many further rounds are performed in a row after the stopping criterion is first fulfilled. If performance improves again during the speculative rounds, elimination continues; otherwise, all features removed during those rounds are restored, as if no speculative rounds had been performed. This process can help prevent the search from becoming stuck in a local optimum.
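The backward elimination loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: `score` is a hypothetical stand-in for the inner operator (for example, the cross-validated accuracy of a classifier), the toy scoring function is invented for the example, and speculative rounds are omitted.

```python
def backward_elimination(features, score, max_eliminations):
    """Greedy backward elimination.

    `score(subset)` is a stand-in for the inner operator (for example the
    cross-validated accuracy of a classifier trained on `subset`).
    Speculative rounds are omitted in this sketch.
    """
    selected = list(features)
    best = score(selected)
    for _ in range(max_eliminations):
        if len(selected) <= 1:
            break
        # Evaluate the removal of each remaining feature.
        trials = [(score([f for f in selected if f != cand]), cand)
                  for cand in selected]
        trial_score, least_useful = max(trials, key=lambda t: t[0])
        if trial_score < best:      # stop once every removal hurts performance
            break
        selected.remove(least_useful)
        best = trial_score
    return selected, best


# Toy scoring function (hypothetical): features "a" and "b" are useful,
# every other feature acts as noise and costs 0.01.
def toy_score(subset):
    useful = {"a": 0.5, "b": 0.3}
    return sum(useful.get(f, -0.01) for f in subset)


subset, value = backward_elimination(["a", "b", "c", "d"], toy_score, 10)
```

Here the two noisy features are removed first, and the loop stops when removing either remaining feature would lower the score.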
Forward selection differs in that it starts with an empty selection and, in each round, tries adding each unused feature from the given set. The inner operator, such as cross-validation, is used to estimate performance for each added feature. Only the feature giving the largest performance increase is added to the selection. Then a new round begins with the modified selection.

5) Classification: According to Witten [30], data mining is a series of processes for obtaining knowledge or patterns from data sets. Data mining solves problems by analysing the data already in the database. Classification is the process of finding a model or function that describes or distinguishes concepts or data classes, with the aim of estimating the class of an object whose label is unknown [31].
The classification stage uses the BPNN artificial intelligence method. An artificial neural network (ANN) is a learning algorithm implemented as a network of simple, interconnected units called neurons. An ANN works in a way similar to the human brain in recognising a specific pattern. An ANN can provide effective classification results even when the input data is noisy and incomplete [32], [33]. One frequently used type of ANN is backpropagation. The backpropagation architecture comprises three layers: the input layer, hidden layer, and output layer [32], [33]. Fig. 4 shows a simple visualisation of the backpropagation structure. The backpropagation output is formulated in equation 8.
Here, w is the weight vector, x is the input vector, b is the bias value, and f is the activation function. The network used one hidden layer (10 neurons), one output layer, a purely log-sigmoid (logsig) activation function, and 1000 epochs.
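A forward pass through such a network can be sketched as below, with each neuron computing f(w · x + b) using the log-sigmoid function. The layer sizes follow the text (60 input features, 10 hidden neurons); using three output neurons, one per vegetation class, and the random weight initialisation are assumptions made for the illustration, and no training is performed here.

```python
import math
import random


def logsig(z):
    """Log-sigmoid activation, f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Compute f(w . x + b) for every neuron in one layer."""
    return [logsig(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_features, n_hidden, n_classes = 60, 10, 3
w_hidden, b_hidden = init_layer(n_hidden, n_features, rng)
w_out, b_out = init_layer(n_classes, n_hidden, rng)

x = [rng.random() for _ in range(n_features)]   # one synthetic feature vector
hidden = layer_forward(w_hidden, b_hidden, x)
output = layer_forward(w_out, b_out, hidden)
predicted_class = output.index(max(output))
```

Backpropagation would then adjust the weights and biases to reduce the error between `output` and the target class.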

6) Validation:
Validation measures are used to assess the classification performance of the model. This study uses two validation measures: accuracy and kappa. Accuracy is the ratio of the number of cells correctly classified into their class to the total number of cells. The True Negative (TN), False Positive (FP), False Negative (FN), and True Positive (TP) values are calculated to obtain the accuracy, precision, and recall values. The accuracy value describes how accurately the system classifies the data; in other words, it compares the correctly classified data with the whole data. The precision value is the number of positive data items classified correctly divided by the total number of data items classified as positive.
Another indicator of accuracy is the kappa coefficient. Kappa measures how the classification result compares with the agreement expected by chance. It typically takes a value from 0 to 1; negative values are possible and indicate agreement worse than chance. If the kappa coefficient equals 0, the agreement between the classified image and the reference image is no better than chance. If the kappa coefficient equals 1, the classified image and the ground truth image are identical. Thus, the higher the kappa coefficient, the more accurate the classification.
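Both measures can be computed directly from a confusion matrix. The sketch below uses the standard definitions: accuracy is the share of correctly classified items, and Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The confusion matrix is synthetic and the function name is an assumption; neither comes from the paper.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    total = sum(map(sum, confusion))
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Synthetic 3-class confusion matrix (not data from the paper).
confusion = [[90, 5, 5],
             [10, 80, 10],
             [5, 10, 85]]
acc, kappa = accuracy_and_kappa(confusion)
```

Because kappa discounts the agreement expected by chance, it is lower than accuracy whenever the classifier is imperfect.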

IV. EXPERIMENTAL DESIGN
This study demonstrates how the data influences which features are best. The approach is tested on this dataset during the classification process to ensure that the classification results are accurate and based on the best features. The accuracy gained in this study is compared with that of other experiments to determine the optimum accuracy.
Features are extracted from the original image using the GLCM feature extraction method, eq. (2) to (6). For each image, a total of 60 features are created. These features are taken from each colour layer: red, green, and blue. According to equations (2) to (6), GLCM creates 20 texture-based features for each colour layer from four distinct angles, with each angle yielding five features (3 colour layers × 4 angles × 5 features = 60).
Consequently, the final data consists of 300 images by 60 features. The BPNN classification algorithm was used to analyse the extracted features. In this study, each parameter of the BPNN classification technique was evaluated, and a further assessment was based on the backward elimination feature selection approach. The backward elimination results are compared with the approach used in past investigations, specifically forward selection. Performance evaluation used sampling and validation with 10-fold cross-validation. The dataset is divided into ten parts of equal size for cross-validation. The experiment was repeated ten times, once per fold, and the training and testing classification performance was averaged. The classification model's performance is assessed from the resulting confusion matrix. The experiments in this investigation were carried out using MATLAB (www.mathworks.com).
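The ten-fold procedure can be sketched as follows. This is an illustrative outline, not the MATLAB workflow used in the study: `evaluate` is a hypothetical callback that would train the classifier on the training indices and return its score on the test indices, and the shuffling seed is arbitrary.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10):
    """Average `evaluate(train_idx, test_idx)` over the k folds.

    `evaluate` is a hypothetical callback that trains a model on the
    training indices and returns its score on the test indices.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k


folds = k_fold_indices(300, k=10)
# Dummy evaluator that only reports the test-fold share of the data.
mean_share = cross_validate(300, lambda tr, te: len(te) / (len(tr) + len(te)))
```

With 300 images, each fold holds 30 images for testing while the remaining 270 are used for training.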

V. RESULT AND DISCUSSION
The discussion is broken down into three stages: data preparation, feature extraction, and classification (Fig. 2). At each stage of the research, the results are revealed.

A. Segmentation
The segmentation step keeps only the green colour from the image. The segmented image is then used to obtain the image's distinguishing characteristics for the agreed classes. Table I shows the segmentation results for the three agreed labels.

B. Feature Extraction
The GLCM technique is used to turn the selected images into numerical features. Combining the extracted features yields a new matrix of 300 images by 60 features. In Table II, each row of the matrix represents one data item, while the columns hold the features of each item.
The columns comprise the variables F1, F2, F3, F4 to F120, as given in Table I. Each column holds the values of one GLCM texture feature. F1, F2, F3, and F4 are GLCM features, including the angular second moment, contrast, and correlation, at the 0-degree orientation on the red layer. The other features Fn cover the remaining orientations and colour layers. The F60 feature is the kurtosis value on the blue layer. Rows 1 to 300 of the matrix correspond to the data items in the dataset. All features were used in the classification process in this study, on the assumption that they all contribute to the best classification results.

C. Evaluation of BPNN Classification based on Parameters
The supporting parameters are used to evaluate the BPNN classification. The test results are shown in Fig. 5, utilising a training cycle value of 10 to 100 cycles.
As shown in Fig. 5, the optimal number of training cycles was 100, with an accuracy of 84.67% and a kappa of 77%. These findings suggest that increasing the number of training cycles increases accuracy and kappa. This result is consistent with numerous other studies of neural network training cycle settings that reported better performance with longer training.
The learning rate experiment on the BPNN technique covers values from 0.1 to 1, using the best-performing training cycle setting of 100 cycles (Fig. 6). The results reveal that a learning rate of 0.1 produces the best outcome compared to other learning rates. The accuracy pattern across learning rates shows that performance steadily falls as the learning rate increases. With a high learning rate, the solution is not optimal and performance is reduced [34].
The momentum parameter of the BPNN is evaluated in Fig. 7. Accuracy as a function of momentum follows the same pattern as the learning rate results, with ups and downs, and the best result is obtained with a momentum value of 0.1. These findings align with earlier research reporting that a small momentum value brings features closer together without boosting convergence [34].
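The roles of the learning rate and momentum can be illustrated with the standard momentum update rule for gradient descent. The paper does not state its exact update rule, so the rule, the function name, and the one-dimensional toy error surface below are all assumptions; the example only shows why an overly large learning rate can prevent convergence and does not reproduce the reported results.

```python
def train_with_momentum(gradient, w0, learning_rate, momentum, cycles):
    """Gradient descent with momentum on a single weight.

    Update rule: delta = -learning_rate * gradient(w) + momentum * previous_delta
    """
    w, delta = w0, 0.0
    for _ in range(cycles):
        delta = -learning_rate * gradient(w) + momentum * delta
        w += delta
    return w


def grad(w):
    """Gradient of the toy error surface E(w) = 2 * (w - 3)^2."""
    return 4.0 * (w - 3.0)


w_small = train_with_momentum(grad, 0.0, learning_rate=0.1, momentum=0.1, cycles=100)
w_large = train_with_momentum(grad, 0.0, learning_rate=1.0, momentum=0.1, cycles=100)
```

On this toy surface the small learning rate settles at the minimum w = 3, while the large one overshoots further on every cycle and moves away from it.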

D. BPNN Classification Testing based on Feature Selection
Backward elimination was used to select features in this experiment. The maximum number of eliminations parameter was varied between 10 and 60 (shown in Fig. 8). Each setting was applied with the BPNN classification algorithm, using the optimal training cycle, learning rate, and momentum from the previous experiment: 100, 0.1, and 0.1, respectively.
With a maximum number of eliminations of 10, backward elimination delivers the best result, an accuracy of 85.67%. This is a 1 percentage point improvement over the BPNN classification algorithm without feature selection (84.67%). Kappa follows the same performance pattern as accuracy, with an increase of 1.5%.

E. Evaluation Comparison between Feature Selection
We also used forward selection in order to compare the two feature selection approaches and identify the best performance (shown in Fig. 9). Forward selection starts with an empty selection of attributes, and each iteration tries adding each unused attribute. Only the attribute that increases performance the most is added to the selection, and a new round then begins with the modified selection. Backward elimination allows for the best results since it starts with the complete feature set and tries deleting each attribute in every round. Backward elimination, like forward selection, uses cross-validation to estimate performance, and it deletes the attribute whose removal causes the smallest drop in performance.
The findings demonstrate that the best performance obtained from both feature selection approaches is the same accuracy of 85.67%. Backward elimination achieved it with a maximum of ten eliminated features, while forward selection achieved it with 20 and 60 chosen features. Backward elimination works from the large number of features in the data, which makes the behaviour of the accuracy results as visible as possible. It is therefore conceivable that the features used affect the accuracy of the results. While forward selection is based on the calculation of the correlation matrix, taking the relationships between features that produce the highest correlation coefficient, backward elimination is based on the calculation of the correlation matrix, taking the relationships between features that produce the lowest correlation coefficient.
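For comparison, forward selection can be sketched as a greedy loop that grows the selection from an empty set. As before, this is an illustration under assumptions: `score` stands in for the cross-validated performance of the classifier, and the toy scoring function and feature names are invented for the example.

```python
def forward_selection(features, score, max_features):
    """Greedy forward selection.

    `score(subset)` stands in for the inner operator, for example the
    cross-validated accuracy of a classifier trained on `subset`.
    """
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining and len(selected) < max_features:
        # Evaluate the addition of each unused feature.
        trials = [(score(selected + [cand]), cand) for cand in remaining]
        trial_score, winner = max(trials, key=lambda t: t[0])
        if trial_score <= best:     # stop when no candidate improves the score
            break
        selected.append(winner)
        remaining.remove(winner)
        best = trial_score
    return selected, best


# Toy scoring function (hypothetical): features "a" and "b" are useful,
# every other feature acts as noise and costs 0.01.
def toy_score(subset):
    useful = {"a": 0.5, "b": 0.3}
    return sum(useful.get(f, -0.01) for f in subset)


subset, value = forward_selection(["a", "b", "c", "d"], toy_score, 4)
```

On this toy problem the loop adds the two useful features and stops, since adding any noisy feature would lower the score.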

VI. CONCLUSION
This study aims to determine the state of vegetation cover in order to quantify peatland fire vulnerability. As a result, this research may help fire-prone countries overcome their problems. This research provides an automated identification approach based on the original image and ambient conditions. The improved performance is due to the revised workflow for determining cover conditions. When the feature selection approach is added, the findings demonstrate an increase in performance: the proposed strategy results in a 1% improvement in accuracy and a 1.5% increase in kappa. The increase occurred with both forward selection and backward elimination. This indicates that many features contribute little, and that the feature selection approach can identify the original features that are suitable for determining vegetation cover conditions. Feature engineering can reduce the number of input dimensions in visual feature extraction while increasing the accuracy of vegetation density categorisation. As a result, it can improve machine learning. Using feature engineering in the vegetation density classification workflow will make the evaluation of peatland fire vulnerability with the Keetch and Byram Drought Index model more effective. The scope of this study is confined to the impact of feature engineering. Further work may improve accuracy at the classification step by comparing machine learning classification approaches.

ACKNOWLEDGMENT
This study was completed with partial funding from Riset Keilmuan with the Riset Mandiri Dosen scheme from LPDP through the Directorate of Resources, the Directorate General