Design of an Intelligent Hydroponics System to Identify Macronutrient Deficiencies in Chili

Nutrient content is important for plants, and a lack of macronutrients causes plant damage. Several macronutrient deficiencies exhibit similar visual characteristics that are difficult for ordinary farmers to distinguish. The combination of computer vision and the Internet of Things (IoT) has become a nondestructive method for nutrient monitoring and control, including in hydroponic systems. Computer vision processes plant image data based on specific characteristics. However, the analysis of a single characteristic cannot represent plant health. In addition, knowing the percentage of macronutrient deficiency is needed to support precision agriculture systems. Therefore, we propose a Multi-Layer Perceptron (MLP) architecture that performs two tasks, namely identification and estimation. The optimal architecture will also be sought based on a combination of three features: texture, color, and leaf shape. Based on analysis and design, the proposed model has high potential for identifying and estimating macronutrient deficiency simultaneously and can be applied to support precision agriculture in Indonesia.
Keywords: Multi-layer perceptron; Internet of Things; feature combination; leaf image; nutrient deficiency


I. INTRODUCTION
The chili plant is a horticultural crop of high economic value in Indonesia. However, its production level is lower than its consumption level, with inflation of 0.20% to 0.55% in 2019 [1]. One cause of low production is limited land due to the conversion of farming areas into settlements. Another is crop failure due to erratic weather changes, pest attacks, and plant diseases [2]. A system that can be applied on limited land in uncertain weather is hydroponics [3].
Hydroponics is a farming system that emphasizes the fulfillment of plant nutrients [4]. A plant needs macronutrients and micronutrients to grow and develop [5]. Macronutrients include N, P, K, Ca, Mg, and S (> 1000 mg/kg dry matter), and micronutrients include Fe, Mn, Zn, Cu, Cl, B, and Mo (< 100 mg/kg dry matter) [6]. Inappropriate nutrient content causes macronutrient deficiency in plants. Nutrient deficiency is most easily observed in the leaves [7], [8]. The symptoms in leaves include marginal, interveinal, and uniform chlorosis, distorted edges, reduced leaf size, necrosis, and others [9]. However, identifying macronutrient deficiency is difficult for ordinary farmers because several nutrients show similar characteristics [10].
Technology in agriculture is an important part of supporting the application of Industry 4.0 in Indonesia [11]. One of its applications is a monitoring and control system in intelligent hydroponics using the Internet of Things (IoT) [12]. One thing that needs to be monitored is the plant's health condition. Computer vision is one of the technologies that can help farmers check the condition of plants [13], [14]. Combining IoT and computer vision produces an automatic hydroponic system that can determine the condition of the plant and the appropriate remedy.
Computer vision technology utilizes image data for analysis. Several image color models have been used; one of them is the RGB model, which works like the human eye in being sensitive to the red, green, and blue light bands [15], [16]. Identification of macronutrient deficiencies using RGB leaf images has been carried out. Coffee, tomato, chili, cucumber, and other plants have been analyzed using texture and color features with K-Nearest Neighbors (KNN), Artificial Neural Network (ANN), Naïve Bayes [17], Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN) with various architectures [18], [19]. MLP gives promising results. However, most of these studies only determined the type of nutrient deficiency [20]. To support a precision agriculture system, the percentage of deficiency is important so that nutrient solutions are supplied in accordance with plant needs [21], [22].
Estimation of the percentage of macronutrient deficiency using plant RGB images has been carried out [16], [23]. Color characteristics have been used, for example in wheat, with Multivariate Linear Regression, Genetic Algorithm, Back Propagation-ANN, and KNN [23], [24]. Texture features have also been used with the Support Vector Machine (SVM) algorithm [25]. Several studies have also utilized deep learning, such as a Recurrent Convolution Neural Network (RCNN) for chili plants [1]. Most of them use only one feature for estimation, so the resulting model is not robust for all types of macronutrients [26]. This is because each macronutrient deficiency shows different visual characteristics.
The proposed study uses image processing to identify and estimate the percentage of macronutrient deficiencies in a hydroponic environment. Determining the lack of macronutrients involves several phases: image acquisition, preprocessing, segmentation, feature extraction, identification, and estimation. The smart hydroponic system is assisted by IoT so that image data acquisition is carried out automatically. The image data is crop data captured in real conditions, so preprocessing such as histogram equalization is required. Then, the chili plant objects are separated from a complex background using a segmentation technique. One of the challenges at the segmentation stage is separating overlapping leaf images. A combination of three features, namely color, texture, and leaf shape, is proposed to support precision agriculture. Features must be selected carefully to obtain accurate models. The main contribution of this paper is identification and estimation in a multi-layer perceptron architecture based on combined features. The study aims to identify and estimate five types of plant conditions: healthy, potassium deficiency, calcium deficiency, magnesium deficiency, and sulfur deficiency.

II. RELATED WORK
The study of macronutrient identification and estimation has moved from destructive to non-destructive methods [5], [16]. The destructive method relies on laboratory testing, which is costly and time-consuming [27]. Moreover, monotonous and long-duration work increases the risk of human error [28]. Therefore, non-destructive methods are needed.
The Internet of Things is one of the non-destructive methods that have been used [29]. IoT can be used to monitor and control physical phenomena in agricultural environments, such as temperature, light intensity, pH, and others [30], [31], [33]. Beyond observation, IoT also helps to determine the health condition of plants based on image data [38]. However, the visual condition of the plant is not considered, even though it is important. Table I shows the visual characteristics of several macronutrient deficiencies. A sensor that can capture the visual condition of plants is the camera, which is commonly used for monitoring the agricultural environment [39].
The camera is useful for capturing plant images in a hydroponic environment [34]. Plants that have been observed include tomato, chili, tobacco, spinach, okra, and others [28], [35], [36]. Digital image processing is used to analyze the captured images to obtain plant conditions. The plant conditions to be analyzed are the type of macronutrient deficiency and its percentage. The stages are preprocessing and augmentation, segmentation, feature extraction, and the identification and estimation tasks [22], [37].
Preprocessing is used to improve image quality [48]. Preprocessing types include enhancement, resizing, histogram equalization, and others [7]. Histogram equalization has been used to cope with different exposures in captured images [39], [40]. Data augmentation, in turn, is a process for data enrichment; examples include rotation and blurring [41]. The studies in [42], [43] show that augmenting images so that each class has the same amount of data increases the accuracy of the model.
Segmentation is the process of separating objects from the background, even a complex background [9], [16]. Several methods have been applied for segmentation, including K-Means Clustering, Genetic Algorithm with DSELM, Otsu, green masking, Fuzzy C-Means (FCM), and thresholding [21], [22], [44]. However, thresholding and green masking cannot handle overlapping leaves. FCM is a clustering method that assigns each data point a degree of membership in every cluster. In [45], FCM is shown to handle overlapping objects.
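To make the FCM idea concrete, the following is a minimal NumPy sketch of the standard fuzzy C-means iteration. It is an illustration only, not the segmentation code of any cited study: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the fixed iteration count, and the toy one-dimensional data are all assumptions made for this example.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Cluster rows of `data` (n_samples, n_features) with fuzzy C-means.

    Returns (centers, memberships); memberships[i, k] is the degree in [0, 1]
    to which sample i belongs to cluster k, and each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Start from random memberships, normalised so every row sums to 1.
    u = rng.random((data.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Cluster centres are membership-weighted means of the samples.
        centers = (um.T @ data) / um.sum(axis=0)[:, None]
        # Distance of every sample to every centre (epsilon avoids division by zero).
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Standard membership update: u_ik is proportional to dist_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

# Toy example (hypothetical data): two well-separated groups of pixel intensities.
pixels = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
centers, memberships = fuzzy_c_means(pixels, n_clusters=2)
labels = memberships.argmax(axis=1)  # hard label = cluster with highest membership
```

Because every pixel keeps a membership value for each cluster rather than a single hard label, pixels in ambiguous regions can be inspected or thresholded separately, which is one plausible reason soft clustering is preferred when leaves overlap.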
Feature extraction can be done using several methods, depending on the type of information. Statistical features in the RGB, HSV, and YUV color models represent the color information of objects [24], [32], while methods such as GLCM and Sobel can represent texture and shape information of leaf objects [17]. Feature combination is important because each deficiency shows different visual features [9]. Some nutrient deficiencies show color characteristics, and some do not. Therefore, a combination of three features, namely color, shape, and leaf texture, is needed. A combination of GLCM, hue, and color histogram has been used to analyze maize plants [46], and a combination of RGB and Sobel edge features improves accuracy [21], [35].
Several methods have been used to identify and estimate macronutrient deficiencies in plants, from rule-based methods to deep learning with different models [47], [48]. Rule-based, histogram, and RGB-based methods were used to identify leaf deficiency [48], [49]. However, those methods cannot handle data with high variance. Supervised learning methods have also been used, such as MLP, ANN, KNN, SVM, and others [24], [50], [51]. MLP shows promising results for data captured under different lighting conditions [52], which suggests that MLP can be used to process hydroponic data in a natural environment. Each method is robust only for certain macronutrients, as shown in [26] using derivative function methods: the first derivative gives more accurate results for predicting N, P, and S; Mg and K use a logarithmic transformation, while Ir uses smoothed reflectance; Br and Mn use the first derivative, while Zn uses the second derivative. In addition, CNNs with various architectures have been used. However, a CNN is a black-box method [2], [8], [53], so it can be challenging to analyze the effect of feature combinations on the resulting model. Table II shows a summary of some of the studies mentioned. Based on the literature reviewed above, it can be concluded that IoT and computer vision in smart hydroponic farming can help farmers find out crop conditions quickly. In addition, the method for analyzing image data is essential; MLP, for example, can process image data with different lighting. In [14], MLP achieved the best result, with an accuracy of 88.33%, compared with logistic regression and SVM.
139 | Page www.ijacsa.thesai.org
Several features can be used for analysis, such as histogram values, HSV statistical values, GLCM texture, and others. However, the studies above show that the types of features extracted from image data need to be combined to improve model quality so that the model is robust across various types of macronutrients.

III. PROPOSED SYSTEM
This section discusses intelligent hydroponics system workflow and Computer Vision in Hydroponics systems. The discussion of the two parts of this model answers the research question.

A. Smart Hydroponics System Workflow
Overall, the hydroponics system consists of three parts: data acquisition, data analysis, and the user area. Fig. 1 presents a schematic of the proposed model. In Fig. 1, Part A is the data acquisition stage. Part B contains the process to identify and estimate macronutrient deficiency in the plant. Part C is an interface that connects farmers with the hydroponic system so that plants and the agricultural environment can be observed and controlled well. The hydroponic system runs automatically. The details of each part are discussed below:
• Data Acquisition: There are two types of data in the hydroponic system: agricultural environment data and plant condition data. The environment data includes light intensity, humidity, temperature, and water level, while the plant condition data consists of plant images. Each sensor acquires different data. The data is then sent via a Raspberry Pi to the server using Wi-Fi and other protocols. Plant images are taken in the morning and afternoon to avoid disturbance from lighting conditions, using a camera sensor at a set distance.
• Data Analysis: Data is grouped into a database according to its use. Plant images are used to test the identification and estimation model that has been built using digital image processing and machine learning. The result is the plant condition and its percentage. These results are used as input to the Nutrient Control to supply nutrient solutions according to plant needs.
• Web Development: The hydroponic system runs automatically, but each process can be observed on a web interface that can be accessed through computers or smartphones, so users can observe the ongoing process and control the agricultural environment.

B. Identification and Estimation Process
Image data are analyzed using digital image processing. Fig. 2 shows a flowchart of the identification and estimation of macronutrient deficiency. The process is divided into five stages. The first stage is preprocessing using histogram equalization and extraction of three types of features. Features are selected and combined carefully so that they represent the condition of the plant. They are then used in a training step that determines the percentage of visual characteristics matched. This matching percentage becomes the input for the MLP training stage, whose output is the identification and estimation model. The proposed method therefore has two training steps: the first determines the matching percentage based on visual characteristics, and the second uses the matching percentages obtained from the first for the identification and estimation tasks. Our model identifies macronutrient deficiencies and estimates the percentage of deficiency at the same time. In addition, we look for a suitable MLP architecture based on the combination of the three features used, namely color, shape, and leaf texture. The model is evaluated using two different types of evaluation metrics, one per task. Finally, the resulting model will be tested in the natural hydroponic system environment that has been built. Each stage is described in detail in the next section.

IV. EXPERIMENTAL RESULT AND DISCUSSION
This section discusses the image processing steps in the hydroponic system in detail. The camera acquires the image data in a natural environment. Fig. 3 shows some results of image acquisition of chili plants. Healthy plants are given different treatments by reducing certain macronutrients. After one week, the plants show visual characteristics. The planned amount of data is 500 images for each class, including data labeled with the percentage of deficiency.
1) Preprocessing and feature extraction: Training and testing data are treated the same at this stage. The result of the acquisition is an RGB image with varying lighting. Histogram equalization is applied to equalize the different exposures in the image. The RGB image is converted to HSV and transformed into a histogram, which is normalized using the Cumulative Distribution Function (CDF) as shown in (1), where $h(j)$ is the histogram value and $n$ is the total number of pixels [51]. The intensity transformation from an input level $k$ to an output level $s_k$ is then carried out as shown in (2) for an image with $L$ levels.
$$\mathrm{cdf}(k) = \sum_{j=0}^{k} \frac{h(j)}{n} \tag{1}$$
$$s_k = \operatorname{round}\big((L-1)\,\mathrm{cdf}(k)\big) \tag{2}$$
The result of histogram equalization is shown in Fig. 4. The image is then segmented to separate objects from complex backgrounds using FCM. FCM is a clustering method that can be used for abstract data. In this case, FCM is used to handle overlapping leaves. The basic idea of FCM is to divide n data points into non-exclusive sets and to cluster the data based on membership degrees, which are real numbers in the range [0,1] [52]. A region of interest (ROI) is then applied to select the leaves that do not overlap. The result of segmentation and ROI selection is a set of single-leaf images whose leaves do not overlap, as shown in Fig. 5.
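The histogram equalization step can be sketched in a few lines of NumPy. This is a minimal illustration of the general CDF-based technique, not the authors' implementation: the function name `equalize_channel` is hypothetical, and applying it to an 8-bit V channel of the HSV image is an assumption, since the text does not state which channel is equalized.

```python
import numpy as np

def equalize_channel(channel, levels=256):
    """Histogram-equalise one 8-bit channel (assumed here: the V channel of HSV)."""
    # Histogram: number of pixels at each intensity level.
    hist = np.bincount(channel.ravel(), minlength=levels)
    # Normalised cumulative distribution function, with values in [0, 1].
    cdf = np.cumsum(hist) / channel.size
    # Look-up table mapping each input level to its equalised output level.
    lut = np.round((levels - 1) * cdf).astype(np.uint8)
    return lut[channel]

# Toy example (hypothetical values): an under-exposed 2x2 patch.
patch = np.array([[50, 50], [100, 150]], dtype=np.uint8)
equalized = equalize_channel(patch)
```

The brightest input level is always mapped to the top of the output range, so images captured under weak and strong lighting end up with comparable intensity spreads before segmentation.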
Feature extraction is performed for each single-leaf image. In this study, three features are used: shape, texture, and leaf color. These features were chosen based on the visual characteristics of the leaves. Each feature is processed using a different method: color features are extracted from HSV values, texture features using the Gray Level Co-occurrence Matrix (GLCM), and shape features using Canny edge detection with the Freeman chain code. The output of color feature extraction is the mean in (3), standard deviation in (4), and skewness in (5) for each of H, S, and V [48], where $\mu$ is the mean, $\sigma$ is the standard deviation, $M$ is the image dimension along index $i$, $N$ is the total number of elements indexed by $j$, and $I_{ij}$ is the value of the j-th pixel of the image at the i-th color channel.
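As an illustration of the color features, the sketch below computes the three color moments per channel. It assumes the commonly used color-moment definitions (population standard deviation, and skewness as the cube root of the third central moment); the paper's exact formulas may differ, and the function name `color_moments` is hypothetical.

```python
import numpy as np

def color_moments(hsv_image):
    """Return 9 values: mean, std and skewness for each of the H, S, V channels.

    `hsv_image` has shape (height, width, 3). Skewness follows the colour-moment
    convention (cube root of the third central moment), which is an assumption.
    """
    pixels = hsv_image.reshape(-1, 3).astype(np.float64)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    std = np.sqrt((centered ** 2).mean(axis=0))
    skew = np.cbrt((centered ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])

# Toy example (hypothetical data): a uniform patch has zero spread and zero skew.
flat_patch = np.full((4, 4, 3), 0.5)
features = color_moments(flat_patch)
```

For a single leaf this yields a fixed-length vector regardless of image size, which makes it straightforward to concatenate with texture and shape descriptors.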
The GLCM output consists of two parameters, the Angular Second Moment (ASM) in (6) and the Inverse Difference Moment (IDM) in (7) [17]. The output of shape feature extraction is initially a chain code, which can be converted into 15 normalized Elliptic Fourier Descriptors.
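The two texture parameters can be illustrated with a small NumPy sketch. It uses the textbook definitions of ASM (sum of squared GLCM entries) and IDM (entries weighted by 1 / (1 + (i - j)^2)); the helper names `glcm` and `asm_and_idm`, the single horizontal offset, and the number of gray levels are assumptions for this example.

```python
import numpy as np

def glcm(gray, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs offset by (dy, dx).

    `gray` is a 2-D integer array with values in [0, levels); dx, dy >= 0.
    """
    h, w = gray.shape
    ref = gray[0:h - dy, 0:w - dx]   # reference pixels
    nbr = gray[dy:h, dx:w]           # neighbours at the given offset
    counts = np.zeros((levels, levels), dtype=np.float64)
    # Count how often grey level a is followed by grey level b.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    return counts / counts.sum()

def asm_and_idm(p):
    """Angular Second Moment and Inverse Difference Moment of a normalised GLCM."""
    i, j = np.indices(p.shape)
    asm = float(np.sum(p ** 2))
    idm = float(np.sum(p / (1.0 + (i - j) ** 2)))
    return asm, idm

# Toy example (hypothetical data): vertical stripes, so every pair is (0, 1).
stripes = np.array([[0, 1], [0, 1]])
asm, idm = asm_and_idm(glcm(stripes, levels=2))
```

ASM is high for uniform or highly regular textures, while IDM is high when neighbouring pixels have similar gray levels, so the pair gives a compact view of leaf surface homogeneity.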
2) Training to get matching percentage of visual characteristics: The results of each feature extraction are trained using NN-Backpropagation to obtain the matching percentage of characteristics. The output of NN-Backprop is the percentage of matching characteristics, grouped into specific types as shown in Table III, which gives, for example, the possible matching percentages for nitrogen. The output of NN-Backprop is then fed into the MLP architecture so that the type of nutrient deficiency and its percentage are determined.
3) Training to get identification and estimation model using MLP: The model proposed in this study is an MLP architecture that can perform estimation and identification simultaneously. The model has two output types: numerical predictions and class predictions. Fig. 6 shows the proposed MLP architecture.
The type of MLP developed is fully connected. The output of each neuron is computed as $y = f\left(\sum_i W_i X_i + b\right)$, where $f$ is the activation function, $W_i$ is the weight of the i-th data, $X_i$ is the i-th data, $b$ is the bias, and $y$ is the output. The input layer consists of nine values of chili plant matching percentage, normalized to the range [0,1]. After normalization, initial weights and biases are assigned for each connection from the input neurons to the hidden layer, with values $w_{11}, w_{12}, \ldots$ and $b_{11}, b_{12}, \ldots$ [19]. The decision layer has three neurons: two for identification and one for estimation. The MLP output representation uses a binary output, so the number of classes is $2^n$, where $n$ is the number of neurons.
The activation function in the hidden layer differs from that in the output layer. Each neuron in the hidden layer uses the Rectified Linear Unit (ReLU). ReLU clips negative values at zero: if $x \le 0$ the output is 0, and if $x > 0$ the output is $x$, as shown in (8) [14]. In the decision layer, two different activation functions are used. The class prediction uses the SoftMax activation function [5]. SoftMax not only maps the outputs to the range [0,1] but also makes the outputs sum to 1. Therefore, SoftMax's output is suitable for the binary classification problem, as shown in (9). The numerical prediction has a single node and uses the Sigmoid activation function, shown in (10), for the estimation [58]. The sigmoid output is scaled to obtain the percentage of macronutrient deficiency in the range 0%-100%.
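The forward pass of such a two-output network can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the proposed architecture itself: a single hidden layer of 16 nodes, small random initial weights, and the names `init_params` and `forward` are all placeholders, since the number of hidden layers and nodes is still to be determined.

```python
import numpy as np

def relu(x):
    # Negative inputs become 0; positive inputs pass through unchanged.
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row maximum for numerical stability; outputs sum to 1.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in=9, n_hidden=16, n_class_out=2, seed=0):
    """Random initial weights; the layer sizes here are placeholder assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "Wc": rng.normal(0.0, 0.1, (n_hidden, n_class_out)), "bc": np.zeros(n_class_out),
        "We": rng.normal(0.0, 0.1, (n_hidden, 1)), "be": np.zeros(1),
    }

def forward(params, x):
    """Shared hidden layer followed by a SoftMax head and a Sigmoid head."""
    h = relu(x @ params["W1"] + params["b1"])
    class_probs = softmax(h @ params["Wc"] + params["bc"])
    # Single estimation node, rescaled from [0, 1] to a percentage.
    deficiency_pct = 100.0 * sigmoid(h @ params["We"] + params["be"])[..., 0]
    return class_probs, deficiency_pct

# Toy example (hypothetical data): four samples of nine matching percentages in [0, 1].
x = np.random.default_rng(1).random((4, 9))
class_probs, deficiency_pct = forward(init_params(), x)
```

Sharing the hidden layer between both heads is what makes the model multi-task: the classification and regression outputs are computed from the same learned representation of the nine input values.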
In machine learning, the loss function is used to optimize the model during training so that the error is minimized. For the identification task, the loss function is Categorical Cross-Entropy, shown in (11), where $L_{CE}$ is the categorical cross-entropy loss, $t_i$ is the target value, and $s_i$ is the SoftMax output. A good model has $L_{CE}$ close to 0. The estimation task uses Mean Square Error (MSE) [59].
$$L_{CE} = -\sum_{i} t_i \log(s_i) \tag{11}$$
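The two loss functions can be written compactly in NumPy, as sketched below. The text does not say how the identification and estimation losses are combined during training, so the weighted sum in `multitask_loss` and its `weight` parameter are assumptions introduced only for illustration.

```python
import numpy as np

def categorical_cross_entropy(targets, probs, eps=1e-12):
    """Mean over samples of -sum_i t_i * log(s_i); eps guards against log(0)."""
    targets = np.asarray(targets, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return float(-np.mean(np.sum(targets * np.log(probs + eps), axis=-1)))

def mean_squared_error(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def multitask_loss(targets, probs, y_true, y_pred, weight=1.0):
    """Hypothetical combination: classification loss plus weighted regression loss."""
    return categorical_cross_entropy(targets, probs) + weight * mean_squared_error(y_true, y_pred)

# Toy example (hypothetical data): one-hot targets and an uninformative prediction.
one_hot = [[1, 0], [0, 1]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
loss_uniform = categorical_cross_entropy(one_hot, uniform)   # equals log(2)
loss_perfect = multitask_loss(one_hot, one_hot, [0.2, 0.8], [0.2, 0.8])
```

If the estimation target is kept in the [0, 1] range rather than as a percentage, the two terms have comparable magnitudes, which makes the choice of `weight` less sensitive.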
The number of hidden layers affects the learning process: the more hidden layers are used, the deeper the features that are learned. In the proposed model, the number of hidden layers and nodes of the MLP is searched so that the optimal architecture for the three-feature combination is obtained. The effects of the number of epochs, the learning rate, and other hyperparameters are also considered.
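One simple way to organize such a search is to enumerate candidate architectures and train each in turn. The sketch below only generates the candidates; the function name `candidate_architectures` and the specific layer and node counts are hypothetical choices, and the search strategy actually used in the study is not specified.

```python
from itertools import product

def candidate_architectures(layer_counts=(1, 2, 3), node_counts=(8, 16, 32)):
    """Yield tuples of hidden-layer sizes, e.g. (16, 16) for two hidden layers."""
    for n_layers in layer_counts:
        # Every combination of node counts for this number of hidden layers.
        for sizes in product(node_counts, repeat=n_layers):
            yield sizes

# Hypothetical search space: 3 + 9 + 27 = 39 candidate architectures.
candidates = list(candidate_architectures())
```

Each candidate would then be trained with the same data split and compared on validation accuracy and estimation error, together with settings such as the learning rate and number of epochs.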

4) Find match percentage of testing data:
The preprocessed test data is passed through the NN-Backprop model from the previous stage to obtain the matching percentage of each visual characteristic.

5) Identification and estimation of macronutrient deficiency:
The matching percentages of the test data are fed to the MLP model to determine the plant condition and its percentage. An evaluation is carried out to measure the model's performance on the test data. The identification task is evaluated using the confusion matrix to obtain the accuracy and precision based on (12) and (13) [9], [14]. The structure of a confusion matrix is shown in Table IV. The estimation task is evaluated by calculating the error using Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE). Predictions are considered good if the MAPE value is less than 10% [58]. Meanwhile, for MSE, used with a gradient-based method, a lower value indicates a better prediction. The formulas for MAPE and MSE are given in (14) and (15), where $Y_t$ is the actual value of period t, $Y'_t$ is the forecast value of period t, and n is the number of periods.
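The evaluation metrics can be illustrated with the standard textbook definitions, sketched below in NumPy. The function names and the example counts are hypothetical; accuracy and precision are written for a single class in terms of true/false positives and negatives.

```python
import numpy as np

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of positive predictions that are correct."""
    return tp / (tp + fp)

def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent; undefined if any actual value is 0."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(100.0 * np.mean(np.abs((actual - forecast) / actual)))

def mse(actual, forecast):
    """Mean Square Error between actual and forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean((actual - forecast) ** 2))

# Toy example (hypothetical numbers): deficiency percentages and confusion counts.
acc = accuracy(tp=40, tn=45, fp=5, fn=10)
prec = precision(tp=40, fp=5)
err_pct = mape([50.0, 80.0], [45.0, 88.0])
err_sq = mse([50.0, 80.0], [45.0, 88.0])
```

Note that MAPE divides by the actual value, so samples whose true deficiency is 0% (for example healthy plants) would need to be excluded or handled separately when this metric is computed.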
The intelligent hydroponic system can identify nutrient deficiencies and provide solutions automatically. Based on the analysis and literature study that has been done, the proposed model has the potential to be applied. The MLP architecture performs two different tasks, namely classification and regression. In addition to developing a multi-task architecture, the number of hidden layers and nodes will be determined based on the combination of the three features used.

V. CONCLUSION
This research is at the stage of data collection and software development. The intelligent hydroponics hardware has been built and is still being evaluated against the system requirements. The proposed model answers research questions about identifying and estimating macronutrient deficiency in intelligent hydroponics systems. First, identification and estimation are faster and more precise, so that nutrient provision takes the needs of the plants into account. Second, a combination of three features, namely color, shape, and leaf texture, is applied to increase the available information so that the system can identify the right type of nutrient deficiency. Third, a model is built that can perform both identification and estimation tasks. Fourth, the model analyzes more than one type of nutrient so that it can be applied in a real environment. In the future, this model can be implemented in the natural environment of intelligent hydroponics systems in Indonesia. Limitations remain because the model is only implemented for hydroponic chili plants.
ACKNOWLEDGMENT
This work is funded by the Ministry of Research and Technology of the Republic of Indonesia in the PMDSU program. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.