IoT based Date Palm Water Management System Using Case-Based Reasoning and Linear Regression for Trend Analysis

Date palm trees (Phoenix dactylifera L.), known as Al Nakheel in Arabic, have cultural and economic importance to Gulf and Arabic-speaking countries. However, under the traditional method of cultivation, the improper use and depletion of water is perceived as the major challenge, as farmers use almost two and a half times the required amount without considering numerous factors. This paper attempts to develop an implementation model of a water management system for date palm trees using Case-Based Reasoning (CBR). The model involves an IoT-based module comprising a NodeMCU and soil moisture, temperature, and humidity sensors that automates the setting of the water amount for the whole year based on palm age, temperature, air humidity, and soil moisture. CBR calculates the amount of water supplied to the palm trees (based on an initial knowledge base drawn from previous empirical studies) and stores it in a cloud-based database. These data and the hardware status can be accessed using a mobile application. When the temperature or soil moisture sensor fails, data trends are retrieved from the database and processed using Linear Regression Analysis. The test results show that the proposed model significantly decreased water consumption compared to the traditional method.
Keywords: Date palm tree; case-based reasoning; IoT; mobile application; NodeMCU; water management system


I. INTRODUCTION
Date palm trees (Phoenix dactylifera L.), also known as Al Nakheel in Arabic, are considered one of the most famous plantations in Gulf countries, such as Oman and Saudi Arabia, and in some Arabic-speaking countries in Africa, such as Egypt [1]. The tree is also prevalent outside the Arab countries, for example in Spain, Australia, and the USA [2].
Besides oil and gas, date palm products are one of the income sources of Oman. Date palm is the primary agricultural crop in the country, and it constitutes 80% of all fruit crops produced and 50% of the total agricultural area in the sultanate. Oman is the eighth largest producer of dates in the world with an average annual production of 260,000 metric tons [3].
As part of Oman's 2040 vision and national priorities, the government would like to diversify its sources of income and not rely solely on fossil fuels [4]. To contribute to this vision, the Ministry of Agriculture and Fisheries spearheaded the "One Million Date Palm Trees Project", aiming to revitalize the agriculture sector to enable the country to achieve food security and to drive the economy [5].
However, in achieving this, the Ministry faces many challenges. According to Al Marshudi [6], as cited in Al Yahyai et al. [3], date production in Oman is still traditional, from irrigation to the methods of applying fertilizers. Due to a subtropical dry, hot desert climate with low annual rainfall and very high temperatures in summer, the insufficient quality and quantity of water is the major concern not only of date farmers but also of the rest of the agricultural sector [3]. Water management in farming is a major challenge according to the Middle East Desalination Research Centre (MEDRC) [7].
This paper attempts to develop an implementation model of a water management system for date palm trees using Case-Based Reasoning (CBR). The model involves an IoT-based module that automates the setting of the water amount for the whole year based on palm age, temperature, air humidity, and soil moisture. CBR calculates the amount of water supplied to the palm trees and then stores it in a cloud-based database. When the temperature and soil moisture sensors fail, data trends are retrieved from the database and processed using Linear Regression Analysis.

II. REVIEW OF RELATED LITERATURE
There is a wide array of past and current studies on water management in various contexts and parameters. Recent published papers fall into two categories: IoT-based systems with no artificial intelligence approach involved, and systems combining IoT and artificial intelligence (with emphasis on fuzzy logic). This paper also cites the latest studies on the application of case-based reasoning to irrigation systems.
Qomaruddin et al. [8] proposed a watering system for greenhouses involving air temperature, air humidity, and soil moisture as input parameters. For the user to access the system and control the supply of water remotely, the MQTT (Message Queue Telemetry Transport) protocol bridges the user's mobile phone and the Wemos D1 Uno microcontroller. The microcontroller processes inputs from the sensors and feeds the data to AdaFruit, a third-party IoT webpage that users access using their smartphones. Since neither an algorithm nor complete automation is involved, the user has to make decisions based on the data displayed on the website and water the plants accordingly. A similar study utilizing the AdaFruit webpage for IoT aimed to predict the health of plantations and crops and then notify farmers through emails. The Arduino, connected to WIFI using third-party hardware, collects data from soil moisture, pH, flame, and humidity sensors. These data are then processed based on a rule base using the FindS algorithm to predict the plants' health. Based on the health status, the farmer decides the right amount of water to supply to the plants [9]. Similarly, the apparent drawback of the study is the necessity for the farmer to regularly check emails and be physically present to water the plants.
Sweety et al. [10] suggested the use of a rule base in control of the watering process in gardens. Their prototype involved a PIC microcontroller-based module interfacing temperature, humidity, and soil moisture sensors that enable the users to control the motor via Bluetooth. The motor pump turns on when the temperature level is between 35 and 40 degrees centigrade, humidity is more than 35%, and soil moisture is above 100MA. There is neither clear basis nor references cited on the threshold values used, and the user must be within the vicinity to perform the control of the system since Bluetooth has a limited distance covered.
A group of researchers [11] from the Universitas Klabat in Indonesia found out that automating water supply to plants maintained the soil moisture at an average of 62%. Their prototype consists of soil moisture sensors, Wemos D1 Microcontroller WiFi enabled, Relay, and solenoid valve. Watering starts when moisture is detected to be between 30-35% and then stops afterward. The Blynk Apps, a third-party IoT platform, monitors the status of the microcontroller and feeds data to ThingSpeak, another IoT Platform for storage. Users must log in to these platforms to view data. The study suggested the development of a dedicated software application as part of future work as it relies heavily on third-party platforms. There are no clear bases to support the claim that 62% moisture is the ideal level for sustaining plants' water needs. Also, the study admitted the absence of temperature and humidity sensors as a limitation. Likewise, another project [12] employed ThingSpeak functioning as a cloud server to record all the data and link the hardware prototype with the android application to irrigate plants, flowers, and crops. Using three microcontroller modules (Arduino UNO, NRF24L01, NodeMCU) communicating with ThingSpeak through WIFI, the user can monitor the performance of the system. The said microcontrollers turn on the pump when sensors detect 20% moisture until it reaches 71% level. The study was successful in achieving and maintaining the desired soil moisture level. However, no pieces of literature support the indicated maximum moisture level as the ideal threshold value. Wahid et al., [13] and Ahmmad et al., [14] proposed the same project using NodeMCU Lua and soil moisture, humidity, temperature, and light sensors.
Ying and his colleagues [15] from the Universiti Tun Hussein Onn Malaysia developed a web-based water management system for oil palm nurseries written in PHP, enabling users to obtain reports on the right amount of water required by palm trees. The user inputs the parameter values for the rainfall and watering time, which are analyzed using Fuzzy Logic. The MySQL database stores the data for later retrieval. The project has no hardware implementation. Also, there was no indicated basis for the fuzzy rules. Similarly, a MATLAB simulation of a greenhouse irrigation system involving Mamdani Fuzzy Logic, with temperature, humidity, wind speed, and radiation influence as inputs, helped to achieve efficient energy consumption, the desired soil moisture level compared to an ON/OFF controller, and a cost-efficient prototype [16]. On the other hand, a university research collaboration in Kenya found that the Sugeno inference system is better than the Mamdani system in analyzing relative humidity, temperature, sunshine illumination, solar radiation, and wind speed to infer the desired amount of water. The MATLAB simulation determines the duration of the revolution of the motor pump, leading to more efficient water and energy consumption as compared to an ON/OFF system [17]. Similarly, Oubehar et al. [18] proposed a MATLAB-based intelligent control for greenhouses using ANFIS technology.
A recent study by Zhai et al. [19] proposed a case-based reasoning model accurately predicting the amount of water for Grape farming. The model involved various parameters such as solar radiation, temperature, humidity, wind speed, rainfall, precipitation, soil moisture, and grape growth stage. Moreover, the said model is ideal for hardware and software implementation. Another similar study by Krongtripop et al. [20] proved that CBR-based water control systems implemented in mixed gardens are better than the traditional time-based approach in terms of water conservation and the height of the trees. The prototype involved temperature and soil moisture as parameters to be fed into the Arduino Uno R3 Microcontroller connected to WIFI using third-party hardware Xbee Wireless Module. Data are collected and stored on a computer server. A desktop application written in C# controls the system and queries the database for the CBR process. The perceived drawback of the study is the need for the server to be in the same proximity as the hardware module, hence requiring physical human intervention.
There are few recent studies on water management using case-based reasoning, and mostly they are introducing the Mathematical model for adoption. Fuzzy Logic is a commonly used algorithm for predicting values based on some input parameters. However, despite its popularity, there are known drawbacks to this algorithm. A critical review of Behrooz et al. [21] enumerated the following disadvantages: (1) stability is not assured (2) designers must perform a series of trials and errors to achieve optimal output and (3) the presence of several tuning parameters. Moreover, in Fuzzy Logic, all parameters are treated with equal weights, and the process is confined to a particular use case, thus less dynamic as compared to other algorithms [22].

A. Requirement Analysis and Locale
The Ministry of Agriculture, Fisheries, and Water Resources (MoAFWR) of the Sultanate of Oman is the agency having jurisdiction over the cultivation of date palm trees. The proponents consulted an expert from the MoAFWR about the traits, water needs, and other parameters (which were not covered in this study) involved in growing the palm trees. The Ministry also confirmed that no similar project had been proposed or was currently in use in the Sultanate. The same expert also performed the testing of the proposed prototype.

B. Proposed Architecture
The proposed architecture, as shown in Fig. 1, consists of seven major components. The NodeMCU, which has an on-board WIFI module, collects information about humidity, temperature, and soil moisture levels, which are then processed by a web-based Case-Based Reasoning (CBR) system written in PHP. The Apache Web Server hosts these PHP scripts, which transact with the MySQL Database Server containing all the sensor readings and the water levels supplied based on the CBR process. The mobile application (running on the Android operating system), developed using Android Studio, enables the user to monitor the hardware status (see Fig. 2), sensor readings, and water level trends (parsed in JSON array form) from the MySQL database. Similarly, the user can monitor the status of the hardware components (e.g., the microcontroller and the sensors), which the Firebase online database stores in real time. Within the same application, users can input date palm tree details such as age and watering frequency. Fig. 3 shows the mobile application interface for user inputs.
For testing purposes, the prototype used free web-based tools and database services.

C. Major Hardware and Software Components
1) The hardware schematic diagram: The hardware prototype has six (6) major components. Fig. 4 shows all the components, numbered, with descriptions given in Table I. As much as possible, the prototype consists of generic components with the specifications listed in Table I. All hardware components are interconnected using a breadboard, except for some that require jumper wires. There are two power sources in the prototype, as indicated in Fig. 4. The USB Micro-B cable powers the microcontroller and the sensors. Since the voltage from the output pin of the microcontroller cannot directly drive the submersible water pump, the relay module routes power from the external 12-volt power supply to the pump. At the initial stage, the WIFI SSID and key are manually embedded into the code to enable the microcontroller to connect to the remote databases in real time. The NodeMCU collects information from the temperature/humidity and soil moisture sensors and sends it through an HTTP POST request to the web server for CBR processing (or Linear Regression calculation in case one of the sensors fails) and to the MySQL database for storage. Fig. 5 shows the partial code.
The web server returns CBR results to the NodeMCU, which then sends a signal to the output port that triggers the relay module to activate the submersible water pump.
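The server-side choice between CBR and the Linear Regression fallback can be illustrated with a short Python sketch. This is not the authors' PHP implementation; the names `decide_water_liters`, `REQUIRED_SENSORS`, and the two estimator callbacks are hypothetical, and a failed sensor is assumed to show up as a missing or `None` reading.

```python
from typing import Callable, Mapping, Optional

# Hypothetical reading format: sensor name -> value, or None if the sensor failed.
Readings = Mapping[str, Optional[float]]
Estimator = Callable[[Readings], float]

# Sensor names are assumptions chosen for this sketch.
REQUIRED_SENSORS = ("temperature", "humidity", "soil_moisture")


def decide_water_liters(readings: Readings, cbr: Estimator, regression: Estimator) -> float:
    """Return the water amount in liters, using regression if any sensor reading is missing."""
    sensor_failed = any(readings.get(name) is None for name in REQUIRED_SENSORS)
    estimator = regression if sensor_failed else cbr
    return estimator(readings)
```

Passing the two estimators as callbacks keeps the dispatch logic independent of how CBR or the regression is actually computed.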

D. Input / Output Parameters, Threshold Values and Database Design
The CBR calculation involves three inputs: temperature, humidity, and soil moisture. For brevity, Fig. 6 shows a partial NodeMCU code listing for reading from the sensors, in this example the temperature sensor. CBR takes the temperature values supplied by the sensor directly. However, for better presentation, the soil moisture value is normalized using (1) and converted to a percentage form.

( )
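As one possible reading of the normalization step, a min-max mapping of the raw sensor value to a percentage is sketched below in Python. The exact form of (1) may differ. The function name `soil_moisture_percent` and the calibration points are assumptions: a 10-bit ADC range of 0 to 1023 is assumed, with the highest raw value taken as completely dry, which depends on the sensor type and should be calibrated.

```python
def soil_moisture_percent(raw: int, dry_raw: int = 1023, wet_raw: int = 0) -> float:
    """Map a raw ADC reading to a 0-100 % moisture value by min-max normalization.

    dry_raw and wet_raw are assumed calibration readings for fully dry and
    fully wet soil; they are not taken from the paper.
    """
    if dry_raw == wet_raw:
        raise ValueError("dry_raw and wet_raw must differ")
    fraction = (dry_raw - raw) / (dry_raw - wet_raw)
    # Clamp so readings slightly outside the calibration range stay within 0..100.
    return max(0.0, min(1.0, fraction)) * 100.0
```

With these defaults, a raw reading of 1023 maps to 0% and a raw reading of 0 maps to 100%.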
The CBR calculation returns the water amount in liters, which must be converted to the revolution (running) time of the pump. The water pump, as indicated in Table I, supplies 240 liters of water in an hour. Thus, the revolution time is obtained as:
$$ t = \frac{W}{240} \tag{2} $$
where t is the revolution time in hours and W is the water amount in liters.
The Center for the Study of the Built Environment (CSBE) [23] in Jordan reported that young date palm trees consume 20 to 25 liters of water, and mature ones consume around 40 liters. These amounts of water enable palm trees to yield more fruit (in addition to soil fertility) and become healthy. The frequency of supplying water to date palm trees varies depending on the age of the tree. Table II shows the findings of Al Hyari [24].
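The conversion from liters to pump running time follows directly from the 240 liters per hour rating. A minimal Python sketch is given below; the function name `pump_run_seconds` is hypothetical, and the result is expressed in seconds on the assumption that this is the unit the relay timing would use.

```python
PUMP_FLOW_L_PER_HOUR = 240.0  # pump rating stated in the text (Table I)


def pump_run_seconds(liters: float, flow_l_per_hour: float = PUMP_FLOW_L_PER_HOUR) -> float:
    """Return how long the pump must run, in seconds, to deliver the given volume."""
    if liters < 0:
        raise ValueError("liters must be non-negative")
    if flow_l_per_hour <= 0:
        raise ValueError("flow rate must be positive")
    # hours = liters / flow; multiply by 3600 to convert hours to seconds.
    return liters * 3600.0 / flow_l_per_hour
```

For example, delivering 40 liters at 240 liters per hour takes one sixth of an hour, or 600 seconds.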
The baseline, or initial knowledge base, for CBR, used in the Retrieval Process (Global and Local Similarity calculations), was taken from the findings of Bhat et al. [25] based on the Penman-Monteith equation. Daily prescribed water consumption and calculated daily evapotranspiration are combined to obtain an adequate amount of water for the palm trees and to avoid loss due to evaporation. Table III shows the newly calculated data with other relevant parameters, with soil moisture set to 0%.

E. Application of Case-Based Reasoning
Case-Based Reasoning (CBR) is a machine learning algorithm [26] that manifests human expertise and can work effectively with criteria-based comparison [27] of the new cases and the existing solved problems stored in the knowledge base [28]. CBR, as shown in Fig. 7, has four major phases: Retrieve, Reuse, Revise, and Retain. These phases are done in an iterative fashion adding new solved problems to the knowledge base, thus making it more intelligent [29].
The retrieval phase is the center of CBR. Existing cases retrieved from the knowledge base are compared to the new problem using Local Similarity (LS) and Global Similarity (GS) calculations. LS is used to break down the problem's attributes and compare them to existing ones in the database. These attributes can be Discrete (3) or Continuous (4) in nature and take different calculation approaches [30].

Discrete Values
$$ LS(a, b) = \begin{cases} 1, & a = b \\ 0, & a \neq b \end{cases} \tag{3} $$
where a is the new feature and b is the previous feature.
Continuous Values
$$ LS(a, b) = 1 - \frac{|a - b|}{range} \tag{4} $$
where a is the new feature, b is the previous feature, and range is the difference between the upper and lower boundaries of the set.
Global Similarity (GS) (5), on the other hand, aggregates all the local similarities and is used to make a generalized comparison of each problem. This paper expresses global similarities in percentages. In (5), A is the new case, B is the previous case, and a is the new feature from the local similarity.
Once CBR finds an ideal solution in the knowledge base, it proceeds to the Reuse stage to adopt it, or to the Revise stage, wherein users formulate the necessary solution to the new problem. This new problem and its solution are added to the knowledge base during the Retain phase. Fig. 8 shows the CBR implementation in PHP.
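To make the retrieval step concrete, the following Python sketch computes local similarities for continuous features and combines them into a global similarity expressed as a percentage. It is an illustration under stated assumptions, not the PHP code of Fig. 8: the global similarity is assumed here to be a weighted mean of the local similarities (equal weights by default), and the feature ranges in `FEATURE_RANGES`, the dictionary-based case format, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Case = Dict[str, float]

# Assumed value ranges (upper minus lower boundary) per feature; illustrative only.
FEATURE_RANGES: Dict[str, float] = {
    "temperature": 50.0,
    "humidity": 100.0,
    "soil_moisture": 100.0,
}


def local_similarity(a: float, b: float, value_range: float) -> float:
    """Continuous local similarity: 1 for identical values, 0 at a full range apart."""
    return max(0.0, 1.0 - abs(a - b) / value_range)


def global_similarity(new: Case, old: Case, weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the local similarities, returned as a percentage."""
    weights = weights or {name: 1.0 for name in FEATURE_RANGES}
    total_weight = sum(weights[name] for name in FEATURE_RANGES)
    score = sum(
        weights[name] * local_similarity(new[name], old[name], FEATURE_RANGES[name])
        for name in FEATURE_RANGES
    )
    return 100.0 * score / total_weight


def retrieve(new: Case, knowledge_base: List[Case]) -> Tuple[Case, float]:
    """Return the stored case most similar to the new case, with its similarity."""
    best = max(knowledge_base, key=lambda case: global_similarity(new, case))
    return best, global_similarity(new, best)
```

In the Reuse phase, the `water_liters` value of the retrieved case (a hypothetical field name) would be adopted for the new case; in the Retain phase, the new case and its solution would be appended to `knowledge_base`.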

F. Linear Regression Analysis
The study used Linear Regression Analysis as a backup mechanism for situations wherein one of the sensors fails. When one of the input parameters is set to a null value, CBR exhibits faulty behavior that affects the integrity of the knowledge base. The results of the correlation analysis show that temperature has a strong positive relationship (r=0.799) with the water amount. Conversely, humidity and soil moisture have strong negative (r=-0.878) and very strong negative (r=-0.951) correlations with the water amount, respectively. With the recorded sensor readings and the amount of water supplied for the current month, Linear Regression Analysis predicts the right amount of water, using temperature as the predictor.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 2, 2022, www.ijacsa.thesai.org
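The regression fallback can be sketched as an ordinary least-squares fit of water amount against temperature over the records stored for the current month. The Python code below is a minimal illustration, assuming a simple one-predictor model; the function names `fit_line` and `predict_water` are hypothetical, and no data from the study are reproduced.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all predictor values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_water(temperature: float, slope: float, intercept: float) -> float:
    """Predict the water amount in liters; negative predictions are clipped to zero."""
    return max(0.0, slope * temperature + intercept)
```

In use, `xs` would hold the recorded temperatures and `ys` the corresponding water amounts retrieved from the database, and the fitted line would be evaluated at the current temperature reading.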

G. Database Design
There are three (3) database tables used in this study: (1) hardware status information, (2) palm tree information, and (3) the CBR knowledge base. Fig. 9 shows the Firebase table structure that stores the real-time status of the hardware components such as the soil moisture, temperature, and humidity sensors. The data is displayed in the mobile application for monitoring purposes. Fig. 11 shows the knowledge-base table where the actual readings of the soil moisture, temperature, and humidity sensors are recorded. The CBR algorithm depends on these data.

A. Test Case
Table IV contains the test case dataset. In this test case, the FreeCBR tool expedites the process of the CBR calculation and presentation of results. For clarity, the mature palm tree group was used in the test case.
For demonstration purposes, Test Case 1 is processed with FreeCBR. The CBR calculation in Fig. 12 shows that Test Case 1 has a 42.25% similarity with Case No. 11 stored in the knowledge base. Therefore, the water amount for Test Case 1 is calculated using (6).
The succeeding figures show the CBR results (Fig. 13) and data trends (Fig. 14) in the mobile application.

B. Test Results
The conventional method of cultivating date palm trees is the same as in the other Gulf and Arab countries. Farmers supply an average of 40 liters of water a day regardless of the soil moisture condition, as suggested by CSBE [23]. The joint research of the Environment Agency Abu Dhabi (EAD) and New Zealand [31] revealed that two and a half times the intended water amount is supplied to date palm trees using the traditional way, which is equivalent to 100 liters of water. With the assumption of 40 liters of water usage in a single watering instance, Table VI shows the amount of water saved using the proposed model.
Using the CBR approach, water consumption significantly (t=-9.621, p<0.05) dropped down to 52% as compared to the traditional method.
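The savings arithmetic behind such a comparison can be illustrated with a short Python sketch. The function name `water_saved` is hypothetical and no values from Table VI are reproduced; the 40-liter baseline is the per-watering assumption stated above.

```python
from typing import Tuple

BASELINE_LITERS = 40.0  # assumed water usage per watering instance, as stated in the text


def water_saved(supplied_liters: float, baseline_liters: float = BASELINE_LITERS) -> Tuple[float, float]:
    """Return (liters saved, percent saved) relative to the baseline amount."""
    if baseline_liters <= 0:
        raise ValueError("baseline must be positive")
    saved = baseline_liters - supplied_liters
    return saved, saved / baseline_liters * 100.0
```

For instance, if the model supplied a hypothetical 30 liters in one watering instance, the saving against the 40-liter baseline would be 10 liters, or 25%.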

V. CONCLUSION
Various literature indicates that date palm trees have cultural and economic importance to Gulf and Arabic-speaking countries. However, under the traditional method of cultivation, depletion of water is perceived as the major challenge.
Various studies have proposed prototypes and implementation models, but none of them provided semi-full automation with less human intervention. Most of them provide software simulation and theoretical calculations without actual implementation. Although Fuzzy Logic is commonly used for automation similar to this study, it poses many challenges in implementation. Also, there is a lack of basis to support the threshold values for the rule-based algorithms used by previous studies to calculate the amount of water.
The proposed model overcomes the limitations found in previous studies by citing empirical studies that identify the various factors affecting effective date palm tree cultivation and using them as the initial knowledge base. The input parameters are processed using Case-Based Reasoning, and using Linear Regression for situations wherein one of the sensors fails to function. This backup feature is not present in the existing studies. The test results show a significant difference in water consumption, thereby helping to address the improper use of water resources.
The results of the study are confined to three parameters: soil moisture, temperature, and humidity. Other parameters such as sunlight intensity, rainfall, wind speed, etc., can be included. Also, the case-based reasoning table should be modified to accommodate other crops, hence making the system more dynamic.