A Novel Method for Rainfall Prediction and Classification using Neural Networks

Reliable and accurate rainfall forecasting is an important and difficult task, because it underpins food production, the maintenance of water sources for major population centres and the reduction of flood risk. Accurate forecasts of rainfall on monthly and seasonal time scales give beneficiaries the knowledge they need for water supply management, farm planning and integrated crop insurance applications. Rainfall prediction remains challenging for researchers, and most existing prediction techniques fall short in accuracy. We therefore propose a new hybrid approach for forecasting and classifying rainfall that combines neural networks with ant colony optimization (ACO). The collected rainfall data were preprocessed by filling in missing values and applying min-max normalization, and the processed data were then given to various classifiers to evaluate their performance. The existing feed-forward, cascade-forward and pattern recognition NN classifiers are compared with the proposed ACO+feed-forward backpropagation, ACO+cascade-forward backpropagation and ACO+pattern recognition NN classifiers. The complete hybrid neural network (HNN) forecasting procedure consists of pre-processing, choosing the input vector, optimising the number of hidden nodes using ACO, and ANN modelling.
Keywords: Pattern recognition; ant colony optimization; artificial neural network; rainfall prediction; feed-forward; cascade-forward; data processing


I. INTRODUCTION
Precipitation, which depends heavily on space and time, is one of the most dynamic atmospheric processes. Understanding the rainfall phenomenon remains a challenging problem, and predicting rainfall efficiently and correctly is vital for food production, for preserving water supplies for major population centers and for minimizing the risk of floods [1]. Reliable rainfall predictions on monthly and seasonal time scales give beneficiaries information for water supply management, farm planning and integrated crop insurance applications [2,3]. Different methods have recently been suggested for rainfall prediction. These approaches fall into two groups: dynamic and empirical. General circulation models (GCMs) are based on the laws of physics and are a form of dynamic model used for climate forecasting [4].
GCMs, however, are highly complicated, so empirical models are often used to forecast monthly and seasonal precipitation for agricultural applications [5]. Empirical models are usually based on relationships between the predictand and various predictor variables. The idea is to work out the spatial and temporal characteristics of past records of rainfall and of the predictor variables, and to use them to model expected rainfall. Traditional numerical approaches to rainfall prediction estimate the unknown parameters by fitting an appropriate function through regression or some other optimization method [6]. The empirical techniques include statistical models and artificial neural networks (ANNs) [7,8]. Although ANNs are well suited to simulating physical systems, their accuracy does not improve when the time series is strongly non-stationary or when the hydrological process operates over a broad range of time scales [9]. To obtain better performance from ANN models, the inputs and outputs are pre-processed by normalization and feature selection.
In this paper we propose a new hybrid approach for forecasting and classifying rainfall that combines neural networks with the ACO method. The collected rainfall data were preprocessed by filling in missing values and applying min-max normalization, and the processed data were then given to various classifiers to evaluate their performance. The existing feed-forward, cascade-forward and pattern recognition NN classifiers are compared with the proposed ACO+feed-forward backpropagation, ACO+cascade-forward backpropagation and ACO+pattern recognition NN classifiers. The complete HNN forecasting procedure consists of pre-processing, choosing the input vector, optimizing the number of hidden nodes using ACO, and ANN modelling.

II. LITERATURE SURVEY
Chatterjee et al. [10] proposed a two-step prediction model that combines the k-means algorithm with a hybrid neural network. The proposed model is compared with the MLP-FFN classifier, using data from the Dumdum meteorological station. It outperforms the baseline, achieving 84.26 percent accuracy without feature selection and 89.54 percent accuracy with feature selection. Graf et al. [11] suggest a hybrid of discrete wavelet transforms and an artificial neural network (WT-ANN) for water temperature forecasting, applied to the Warta River in Poland. The results showed that WT-ANN models can be used to forecast river water temperature. M. I. Khan and R. Maity [12] suggest a hybrid deep learning approach that combines a one-dimensional Convolutional Neural Network (Conv1D) and a Multi-Layer Perceptron (MLP) for daily rainfall prediction. It is compared with a deep Multi-Layer Perceptron (deep MLP) and another machine learning method, Support Vector Regression (SVR), and the findings indicate that the hybrid Conv1D-MLP model is more efficient. The bootstrap aggregated classification tree-artificial neural network (BACT-ANN) model is a hybrid artificial intelligence model for rainfall prediction [13]. It is applied to the Langat River Basin of Malaysia and compared with a non-homogeneous hidden Markov model (NHMM). The comparison suggests that the BACT-ANN model performs better. Anh et al. [14] propose a monthly rainfall forecasting approach that combines two pre-processing techniques (Seasonal Decomposition and Discrete Wavelet Transformation) with two feed-forward neural networks (Artificial Neural Network and Seasonal Artificial Neural Network, SANN); the pre-processed output is supplied to the two feed-forward networks.
The results suggest that the wavelet transformation combined with SANN gives accurate forecasts of monthly rainfall. Hsieh et al. [15] propose a hybrid of an artificial neural network (ANN) and multiple regression analysis (MRA) for predicting typhoon precipitation and groundwater level change, applied in the Zhuoshui River Delta. An overall accuracy of 80% is obtained, and the technique is effective in predicting typhoons before they occur. Dhar et al. [16] exploit the advantages of a deep neural network (DNN) for weather forecasting by supplying the DNN with attributes such as temperature, relative humidity, vapour and friction. The predictions are accurate with respect to the input data given. Roshni et al. [17] developed a Feedforward Artificial Neural Network (FFANN) and a hybrid WANN model to model spatio-temporal fluctuations in groundwater levels. The efficiency of these hybrid models is assessed using goodness-of-fit measures, and the experiment examined the performance of the GT-WANN hybrid model. In the work of Chen et al. [18], the findings are evaluated and compared using the Root Mean Square Error (RMSE), the Nash-Sutcliffe efficiency (NSE) coefficient and the Pearson correlation coefficient (R). A Hybrid Particle Swarm Optimization (HPSO) and Genetic Algorithm (GA) method was proposed to overcome limited optimization performance and local convergence [19]. It uses three strategies, an elitist strategy, a PSO strategy and a GA strategy, which are applied to the design of an RBF-NN. Compared with a pure GA, the method has a greater ability to explore globally and avoids premature convergence. X. He et al. [20] proposed a Hybrid Wavelet Neural Network (HWNN) model.
In this HWNN, artificial neural network (ANN) models are combined with Multi-Resolution Analysis (MRA), Mutual Information (MI) and Particle Swarm Optimization (PSO). The HWNN experiment was conducted with 255 rain gauge stations, and positive results were obtained at inland stations in south-east and west Australia. C. C. Young et al. [21] used a hybrid of physically based and machine learning approaches for severe typhoon events. The method was calibrated and validated for seven storms with a total data set of 1200. Its results were compared with those of the hydrological model, an artificial neural network and a support vector machine for predicting hourly runoff discharges in the Chishan Creek basin of southern Taiwan. S. Asadi et al. [22] proposed a hybrid artificial neural network that combines data pre-processing methods, genetic algorithms and the Levenberg-Marquardt (LM) algorithm, and compared its results with those of a plain Artificial Neural Network. Another hybrid pairs a linear SARIMAX model with an ANN that captures the non-linear relationship in the residuals of the fitted SARIMAX model. Feed-forward backpropagation (FFBP), ANFIS and time series methods have been applied to the calculation of rainfall-runoff discharges in Azerbaijan. Partal et al. [25] combined radial basis functions with wavelet transforms; this Wavelet Neural Network has been used to assess daily precipitation predictions and helps to model the behaviour of the physical process.
A. Background Methodology
1) Data transformation: Data transformation is the stage at which the data are converted into a form suitable for data mining. As the first step, the data are converted from hard copy to soft copy. Next, the separate tables for the 24 stations are combined into one dataset. Rainfall is expressed as 0 or 1: zero is used if the rainfall amount is less than or equal to 0.1 mm, and one is used if it is greater than 0.1 mm. Finally, the data are saved in CSV (comma-separated values) file format.
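The transformation step can be illustrated with a minimal Python sketch. Only the 0.1 mm threshold, the merging of per-station tables and the CSV output come from the text; the station names, the column names and the sample values are hypothetical and chosen purely for illustration.

```python
import csv
import io

RAIN_THRESHOLD_MM = 0.1  # threshold stated in the text


def rain_label(amount_mm):
    """Return 1 if rainfall is greater than 0.1 mm, otherwise 0."""
    return 1 if amount_mm > RAIN_THRESHOLD_MM else 0


def combine_stations(station_tables):
    """Merge per-station tables into one list of labelled rows.

    station_tables maps a (hypothetical) station name to a list of
    (period, rainfall_mm) tuples.
    """
    rows = []
    for station, records in sorted(station_tables.items()):
        for period, amount_mm in records:
            rows.append({
                "station": station,
                "period": period,
                "rainfall_mm": amount_mm,
                "rain": rain_label(amount_mm),
            })
    return rows


def to_csv(rows):
    """Serialise the combined rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["station", "period", "rainfall_mm", "rain"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample values for two hypothetical stations.
tables = {
    "station_A": [("1901-01", 0.0), ("1901-02", 12.4)],
    "station_B": [("1901-01", 0.1), ("1901-02", 0.3)],
}
dataset = combine_stations(tables)
csv_text = to_csv(dataset)
```

Note that a reading of exactly 0.1 mm is labelled 0, matching the "less than or equal to" rule in the text.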
2) Data pre-processing: The obtained data contain noise, missing values and some unwanted data. The task here is to clean the data by removing unnecessary information and filling in the missing values. At this stage, a stable format for the data model was constructed by filling in missing data, identifying duplicate data and removing bad data. The process uses the PCA algorithm to replace all the missing values in the dataset.
3) Feature selection: Feature selection is typically an important data-processing step before the learning algorithm is applied. It is the task in which redundant and irrelevant attributes are identified and removed where possible. Reducing the dimensionality of the data makes the model easier to interpret, makes it easier to use visualization tools, and lets learning algorithms operate faster and more effectively, which in turn improves the accuracy of the subsequent classification. Feature selection determines the minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution. ACO-based feature selection is used here.

4) Data normalization:
Data normalization is one of the data pre-processing steps. It is used because normalization is generally held to improve the accuracy and efficiency of mining algorithms that rely on distance measurements. The attributes must be brought to a common scale, so the data are rescaled to lie between 0 and 1. The dataset was normalized using Eq. (1):

$$x' = \frac{x - X_{min}}{X_{max} - X_{min}} \qquad (1)$$

Here, x represents the actual data value, x' the normalized value, Xmin the minimum value of the original attribute, and Xmax the maximum value of the original attribute.
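As a small illustration of min-max normalization, the following Python sketch rescales one attribute to the range 0 to 1 using the attribute's own minimum and maximum. The function name, the handling of a constant attribute and the sample values are assumptions made for the example, not details given in the paper.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using the attribute's minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumed convention: a constant attribute carries no information,
        # so it is mapped to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    span = x_max - x_min
    return [(x - x_min) / span for x in values]


# Made-up monthly rainfall values in millimetres.
monthly_rainfall_mm = [10.0, 55.0, 100.0, 32.5]
normalized = min_max_normalize(monthly_rainfall_mm)
# normalized == [0.0, 0.5, 1.0, 0.25]
```

The minimum and maximum should be taken from the training data and reused for the test data, so that both are mapped with the same scale.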

A. Artificial Neural Network
Artificial Neural Network (ANN) models are used to solve a wide range of real-world problems. The key benefit of ANNs over conventional approaches is that they do not require the complex nature of the underlying process to be described explicitly in mathematical form [8].

1) Feed forward backpropagation model:
The most popular ANN model in hydrologic modelling is the multilayer feed-forward neural network (FFNN) trained with the backpropagation (BP) algorithm (ASCE Task Committee, 2000b; [22]). A multilayer FFNN consists of an input layer, one or two hidden layers and an output layer. Each layer consists of artificial neurons that are connected to all the neurons in the next layer, and the strength of each connection is represented by a weight. The input signals propagate layer by layer in the forward direction. Each neuron multiplies its inputs by the corresponding connection weights, sums the products and passes the sum through a nonlinear transfer function to produce its output. This study uses the Levenberg-Marquardt (LM) learning algorithm to determine the connection weights. The learning algorithm trains the multilayer FFNN with a gradient-based technique: starting from randomly selected weights, it adjusts them to reduce the error between the actual output values and the target values. The inputs and outputs are used to estimate the weights internally. The training loop stops when the error has been sufficiently reduced or another stopping condition is met, and the final set of weights is taken as the solution to the training problem.
The training procedure can be described as follows. When the input vector (x1, x2, ..., xNI) is supplied to the input layer, each hidden-layer neuron produces its output by applying the hidden-layer activation function to the weighted sum of its inputs plus a bias, where hj denotes the output of neuron j, wji denotes the weight given to input i by neuron j, and bj is the bias of neuron j. The network output is obtained by applying the output-layer activation function to the weighted sum of the hidden-layer outputs, where NJ denotes the number of neurons in the hidden layer. The computation is repeated for all inputs and produces an output vector yk. Training changes the weights iteratively to lower the error between the predicted and actual output of the network. The network output is compared with the target output using the error function. The error is then propagated backwards from the output layer to the input layer to update the weight of each connection, where m denotes the iteration index, εk denotes the error term of the output layer, η denotes the learning rate, and δ is a momentum factor that determines how much past weight changes influence the current direction of movement.
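The equations referred to in this passage are not present in this copy of the text. A plausible reconstruction, assuming the standard backpropagation-with-momentum form, is given below. The symbols h_j, w_ji, b_j, N_I, N_J, y_k, m, ε_k, η and δ follow the definitions in the text; the names f and g for the hidden and output activation functions, the output-layer weights w_kj and bias b_k, the target t_k, the net input net_k and the squared-error function E are assumptions introduced here.

```latex
\begin{align}
h_j &= f\!\left(\sum_{i=1}^{N_I} w_{ji}\, x_i + b_j\right)
  && \text{hidden-layer output} \\
y_k &= g\!\left(\sum_{j=1}^{N_J} w_{kj}\, h_j + b_k\right)
  && \text{network output} \\
E &= \frac{1}{2}\sum_{k}\left(t_k - y_k\right)^2
  && \text{error function (assumed squared error)} \\
\varepsilon_k &= \left(t_k - y_k\right)\, g'(\mathrm{net}_k),
  \qquad \mathrm{net}_k = \sum_{j=1}^{N_J} w_{kj}\, h_j + b_k \\
\Delta w_{kj}(m) &= \eta\, \varepsilon_k\, h_j + \delta\, \Delta w_{kj}(m-1) \\
w_{kj}(m+1) &= w_{kj}(m) + \Delta w_{kj}(m)
\end{align}
```

With this sign convention, the term η ε_k h_j equals the negative gradient of E with respect to w_kj scaled by the learning rate, so each update moves the weight downhill on the error surface, while the δ term carries over a fraction of the previous change.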
2) Cascade forward model of back propagation: Cascade forward (CF) models resemble feed-forward networks, but they include weighted connections from the input and from every previous layer to each successive layer. With these additional connections, the ANN can more readily learn the relationship between input and output [26][27][28][29][30][31]. Like the FFBPNN, the CFBPNN model uses the backpropagation algorithm to update the weights; its key characteristic is that each layer of neurons is connected to all previous layers. The tan-sigmoid transfer function is used as the activation function. Cascade networks have shown improved performance in predicting transformer oil parameters. In this study, the CFBPNN model is used to estimate monthly rainfall.
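The structural difference from a plain feed-forward network can be seen in a minimal forward-pass sketch. It assumes one hidden layer with a tanh (tan-sigmoid) activation and a linear output layer; the function names, the layer sizes and all weight values are hypothetical and serve only to show that the output layer receives the raw inputs in addition to the hidden activations.

```python
import math


def dense(weights, bias, inputs):
    """Weighted sum plus bias for each neuron (one weight row per neuron)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def cascade_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer cascade-forward network.

    The output layer is fed the hidden activations followed by the raw
    inputs, so each row of w_out has len(hidden) + len(x) weights.
    """
    hidden = [math.tanh(z) for z in dense(w_hidden, b_hidden, x)]
    return dense(w_out, b_out, hidden + x)  # linear output (assumed)


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
x = [0.5, -1.0]
w_hidden = [[1.0, 0.0], [0.0, 1.0]]
b_hidden = [0.0, 0.0]
w_out = [[1.0, 1.0, 2.0, 0.0]]  # last two weights are the direct input links
b_out = [0.0]

y = cascade_forward(x, w_hidden, b_hidden, w_out, b_out)

# Setting the direct input-to-output weights to zero recovers an ordinary
# feed-forward network with the same hidden layer.
y_plain = cascade_forward(x, w_hidden, b_hidden, [[1.0, 1.0, 0.0, 0.0]], b_out)
```

Here the cascade output differs from the plain feed-forward output only by the contribution of the direct input link, 2.0 * 0.5.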

3) Pattern-recognition Neural network (PatternNet):
The Pattern-recognition Neural Network (PatternNet) is a special kind of FFBPNN used for pattern recognition and classification problems, in which the input data sets are assigned to different classes. It is essentially a supervised FFBPNN learning method whose target data are binary vectors, with 1 in the element representing the class to which a sample belongs and 0 elsewhere.
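The binary target encoding can be sketched in a few lines of Python. The class names and helper functions below are hypothetical; the sketch only shows how class labels become 0/1 target vectors and how a network output vector is mapped back to a class by taking the largest element.

```python
def one_hot_targets(labels, classes):
    """Encode each label as a vector with 1 for its class and 0 elsewhere."""
    index = {name: i for i, name in enumerate(classes)}
    return [[1 if index[label] == i else 0 for i in range(len(classes))]
            for label in labels]


def predicted_class(outputs, classes):
    """Map a network output vector to the class with the largest score."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return classes[best]


classes = ["no_rain", "rain"]  # hypothetical class names
targets = one_hot_targets(["rain", "no_rain", "rain"], classes)
# targets == [[0, 1], [1, 0], [0, 1]]

label = predicted_class([0.2, 0.8], classes)
# label == "rain"
```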

4) Modified ACO-CS based optimized KNN classifier:
Many classification algorithms were considered and examined for rainfall prediction. It was observed that neural networks give better results than SVM, KNN and tree classifiers. The major problem with the KNN algorithm is the choice of the hyperparameter k. If k is too small, the algorithm is sensitive to outliers; if k is too large, the neighborhood may include too many points from other classes. Selecting the value of k is therefore the demanding task in using KNN. Here the optimum k is selected by ACO, and the cuckoo search algorithm is additionally used to abandon the worst k during the process, which reduces the time required. Candidate numbers of neighbors are taken approximately evenly spaced on a logarithmic scale. The classification error is evaluated for the candidates in order to estimate the best value of k, and the classifier is then designed with the estimated optimal number of nearest neighbors, which increases its performance. The training data comprise Nv-dimensional patterns.

ALGORITHM
Step 1: Select candidate numbers of neighbors approximately evenly spaced on a logarithmic scale.
Step 2: Select k randomly using the cuckoo search algorithm.
Step 3: For the selected ant (k), find the classification error.
Step 4: Update the pheromone.
Step 5: Update the global best.
Step 6: Design the KNN classifier with the global best k.
Step 7: Train the model.
Step 8: Test the model.
Step 9: Validate the existing model.
The number of nearest neighbors is first approximated with the help of an evenly spaced logarithmic scale. Next, the classification error is calculated by ACO together with the CS algorithm. In this way the optimal number of nearest neighbors, the one that gives the minimum error, is selected. The KNN classifier is then designed with this optimum number of nearest neighbors, and finally the testing and evaluation phase is carried out.
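The steps above can be sketched loosely in Python. This is an interpretation under stated assumptions, not the authors' implementation: the pheromone starts at a positive value so that sampling is well defined, the deposit is taken as one minus the classification error, and the cuckoo-search element is reduced to abandoning the worst evaluated k in each iteration. All function names, parameter defaults and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def log_spaced_ks(k_max, count):
    """Candidate neighbour counts, roughly evenly spaced on a log scale."""
    ks = {max(1, round(k_max ** (i / (count - 1)))) for i in range(count)}
    return sorted(ks)


def knn_error(k, train, validation):
    """Fraction of validation points misclassified by a k-NN majority vote."""
    wrong = 0
    for point, label in validation:
        nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
        vote = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        wrong += vote != label
    return wrong / len(validation)


def select_k(candidates, train, validation, n_ants=5, n_iter=10,
             evaporation=0.5, seed=0):
    """Pheromone-guided search for k with cuckoo-style abandonment."""
    rng = random.Random(seed)
    active = list(candidates)
    pheromone = {k: 1.0 for k in active}  # assumed positive start
    errors = {}
    best_k, best_err = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            weights = [pheromone[c] for c in active]
            k = rng.choices(active, weights=weights)[0]
            if k not in errors:
                errors[k] = knn_error(k, train, validation)
            pheromone[k] += 1.0 - errors[k]  # deposit (assumed form)
            if errors[k] < best_err:
                best_k, best_err = k, errors[k]  # update the global best
        for c in active:
            pheromone[c] *= 1.0 - evaporation
        # Cuckoo-style step: abandon the worst k evaluated so far.
        evaluated = [c for c in active if c in errors]
        if len(active) > 1 and evaluated:
            worst = max(evaluated, key=errors.get)
            if worst != best_k:
                active.remove(worst)
    return best_k, best_err


# Toy two-class data: 10 points of class 0 and 4 points of class 1.
train = ([((0.1 * i, 0.0), 0) for i in range(10)]
         + [((5.0 + 0.1 * i, 5.0), 1) for i in range(4)])
validation = [((0.2, 0.1), 0), ((0.6, -0.1), 0),
              ((5.1, 5.1), 1), ((5.2, 4.9), 1)]

candidates = log_spaced_ks(14, 5)
best_k, best_err = select_k(candidates, train, validation)
```

In this toy setting the largest candidate, k = 14, uses every training point, so the majority class always wins and the minority-class validation points are misclassified; the smaller candidates classify all four validation points correctly.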

IV. PROPOSED METHODOLOGY
The proposed design methodology optimizes the convergence behavior of the neural network by means of a swarm intelligence method. The main objective is to optimize the neural network parameters with the ACO method so as to improve convergence. The performance of the system can be improved by combining machine learning approaches with the neural network.

A. ACO Algorithm
ACO imitates the social behavior of ants. Real ants have almost no vision, yet they can determine the shortest route between a food source and their nest by means of chemical substances known as pheromones, which they deposit along the paths they travel. Initially, the pheromone is set to 0. Because a shorter route is traversed by more ants over time, it accumulates more pheromone and attracts still more ants, and this continues until another, shorter path is identified. The pheromone update is used for time optimization. The number of ants is taken to be the number of classes in the training set.

1) Ant colony optimization metaheuristic approach:
The ACO metaheuristic, formalized by Dorigo et al., applies the ant colony algorithm to combinatorial optimization problems. Artificial ants build solutions by traversing a fully connected graph G(V, E), where V denotes the set of vertices and E denotes the set of edges. A solution is assembled from a set of components, which may be represented either by the vertices or by the edges of the graph. Each ant constructs a solution incrementally by moving from one vertex to another along the edges. The ACO metaheuristic is presented in Algorithm 1.

Algorithm 1: The Ant Colony Optimization Meta heuristics
Step 1: Set parameters and initialize pheromone trails.
Step 2: While the termination condition is not met, do
Step 3: Construct ant solutions.
Step 4: Apply local search.
Step 5: Update pheromones.
Step 6: End while.
The above algorithm is combined with the machine learning approach.
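Algorithm 1 can be illustrated on the classic nest-to-food shortest path problem. The Python sketch below is a minimal instance under stated assumptions: the graph, its edge lengths, the parameter values and the transition rule (pheromone raised to alpha times inverse edge length raised to beta) are illustrative choices, the pheromone starts at a positive value so that the selection probabilities are well defined, and the optional local search of Step 4 is omitted.

```python
import random

# Hypothetical undirected graph: edge lengths between vertices.
GRAPH = {
    ("nest", "a"): 1.0, ("a", "food"): 1.0,
    ("nest", "b"): 2.0, ("b", "food"): 3.0,
    ("a", "b"): 1.0,
}


def neighbours(graph, vertex):
    """Map each vertex adjacent to `vertex` to the connecting edge length."""
    out = {}
    for (u, v), length in graph.items():
        if u == vertex:
            out[v] = length
        elif v == vertex:
            out[u] = length
    return out


def path_length(graph, path):
    """Total length of a path given as a list of vertices."""
    lengths = {frozenset(edge): d for edge, d in graph.items()}
    return sum(lengths[frozenset(pair)] for pair in zip(path, path[1:]))


def construct_path(graph, pheromone, start, goal, rng, alpha=1.0, beta=2.0):
    """One ant builds a path vertex by vertex (Step 3)."""
    path, current = [start], start
    while current != goal:
        options = {v: d for v, d in neighbours(graph, current).items()
                   if v not in path}
        if not options:
            return None  # dead end: the ant gives up
        vertices = list(options)
        weights = [pheromone[frozenset((current, v))] ** alpha
                   * (1.0 / options[v]) ** beta for v in vertices]
        current = rng.choices(vertices, weights=weights)[0]
        path.append(current)
    return path


def aco_shortest_path(graph, start, goal, n_ants=8, n_iter=20,
                      evaporation=0.3, seed=1):
    rng = random.Random(seed)
    pheromone = {frozenset(edge): 1.0 for edge in graph}  # Step 1
    best_path, best_len = None, float("inf")
    for _ in range(n_iter):                               # Step 2
        paths = [construct_path(graph, pheromone, start, goal, rng)
                 for _ in range(n_ants)]
        paths = [p for p in paths if p is not None]
        for edge in pheromone:                            # Step 5: evaporate
            pheromone[edge] *= 1.0 - evaporation
        for p in paths:                                   # Step 5: deposit
            length = path_length(graph, p)
            for pair in zip(p, p[1:]):
                pheromone[frozenset(pair)] += 1.0 / length
            if length < best_len:
                best_path, best_len = p, length
    return best_path, best_len


best_path, best_len = aco_shortest_path(GRAPH, "nest", "food")
```

Shorter paths receive a larger deposit per edge, so their edges are chosen more often in later iterations, which is the positive feedback described for real ants.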

B. Hybrid Neural Networks
Choosing the number of hidden nodes is a challenging task in NN design. Here ant colony optimization is used to select the optimal number of hidden nodes. Each ant is initialized with a different number of hidden nodes, and the pheromones are initialized to zero. For each ant, the fitness (RMSE) is calculated, the pheromone is updated and the heuristic information is obtained. The global best is also updated in every iteration. In this way the optimal number of hidden nodes is chosen for the proposed model. The feature extraction stage is the same as in the existing method and yields the optimized parameters that determine the rainfall.
PROPOSED ALGORITHM
Step 1: Data collection
Step 2: Processing of data
Step 3: Select the optimal number of hidden nodes using ACO
Step 4: Develop the proposed model using the optimal number of nodes
Step 5: Train the model
Step 6: Test the model
Step 7: Validate and compare the proposed model with the existing model
The algorithm for the proposed methodology is given above. The first step is the collection of rainfall data: monthly rainfall for Andhra Pradesh is collected from the India Meteorological Department. The data are then processed, and missing values in the dataset are detected and imputed using PCA, along with several other pre-processing steps. Ant colony optimization is used to select the optimal number of hidden nodes, with each candidate number of hidden nodes treated as an ant. For every ant, an NN model is trained with the initial parameters and then validated, and the fitness function (RMSE) is evaluated. The pheromone for each ant is updated. Finally, the global best is updated by comparing the ants' fitness values, and the optimal number of hidden nodes is chosen. The next step is to develop the NN model with the optimal number of hidden nodes. Three types of NN model are implemented: the feed-forward backpropagation model, the cascade-forward backpropagation model and the Pattern-recognition Neural Network (PatternNet). Once the proposed model has been developed, it is trained using the training dataset and tested using the testing dataset.
Finally, the proposed model is validated and compared with the existing model. Fig. 1 shows the flow chart of the ACO algorithm used to determine the number of hidden nodes in the ANN.
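The hidden-node search can be sketched as follows. The sketch assumes that each ant starts on a different candidate node count, that later iterations re-sample candidates in proportion to pheromone, and that the deposit is 1 / (1 + RMSE); the pheromone starts at a positive value (rather than zero) so that sampling is well defined. Training a real network is replaced by a hypothetical stand-in fitness function, so the numbers carry no meaning beyond illustration.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def aco_select_hidden_nodes(candidates, fitness, n_iter=10,
                            evaporation=0.2, seed=0):
    """Return the hidden-node count with the lowest fitness (RMSE) found."""
    rng = random.Random(seed)
    candidates = list(candidates)
    pheromone = {n: 1.0 for n in candidates}  # assumed positive start
    best_n, best_fit = None, float("inf")
    for iteration in range(n_iter):
        if iteration == 0:
            ants = list(candidates)  # each ant gets a different node count
        else:
            weights = [pheromone[n] for n in candidates]
            ants = rng.choices(candidates, weights=weights,
                               k=len(candidates))
        for n in candidates:
            pheromone[n] *= 1.0 - evaporation      # evaporation
        for n in ants:
            error = fitness(n)                     # train + validate a model
            pheromone[n] += 1.0 / (1.0 + error)    # deposit (assumed form)
            if error < best_fit:
                best_n, best_fit = n, error        # update the global best
    return best_n, best_fit


def stand_in_fitness(n_hidden):
    """Hypothetical validation RMSE; a real run would train a network here."""
    return 0.1 + 0.01 * abs(n_hidden - 12)


best_nodes, best_error = aco_select_hidden_nodes(range(2, 31, 2),
                                                 stand_in_fitness)

# How the fitness would be computed from model predictions and observations.
example_rmse = rmse([0.2, 0.8], [0.0, 1.0])
```

Because network training is stochastic in practice, repeated evaluation of promising node counts, which the pheromone-weighted sampling encourages, gives a more reliable estimate of the best candidate than a single pass.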
In this section, new hybrid neural network (HNN) models are introduced by combining ACO with various neural network models. The complete HNN forecasting procedure consists of pre-processing, choosing the input vector, optimizing the number of hidden nodes using ACO, and ANN modelling. The whole HNN procedure is shown in the flow chart (Fig. 2). Three proposed algorithms are presented here.
Conventional methods use a fully connected network topology, whereas in the proposed method the network nodes are optimized using the ACO algorithm, which tunes the neural network to the dataset. Combining ACO with cascade-forward backpropagation improves the optimization process as well as the modelling of the nonlinear relation between input and output. Here the incremental search over the hidden layers is carried out computationally by the ACO method.
In the Pattern-recognition Neural Network (PatternNet), the input data sets of the various classes are abstracted to identify patterns in the rainfall data. The PatternNet, an FFBPNN, is trained to map the target data to the required class, with its structure optimized by the ACO algorithm.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 7, 2021

A. Data Set Taken
The meteorological data used in this research were obtained from the India Meteorological Department. They comprise about 120 years of monthly rainfall records for Andhra Pradesh, India, on which the monthly rainfall prediction is based.

B. Result and Discussion
The experiments were performed using monthly rainfall data sets downloaded from the India Meteorological Department (IMD). Monthly rainfall for the years 1901 to 2019 was taken for analysis. The data were preprocessed by filling in missing values and applying min-max normalization. The processed data were then given to various classifiers to evaluate their performance, and the performance of the existing and proposed models was compared. Table III compares the run time of the existing and proposed models. In addition to improving accuracy, the proposed model also increases the speed of the network. Table IV compares the number of hidden nodes used in the existing and proposed models. The optimized number of hidden nodes selected for the network design is smaller than in the existing model, so the proposed method requires less memory than the existing method. Fig. 3 shows the chart for the existing FFNN and the proposed hybrid ACO+FFNN model; the proposed model gives better accuracy than the existing one. Similarly, Fig. 4 shows the chart for the existing CFNN and the proposed hybrid ACO+CFNN model, and Fig. 5 shows the chart for the existing PatternNet and the proposed hybrid ACO+PatternNet model.

VI. CONCLUSION
This paper presents hybrid methods built from ACO and three neural network structures. The proposed hybrid methods are ACO+feed-forward backpropagation, ACO+cascade-forward backpropagation and the ACO+pattern recognition NN classifier, each designed by combining a neural network with the ACO method. The performance of the existing and proposed models was compared, and the results were presented. The proposed methods were found to perform better than the existing feed-forward, cascade-forward and pattern recognition NN classifiers. Real data from the meteorological department were used for testing and verification. Future work should focus on more accurate and efficient rainfall prediction mechanisms.