Ensemble Learning for Rainfall Prediction

Climate change research is a discipline that analyses varying weather patterns over a particular period of time. Rainfall forecasting is the task of predicting future rainfall amounts based on past measurements, including wind, humidity, temperature, and so on. Rainfall forecasting has recently been the subject of several machine learning (ML) techniques with differing degrees of short-term and long-term prediction performance. Although several ML methods have been suggested to improve rainfall forecasting, the appropriate selection of a technique for specific rainfall durations is still not clearly defined. Therefore, this study proposes an ensemble learning approach to improve the effectiveness of rainfall prediction. Ensemble learning is an approach that combines multiple ML rainfall prediction classifiers, which here include Naïve Bayes, Decision Tree, Support Vector Machine, Random Forest and Neural Network, based on Malaysian data. More specifically, this study explores three algebraic combiners: average probability, maximum probability, and majority voting. An analysis of our results shows that the fused ML classifiers based on majority voting are particularly effective in boosting the performance of rainfall prediction compared to individual classifiers.
Keywords: Ensemble learning; classification; rainfall prediction; machine learning


I. INTRODUCTION
Time-series forecasting has recently gained research interest and has been explored in multiple domains such as the stock market, finance and climate change studies. Time-series forecasting refers to the analysis of a sequence of data points containing successive measurements that are made within a specific time interval. The domains mentioned above are currently heavily reliant on time-series data [1][2][3]. Climate change research is one of the domains that utilizes time-series forecasting to statistically analyse the varying weather patterns for a particular period [4]. The nature of climate change data representation across time is the key characteristic of climate change [5].
Weather forecasting is a subset of climate change research that predicts the atmosphere's state at a future time and location [4]. An important application of weather forecasting is rainfall prediction, which is heavily used in various large-scale activities such as food production planning, water resource management and others that rely on water. It is therefore crucial to ensure that rainfall predictions can be further improved, especially with respect to their accuracy and predictive performance, so that the proper preparation and planning of large-scale activities can be worked out beforehand.
Machine learning has made dramatic improvements and is a core sub-area of artificial intelligence. It enables computers to learn by themselves without being explicitly programmed. A set of machine learning algorithms can be used to obtain meaningful insights into the data that help make effective rainfall predictions. However, machine learning is still very far from reaching human performance, and the machine still needs human assistance to predefine the algorithms on initialization. Several machine learning approaches for rainfall forecasting have been studied for various locations such as South Africa, China and other countries [6][7][8][9]. The classifiers that are used for rainfall prediction include Naïve Bayes, decision tree, support vector machine, neural networks, random forest, genetic algorithm, support vector regression, M5 rules, radial basis neural networks, M5 model trees, and k-nearest neighbours [10][11][12][13][14].
This paper highlights the rainfall prediction mechanism based on machine learning classification techniques. The rest of the paper is organized as follows: Section II presents the rainfall prediction methodology, Section III presents the machine learning classification techniques, Section IV presents the ensemble machine learning techniques, and Section V presents the experimental results obtained after applying the ensemble classification methods to the rainfall datasets.

II. METHODOLOGY
Machine learning is one of the most exciting recent technologies. Machine learning has been positioned to address the shortcomings of human cognition and information processing, specifically in handling large data, their relations and the subsequent analysis [15][16]. In general, machine learning is the study and construction of algorithms that can learn from, and derive predictions about, data [17][18]. Therefore, the machine learning approach is selected to predict the rainfall. The methodology consists of four phases. The first phase is the acquisition of the dataset. The second phase is pre-processing, which consists of cleaning the data (e.g., handling missing values) and normalizing the data to limit the values to specific ranges. The pre-processed data are then used in the third phase to comparatively analyse the five ML techniques and identify the best of the five ML classifiers noted above. The fourth and final phase focuses on configuring the ensemble method and assessing the performance of the entire algorithm. Each of these four phases is described further in the subsections that follow.

A. Dataset
The dataset was obtained from the Drainage and Irrigation Department, and the Malaysian Meteorological Department. The dataset consists of 1,581 instances and was organized into two classes. The first is the 'active rainfall' class, containing 428 instances, and the remaining instances are grouped as 'no rainfall'.
The description and location of the obtained data are illustrated in Table I. The features in the dataset include the relative humidity, rainfall, temperature, flow, and water level. The feature details are described in Table II. Table III provides the detailed measurements for each feature.

B. Pre-processing
As noted above, the pre-processing phase ensures that the available data are prepared for further processing in subsequent phases. Raw data are usually negatively impacted by noise or incomplete information. The pre-processing phase is a crucial stage in improving the prediction process by ensuring the data are regularized and filtered beforehand [15], [19]. Therefore, we applied two common pre-processing subtasks: cleaning and normalization. In this study, the Waikato Environment for Knowledge Analysis (Weka) is used as the tool to perform the pre-processing task. Weka is Java-based machine learning software developed by the University of Waikato, New Zealand; it offers various types of machine learning algorithms and is distributed under an open source license. It also provides various visualization tools for data analysis as well as predictive modelling.

C. Cleaning
In the cleaning task, the data obtained are found to contain missing values, which are represented by characters such as '?' and '*'. Such missing values can cause errors in the prediction process and must therefore be addressed. Table IV illustrates a sample of data containing missing values. A mean mechanism is used to populate the missing values: the mean is obtained by summing all observed instances of the selected attribute and then dividing the sum by the number of records. In the second attribute (humidity), for example, the missing values are filled by first adding all instances (87.6, 88.9, 84.7, 85.2, 88.3, and 84.2), and then dividing the result by the total number of instances, which in this case is 6. Table V shows the mean for each attribute.
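The mean-filling step can be illustrated with a minimal Python sketch. The six humidity readings are the ones quoted above; the positions of the '?' and '*' markers in the raw column, and the helper names `parse` and `impute_mean`, are assumptions made only for illustration (the study itself performs this step in Weka).

```python
import math

MISSING_MARKERS = {"?", "*"}  # missing-value markers named in the text


def parse(cell):
    """Convert a raw cell to float; missing markers become NaN."""
    return math.nan if cell.strip() in MISSING_MARKERS else float(cell)


def impute_mean(column):
    """Replace NaN entries with the mean of the observed entries."""
    observed = [v for v in column if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(v) else v for v in column], mean


# Hypothetical raw column: the six observed humidity values from the text,
# with two missing cells inserted at arbitrary positions.
raw_humidity = ["87.6", "88.9", "?", "84.7", "85.2", "*", "88.3", "84.2"]
humidity, humidity_mean = impute_mean([parse(c) for c in raw_humidity])
print(round(humidity_mean, 2))  # sum of the six readings divided by 6
```

Only the observed values enter the mean, so the filled cells do not shift the attribute average.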

D. Normalization
In the normalization task, values are limited to a specific interval, which facilitates prediction since the values are reduced to a common range. Normalization is crucial for particular algorithms such as ANN and SVM. Table VI illustrates the values prior to normalization. As shown in Table VI, the values vary greatly, although they cluster around the 20s and 80s for the first two features and around the 10s for the remaining three features. To unify these values, we chose an interval of -1 to +1 and used the normalization mechanism introduced by [20], as defined in (1):
$$Y = \frac{(Y_{\max} - Y_{\min})\,(x - X_{\min})}{X_{\max} - X_{\min}} + Y_{\min} \qquad (1)$$
In (1), $x$ is the value to be normalized, $X_{\min}$ is the minimum value of all the input data, and $X_{\max}$ is the maximum value of all the input data. $Y$ is the normalized value, $Y_{\min}$ is the desired minimum value, and $Y_{\max}$ is the desired maximum value. Following the normalization task, all values for the five features lie in the range of -1 to +1. As illustrated in Table VII, the data have been normalized to prepare for further processing.
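A minimal sketch of a linear min-max rescaling onto the interval -1 to +1, which is the usual reading of the mechanism described above. The function name `min_max_scale` and the treatment of a constant column are assumptions; the sample values are the humidity readings quoted in the cleaning subsection.

```python
def min_max_scale(values, y_min=-1.0, y_max=1.0):
    """Linearly map values from [min(values), max(values)] to [y_min, y_max]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant column: map everything to the midpoint (assumption,
        # this case is not discussed in the text).
        return [(y_min + y_max) / 2.0 for _ in values]
    scale = (y_max - y_min) / (x_max - x_min)
    return [(x - x_min) * scale + y_min for x in values]


humidity = [87.6, 88.9, 84.7, 85.2, 88.3, 84.2]
scaled = min_max_scale(humidity)
print([round(v, 3) for v in scaled])
```

The smallest reading maps to -1 and the largest to +1, with every other value placed proportionally in between.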

E. Evaluation Metrics
To evaluate the proposed method, the common information retrieval metrics of precision, recall, and F-Measure are employed. Our model predicts two classes (rain or no rain), so sensitivity, or recall, reflects the ratio of rain and no-rain instances correctly identified by the model. R2, SSE, and MSE are better suited to continuous values, which our model does not output. Precision relates the true positives (TP), which are entities classified correctly, to the false positives (FP), which are entities classified incorrectly, and is computed using (2):
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$
Recall relates the true positives (TP) to the false negatives (FN), which are positive entities that the model failed to identify, and is calculated as shown in (3):
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$
Lastly, the F-Measure, which is the harmonic mean of recall and precision, is computed as in (4):
$$\text{F-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$
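The three metrics can be computed directly from confusion-matrix counts, as in the short sketch below. The counts used in the example are hypothetical and are not taken from the study's results; the zero-denominator guards are an added convention.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that the model identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, for illustration only.
tp, fp, fn = 80, 20, 10
p, r = precision(tp, fp), recall(tp, fn)
f = f_measure(p, r)
print(round(p, 3), round(r, 3), round(f, 3))
```

Because the F-Measure is a harmonic mean, it stays closer to the lower of the two values than an arithmetic average would.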

III. MACHINE LEARNING MODELS
Several learning methods are selected in this study to benchmark the rainfall prediction performance. These are NB, C4.5, SVM, ANN, and RF, which are all supervised learning methods. A notable aspect of supervised machine learning methods is that a suitable method must be selected together with suitable parameters and features [21][22][23][24][25]. Two main experiments were carried out to evaluate the performance of the classifiers. The first identifies the best parameterization of each classification model, since each model has several alternatives and options that affect its success. Different tuning parameters are used to tune every classifier in order to yield highly accurate results, and a series of experiments was carried out to obtain the optimal values for each classifier. The performances of the five classifiers are then evaluated and compared. The second experiment analyses the true performance of the classifiers for rainfall prediction.

A. C4.5 Algorithm
In this section, the J48 decision tree included in Weka, which is based on the C4.5 decision tree algorithm, is evaluated. C4.5 is one of the most effective classification methods [26]. Table VIII shows the pseudo code of the algorithm. C4.5 generates a decision tree in which every node splits the classes according to the information carried by an attribute. The splitting criterion is the attribute with the highest normalized information gain. For example, our dataset contains temperature, humidity, rainfall, river flow, and water level. The C4.5 technique first explores these features to determine which feature is the best for splitting the data (the feature with the highest information gain). That feature is used to split the data, and the procedure is repeated on the resulting subsets with the remaining features until a leaf is reached. The evaluation was performed using the confidence factor, MinNumObj, and Numfolds parameters. The dataset is split into 60% for training and 40% for testing, and the evaluation uses the common information retrieval metrics of recall, precision, and F-Measure. Table IX illustrates the algorithm results. As shown in Table IX, several parameter values are used. The best results are achieved when the parameters are (confidence factor = 0.5, MinNumObj = 4, and Numfolds = 5), which results in a precision of 71.3%, a recall of 74.2% and an F-Measure of 72.7%.
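To make the splitting criterion concrete, the sketch below computes the normalized information gain (gain ratio) of a binary split on one numeric feature. It is a simplified illustration rather than the J48 implementation: the toy humidity values, the labels, and the choice of candidate thresholds are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Information gain of the split `value <= threshold`, divided by split info."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info


# Hypothetical toy data: humidity readings with rain / no-rain labels.
humidity = [70.0, 72.0, 85.0, 88.0, 90.0, 91.0]
labels = ["no rain", "no rain", "no rain", "rain", "rain", "rain"]

# Try each observed value (except the largest) as a candidate threshold.
best_ratio, best_threshold = max(
    (gain_ratio(humidity, labels, t), t) for t in humidity[:-1]
)
print(best_threshold, round(best_ratio, 3))
```

In a full tree builder the same score would be evaluated for every feature, and the feature and threshold with the highest ratio would define the node.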

B. Naïve Bayes
Naïve Bayes is a supervised machine learning method that belongs to the family of probabilistic classifiers, which apply Bayes' theorem under the assumption of independence between features [27]. Naïve Bayes estimates the probability of each feature value given the class under this assumption. Table X depicts the pseudo code of the Naïve Bayes algorithm.
This section evaluates the Naïve Bayes technique as applied using Weka. For every known class value, NB computes the conditional probability of every attribute given the class value. It then obtains the joint conditional probability of the attributes using the product rule. This process is followed by the use of Bayes' rule to obtain the conditional probabilities of the class variable. After completing this process for each class value, the class with the highest probability is reported.
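The procedure just described can be sketched in a few lines of Python. This version assumes a normal distribution for each numeric attribute and works in log space for numerical stability; the function names, the small variance floor, and the toy (humidity, temperature) rows are assumptions for illustration and do not reproduce Weka's implementation.

```python
import math
from collections import defaultdict


def fit(rows, labels):
    """Estimate the class prior and per-attribute (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    model = {}
    for y, members in by_class.items():
        stats = []
        for col in zip(*members):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9  # variance floor
            stats.append((mean, var))
        model[y] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest posterior (up to a shared constant)."""
    scores = {}
    for y, (prior, stats) in model.items():
        log_p = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            # Product rule over attributes becomes a sum of log densities.
            log_p += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        scores[y] = log_p
    return max(scores, key=scores.get)


# Hypothetical toy data: (humidity, temperature) pairs.
rows = [(88.0, 24.0), (90.0, 23.0), (87.0, 25.0),
        (70.0, 31.0), (72.0, 30.0), (68.0, 32.0)]
labels = ["rain", "rain", "rain", "no rain", "no rain", "no rain"]
model = fit(rows, labels)
print(predict(model, (89.0, 24.0)), "|", predict(model, (69.0, 31.0)))
```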
The parameter tuning experiment was carried out to identify the best parameters from the available options. Two parameters affect the performance of the NB classifier: debug and useKernelEstimator. In this study, both are tested with two values (True and False) to choose the optimal parameters of the NB classifier. When useKernelEstimator = True, the NB model employs a kernel estimator for numeric attributes instead of a normal distribution. If the debug parameter is set to False, the classifier does not output extra information to the console. As shown in Table XI, the best outcomes are attained with (debug = False, useKernelEstimator = True), which obtains 65.5% precision, 71.5% recall and 65.5% F-Measure.

C. Support Vector Machine
This section discusses the evaluation of the support vector machine method using the libSVM package in Weka. Some parameters have to be fitted to the data, because the SVM is very sensitive to inappropriate parameter values. The support vector machine is a method that divides data into two sections with the use of a hyperplane [12]. This division addresses every class label independently: the data are classified into class x and not class x, and then into class y and not class y, where x and y are the two class labels. The classification is carried out by calculating the distance between every data point and the hyperplane's margin. The SVM algorithm uses a kernel, a set of mathematical functions that allows data to be classified in a higher dimensional space when they cannot be linearly separated in a lower dimensional space. Various kernel functions are available, such as linear, polynomial, radial basis function (RBF) and sigmoid. The SVM can be further classified into two categories, namely the C-SVM and the nu-SVM. C and nu are regularization parameters that impose a penalty on the misclassifications that occur when the classes are separated. C ranges from 0 to infinity, and nu is always between 0 and 1.
The parameter evaluation was performed using the SVM type and kernel type parameters. The SVM type is set to nu-SVC, in which nu, a value between 0 and 1, acts as a lower bound on the fraction of examples that are support vectors and an upper bound on the fraction that lie on the wrong side of the margin. A number of different parameterization combinations are tested under the kernel type. Here, the kernel type is changed one at a time while the SVM type is kept the same, so that any differences due to the change of kernel type can be recorded. Four kernels are already established in the literature, namely the linear, radial basis function (RBF), polynomial, and sigmoid kernels, and all four are extensively tested in our experiments. Table XIII illustrates the results of the algorithm. The best results are reported when the parameters are (SVM type = nu-SVC, kernel type = RBF), where the precision is 71.1%, the recall is 72.8% and the F-Measure is 69.1%.
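For reference, the four kernel types can be written as plain functions. The sketch below uses the functional forms commonly documented for LIBSVM; the `gamma`, `coef0` and `degree` values and the two sample vectors are hypothetical and are not the settings used in this study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    return dot(u, v)


def polynomial(u, v, gamma=0.5, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma=0.5):
    # Depends only on the squared Euclidean distance between u and v.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid(u, v, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


KERNELS = {"linear": linear, "polynomial": polynomial, "rbf": rbf, "sigmoid": sigmoid}

# Two hypothetical, already-normalized feature vectors.
u, v = [0.2, -0.4], [0.1, 0.3]
for name, kernel in KERNELS.items():
    print(name, round(kernel(u, v), 4))
```

Swapping the kernel while holding the SVM type fixed, as in the experiment above, corresponds to changing only which of these functions measures similarity between instances.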

D. Neural Networks
Neural networks were originally motivated by the goal of modelling machines that replicate the brain's functionality. Every neural unit is linked to many others. Links can be either inhibitory or enforcing in their effect on the activation state of the connected neural units. Every individual neural unit can have a summation function that combines its input values [28]. This algorithm is used in regression, classification, prediction and clustering [28]. Table XIV depicts the pseudo code of the algorithm.
Two parameters significantly affect the performance of the neural network classifier: the number of hidden layers and the learning rate. To find the optimal hidden layer value, values from 2 to 10 are tested (at an increment of 2), and the learning rate is tested on five different values from 0.02 to 0.10 (at an increment of 0.02). As shown in Table XV, the best outcomes are attained when the parameters (learning rate = 0.02 and hidden layer = 4) obtain 72.7% precision, 74.5% recall and 73.2% F-Measure. Therefore, the optimal parameters of the ANN are set as learning rate = 0.02 and hidden layer = 4. The parameter evaluation was performed with 100 iterations.
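The tuning procedure amounts to an exhaustive search over a 5 x 5 grid. The sketch below enumerates that grid; the `evaluate` callback is hypothetical and stands for training the network with a given setting and returning its F-Measure. The stand-in scoring function used in the demonstration is not a real model evaluation; it is chosen only so that the example runs without data.

```python
from itertools import product

hidden_values = list(range(2, 11, 2))                        # 2, 4, 6, 8, 10
learning_rates = [round(0.02 * k, 2) for k in range(1, 6)]   # 0.02 ... 0.10


def grid_search(evaluate):
    """Return (score, hidden, learning_rate) for the best-scoring setting."""
    best = None
    for hidden, lr in product(hidden_values, learning_rates):
        score = evaluate(hidden, lr)
        if best is None or score > best[0]:
            best = (score, hidden, lr)
    return best


def placeholder_score(hidden, lr):
    """Stand-in for a real train-and-test run (hypothetical, not measured)."""
    return -abs(hidden - 4) - abs(lr - 0.02)


best = grid_search(placeholder_score)
print(len(hidden_values) * len(learning_rates), "settings; best:", best[1:], )
```

In practice `evaluate` would train on the 60% split and score on the 40% split, so each of the 25 settings costs one full training run.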

E. Random Forest
Random forest is a method employed for many purposes, including regression, classification, and prediction. This method is an ensemble of decision trees: it constructs a multitude of decision trees from the training data and outputs the class that is the mode of the trees' predictions. The random forest classifier is tuned using the maximum depth of the tree (Max Depth), the number of features to randomly investigate (Num Features), and the number of trees (Num Tree). Max Depth, Num Features and Num Tree are each tested on three different values, which are (1, 5 and 10), (0, 3 and 5) and (10, 12 and 15), respectively. Experimental outcomes reveal that the classification performance of the RF classifier increases when the depth, the number of features and the number of trees increase. The obtained parameter tuning result is reported in tabular form.
[Pseudo code table residue; surviving steps: for all weights (j, i); select m random predictor variables; split the node; end for; repeat for all nodes.]

F. The Performance Evaluation of Different Classifiers
Extensive parameterization tests were performed to quantify each parameter's influence on the optimization of the classification models. The parameters that are crucial to maximizing each model's performance are selected from these tests, while the other parameters are classified as less sensitive. Based on the finalized parameters, the classification models are executed to quantify their performance on rainfall prediction. Five machine learning methods are considered in the study: Naïve Bayes (NB), C4.5, neural network (ANN), support vector machine (SVM), and random forest (RF). This phase quantifies the performance of these machine learning methods and determines the best overall method for rainfall prediction. The results revealed that the most effective classifier is the ANN and that the NB yielded the weakest result. Hence, the ANN is used as the benchmark against the ensemble method. Table XVIII shows the comparison of the five classifiers. The performance levels of ML-based predictions vary between studies, although the neural network classification technique has a slight performance advantage over the other classifiers. With individual ML classifier techniques for rainfall prediction already extensively documented, the fusion of various ML classification techniques in an ensemble methodology presents an opportunity to improve prediction performance. In addition, the varying performance levels of such techniques leave room for improvement through the combination of various methods.

IV. ENSEMBLE METHODS FOR MODEL PREDICTION
The combination of multiple classifiers that result in one subsequent model is known as the ensemble model. Recently, ensemble techniques have been increasingly utilized to improve the prediction performance of classification tasks [29]. In general, there are three common issues faced in most single classification techniques that can be improved when using multiple classifier instances.

1) Statistical reasoning: When the amount of training data is not sufficient, a learning algorithm extracts a weak hypothesis. The combination of many classifiers, however, tends to find a stronger hypothesis.
2) Computational reasoning: Finding an appropriate hypothesis with an individual classifier (such as a neural network) may be difficult as well as time consuming. Combining multiple classifiers (experts) with an appropriate parameterization (considering speed, efficiency and accuracy) may provide a better hypothesis while reducing the computation time by exploiting each classifier's strength.
3) Representational reasoning: An individual classifier at times cannot represent the true hypothesis in the hypothesis space. In the case of ensemble methods, forming a weighted sum of hypotheses from the hypothesis space expands that space and provides a more representative hypothesis [29].
Ensemble methodology basically works by weighting numerous individual classifiers and then combining them to obtain a new classifier that theoretically outperforms each of them individually. Using classifiers from different learning algorithms is an effective way to introduce diversity among the classifiers, since it has the potential to minimize errors and increase prediction performance by drawing on diverse approaches [29]. The purpose of ensemble methodology is therefore to build a predictive model through the integration of multiple models.
Various fields of study have reported successful outcomes from the use of the ensemble method, such as healthcare, information retrieval and statistics. Research on the ensemble method has greatly increased from the 1990s onwards. Reference [30] suggested an ensemble of similarly configured neural networks in order to improve the predictive performance of a single model. Reference [31] laid the foundations for the award-winning AdaBoost algorithm; [31] and [32] revealed that, through the combination of weak classifiers, a strong classifier in the probably approximately correct (PAC) sense can be generated. Reference [33] proposed a novel ensemble health care decision support method to assist an intelligent health monitoring system that utilizes a meta classifier voting system made up of three base classifiers, which are the C4.5, random forest and random tree algorithms [32]. Reference [32] employed an ensemble neural network for breast cancer diagnosis, in which the outputs of several neural networks are fused into an ensemble output using the simple averaging algorithm. They found that the ensemble neural network improved generalization, with fewer false positive malignant diagnoses, while accelerating the learning process. Furthermore, [34] adapted multiple neural networks as a method to improve the robustness of predictions. They used several methods, such as linear combinations and stacked generalizations, to combine member networks. They found that two combination methods, i.e., selective combination and the combination of networks with various structures, are the best performers and greatly improved model performance.
Rainfall forecasting has gained the attention of many researchers, as it poses interesting challenges arising from the complexity of predicting specific factors that are linked to rainfall, such as wind and humidity [21][22][23], [35][36]. Current techniques for rainfall prediction utilize only individual classifiers such as neural networks [21], k-nearest neighbours [22], support vector machine [22] and others. Based on encouraging results from applications of ensemble methods in various fields, the ensemble classification technique is applied to rainfall prediction by leveraging three linear algebraic combiners: majority voting, average probability, and maximum probability. Since there have not been any applications of the ensemble method to rainfall forecasting in Malaysia, this research examines numerous supervised learning methods for this case.
To assess the performance of our ensemble classification for rainfall prediction, we first fused all five machine learning models (i.e., Naïve Bayes, C4.5, neural network, support vector machine, and random forest). Their combinations are based on three linear algebraic combiners: majority voting, average probability, and maximum probability. Equations (6), (7) and (8) describe their mathematical definitions. Voting is essentially the general blueprint for combining classifiers into ensembles. Voting schemes are divided into unweighted and weighted schemes. Unweighted schemes include maximum probability, minimum probability, the product of probabilities, majority voting, and the average of probabilities, whereas weighted schemes encompass simple weighted voting, best-worst weighted voting, rescaled weighted voting, and quadratic best-worst weighted voting [32][33][34].
This study focuses only on unweighted voting schemes. In principle, the binary outputs of the base classifiers are combined such that the output of the ensemble is chosen based on the highest number of votes. Any of the unweighted schemes can be used to combine the classifiers. Equation (5) highlights the basic form of the classification ensemble calculation. Here, i = 1, ..., n indexes the n classes and j = 1, ..., m indexes the m classifiers contained in the ensemble. LC_i(X) thus represents any combination scheme that determines the final output of the classifier ensemble.
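One plausible way to write the ensemble rule with the symbols introduced above, together with how the three combiners named in this study (majority voting, average probability, maximum probability) would instantiate it, is sketched below. The symbol $p_{j,i}(X)$, denoting the probability that classifier $j$ assigns to class $i$ for instance $X$, and the predicted class index $\hat{c}(X)$ are notational assumptions introduced here for illustration; they are standard textbook forms and may differ in detail from the paper's own equations.

```latex
\begin{aligned}
\hat{c}(X) &= \arg\max_{i \in \{1,\dots,n\}} LC_i(X) \\
\text{majority voting:}\quad LC_i(X) &= \sum_{j=1}^{m} \mathbb{1}\!\left[\arg\max_{k} p_{j,k}(X) = i\right] \\
\text{average probability:}\quad LC_i(X) &= \frac{1}{m}\sum_{j=1}^{m} p_{j,i}(X) \\
\text{maximum probability:}\quad LC_i(X) &= \max_{j \in \{1,\dots,m\}} p_{j,i}(X)
\end{aligned}
```

In every case the ensemble output is the class index that maximizes the combined score; only the way the per-classifier outputs are aggregated changes.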

A. Majority Voting
In this combining scheme, an unlabelled instance is assigned to the class with the highest (most frequent) number of votes. This method is most often employed as the baseline combination technique against which newly proposed techniques are compared [37]. Majority voting is defined by (6).

B. Average of Probabilities
In the probabilistic approach, every classifier outputs a probability distribution vector over all relevant classes, as shown by (7). The individual probability values from all classifiers are averaged (or summed) for every class, and the class that yields the maximum value is chosen [32].

C. Maximum Probabilities
The maximum probability approach is almost identical to the average probability approach described above [32]. The only difference is that, for each class, the maximum probability over the classifiers is selected instead of the average, as highlighted in (8).

V. RESULTS AND DISCUSSION
Table XVIII shows the comparison of the five classifiers using the three metrics based on the test dataset. From the table, the neural network outperforms the other techniques with a precision of 72.7%, a recall of 74.5%, and an F-Measure of 73.2%. The predictive results obtained from the neural network will be compared to the ensemble rainfall prediction approach.
We further discuss the experimental results that are obtained by applying the ensemble method with the three unweighted combiners (majority voting, average probability, and maximum probability) for rainfall prediction. These combiners can be used to fuse any subset of the five classifiers (Naïve Bayes, C4.5, support vector machine, neural networks, and random forest). The ensemble model works by combining classifiers from both groups of 'weak' and 'strong' classifiers, thereby forming an ensemble. Thus, in ensemble terms, the classifiers are weak learners, while the ensemble model is a strong learner. The evaluation of the ensemble methods uses the common information retrieval metrics of recall, precision and F-Measure. The outcomes are based on the same splitting mechanism for the dataset, with 60% as training data and 40% as testing data.
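A minimal Python sketch of the three unweighted combiners follows. It assumes each base classifier returns a probability vector over the classes; the probability values, the class order, and the tie-breaking rule (lowest class index wins) are assumptions for illustration and are not outputs of the study's models.

```python
def argmax(scores):
    """Index of the largest score; ties go to the lowest index (assumption)."""
    return max(range(len(scores)), key=scores.__getitem__)


def majority_voting(probs):
    """Each classifier votes for its most probable class; most votes wins."""
    votes = [0] * len(probs[0])
    for p in probs:
        votes[argmax(p)] += 1
    return argmax(votes)


def average_probability(probs):
    """Average the class probabilities across classifiers, then pick the top class."""
    n_classes = len(probs[0])
    return argmax([sum(p[i] for p in probs) / len(probs) for i in range(n_classes)])


def maximum_probability(probs):
    """Take the highest probability any classifier gives each class, then pick the top."""
    n_classes = len(probs[0])
    return argmax([max(p[i] for p in probs) for i in range(n_classes)])


CLASSES = ["no rain", "rain"]
# Hypothetical outputs of three base classifiers for one instance.
probs = [[0.45, 0.55], [0.40, 0.60], [0.95, 0.05]]
for combiner in (majority_voting, average_probability, maximum_probability):
    print(combiner.__name__, "->", CLASSES[combiner(probs)])
```

The example is chosen so that the schemes disagree: two mildly confident votes for rain win under majority voting, while one very confident classifier dominates the average and maximum probability schemes. This illustrates why the choice of combiner can change the ensemble's output for the same set of base classifiers.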
Tables XIX, XX and XXI highlight the test data results for multiple classifier combinations and the three ensemble methods based on the selected metrics. The single ANN classifier is also included in the comparison as a benchmark. For the precision metric, Table XIX demonstrates that the combination of the SVM, C4.5 and ANN methods via the majority voting scheme yielded the highest precision at 76%. This result is followed closely by the same combination with the average probability scheme, at close to 75%. There is generally a 2% to 3% increase in precision when the best ensemble methods are compared to the single ANN classifier. At the other end of the spectrum, the full and four-classifier combinations that were tested resulted in disappointing precision for most of the combining schemes, far below the 73% threshold set by the single ANN classifier, except for the full combination with the maximum probability scheme, which scored an unusual 71%.
Classifiers                                Majority voting    Average probability    Maximum probability
Combination of (ANN, NB, C4.5, and RF)     53%                53%                    53%
Combination of (SVM, C4.5, and ANN)        76%                75%                    71%
Combination of (SVM, C4.5, and NB)         74%                73%                    71%
Combination of (NB and ANN)                70%                67%                    67%
Single classifier (ANN)                    73%
For recall, Table XX highlights a similar pattern to the precision metric, whereby the combination of the SVM, C4.5 and ANN classifiers with the majority voting and average probability schemes scored the highest recall at 77%. This is a 2% increase over the single ANN classifier, which scored 75%. Except for these two schemes, the remaining combinations and ensemble schemes all performed slightly worse than the ANN classifier.
Table XXI highlights the F-Measure for the employed classifiers and combiners. Again, the same combination of the SVM, C4.5 and ANN classifiers with both the majority voting and average probability schemes yields the best result compared to the other classifiers. The remaining fusion classifiers and ensemble schemes scored well below the 73% of the ANN classifier.
From the experiments, the fusion of the classifiers was shown to generally boost prediction diversity without compromising the prediction strengths of the individual classifiers. However, care is needed, as not every fusion strategy improves performance over single classifiers. This is demonstrated by the fact that, in all three performance metrics, the fusion of four classifiers and of all classifiers degrades performance regardless of the ensemble scheme that is chosen. On the other hand, the selected fusion classifiers based on the majority voting scheme are superior to the single classifiers. In particular, a combination of three classifiers that includes at least the SVM and C4.5 algorithms ensures that superior performance can be achieved.
Table XXII illustrates the confusion matrix for a two-class classifier (i.e., Rain and No Rain). The matrix is a summary of the prediction results obtained from the best ensemble method (i.e., majority voting) for the rainfall classification problem on the test dataset (i.e., 632 instances). In the context of our study, the entries in the confusion matrix carry the following meaning: true positive (TP) indicates the number of instances correctly predicted as rain, which is 438 days; true negative (TN) indicates the number of instances correctly predicted as no rain, which is 39 days; false positive (FP) indicates the number of instances incorrectly predicted as rain, which is 133 days; and false negative (FN) indicates the number of instances incorrectly predicted as no rain, which is 22 days.

Classifiers                                   Majority voting    Average probability    Maximum probability
Combination of (SVM, ANN, NB, C4.5, and RF)   61%                61%                    69%
Combination of (ANN, NB, C4.5, and RF)        61%                61%                    61%
Combination of (SVM, C4.5, and ANN)           76%                75%                    69%
Combination of (SVM, C4.5, and NB)            63%                73%                    69%
Combination of (NB and ANN)                   70%                68%                    68%
Single classifier (ANN)                       73%

Rainfall forecasting is a process to predict potential rainy locations by considering numerous factors such as humidity, wind speed, water level, and temperature. The common methods employed in rainfall forecasting are supervised machine learning methods, in which a model is first trained on predefined example data and then used for prediction on the testing data. The key challenges of these methods lie in identifying suitable mechanisms, the sensitivity of the objective functions, and the dependency on how features are treated. These differences lead to inconsistent performance, making the selection of a suitable method for rainfall prediction a challenging task. Therefore, this paper proposes an improved method to develop long-term (i.e., monthly) and short-term (i.e., daily) ensemble weather forecasting models for rainfall prediction by using three linear algebraic combiners (i.e., majority voting, average probability and maximum probability) to combine five rainfall prediction models (i.e., Naïve Bayes, C4.5, neural network, support vector machine, and random forest). By leveraging daily meteorological data in Selangor, Malaysia, over a period of 6 years (from 2010 to 2015), 1,581 instances were obtained and organized into two classes. The first is the 'active rainfall' class containing 428 instances, while the remaining instances are grouped as 'no rainfall'. We experimented with a group of base algorithm models including the NB, SVM, ANN, RF and C4.5.
From the analysis, all five ML techniques are shown to perform well, although the ANN technique in particular is generally found to perform the best, while the NB technique is relatively the weakest. The study further explored the ensemble's potential for further upper-bound improvements in the rainfall prediction model. The ensemble experiment analysis revealed that rainfall prediction is indeed enhanced by the ensemble method. In particular, the ensemble method based on majority voting is shown to provide better predictive performance, with higher precision, recall, and F-Measure than the other algebraic combiners tested. Overall, the best combiners have been demonstrated to be superior to single classification methods. Such results complement previous findings on ML methods in rainfall prediction; hence, our recommendation is to use ensemble ML algorithms as an effective approach for rainfall prediction. It is hoped that the outcomes of this study may help to find suitable machine learning techniques that improve the performance of rainfall forecasting.