A New Approach for Time Series Forecasting: Bayesian Enhanced by Fractional Brownian Motion with Application to Rainfall Series

A new predictor algorithm based on the Bayesian enhanced approach (BEA) for long-term chaotic time series using artificial neural networks (ANN) is presented. The technique, based on stochastic models, uses Bayesian inference with fractional Brownian motion as the data model and a Beta model as prior information. However, the need for experimental data for specifying and estimating causal models has not changed. Indeed, the Bayes method provides another way to incorporate prior knowledge in forecasting models; the simplest representations of prior knowledge are hard to beat in many forecasting situations, either because prior knowledge is insufficient to improve on models or because prior knowledge leads to the conclusion that the situation is stable. This work contributes to long-term time series prediction by giving forecast horizons of up to 18 steps ahead. The forecasted values and validation data are presented for benchmark chaotic series such as the Mackey-Glass, Lorenz, Henon, Logistic, Rössler, Ikeda, and Quadratic one-dimensional map series, and for monthly cumulative rainfall collected from Despeñaderos, Cordoba, Argentina. The computational results are evaluated against several non-linear ANN predictors proposed before on high roughness series and show a better performance of the Bayesian enhanced approach in long-term forecasting.

Keywords: long-term prediction; neural networks; Bayesian inference; Fractional Brownian Motion; Hurst parameter


I. INTRODUCTION
Forecasting is based on identifying and estimating, through observation and in some instances theory, patterns and/or relationships, and then extrapolating or interpolating them in order to predict [1]. Scientists differ from other forecasters in being well aware of uncertainty, providing probabilistic forecasts, and constantly searching for enhancements using objective feedback [2]. Their overall success rate is improving and their assessment of uncertainty is well calibrated. Finally, weather forecasters have learned that predicting extreme weather events requires different models and skills than those of normal ones [3].
At the same time, they consider such events an integral part of their job, even though it requires special effort, different models and extra skills to predict them.
In natural phenomena, humans cannot influence the future course except to a limited extent; where they do, their actions and reactions change that course, making forecasting more difficult but also more challenging [4]. The question of why simple forecasting models outperform sophisticated ones is still open. The future is never exactly like the past, which means that the accuracy of extrapolative predictions cannot be assured. The crucial question is the extent of accuracy, or inaccuracy, of such predictions. Most time series in rainfall forecasting are influenced by random events and often behave not far from random walks, favoring simple methods that are capable of smoothing such randomness. In long-term forecasts, the accuracy of predictions drops while uncertainty increases. One way forecasters reduce forecasting errors is by averaging the predictions of more than one model [5]. The outcome is not only higher accuracy but also a reduction in the size of forecasting errors, with simple averaging being the best way of combining forecasts [6]. The reason is that averaging cancels out the errors of individuals and/or models and in doing so eliminates the noise from the pattern and improves accuracy.
Rossi, Allenby, and McCulloch (2006) argued that there are really no other approaches except the Bayesian approach which can provide a unified treatment of inference and decision as well as properly accounting for parameter and model uncertainty. The Bayesian approach allows researchers to cope with complex problems. Moreover, Bayesian inference provides answers conditional on the observed data rather than based on the distribution of test statistics over imaginary samples not observed. Even though the Bayesian approach has decent benefits, it has some costs, including the formulation of a prior, the requirement of a likelihood function, and the computation of the various integrals required in the Bayesian paradigm [7]. Advances in computation have made complicated integrals tractable. However, choosing an appropriate or objective prior has been an issue in the Bayesian approach [8]. Thus, investigators face a practical problem with little information in real-world situations and should not neglect sources of information outside of the current data set [9].
www.ijacsa.thesai.org
In this article, the major advantage of the proposed BEA technique is that its complexity does not increase with an increasing number of inputs. The solutions can easily be generalized to the problem of uncertain (noisy) inputs, as in Bayesian inference [10], in contrast to other generalized approaches [11]. The filters under comparison are based on non-linear stochastic auto-regressive moving average (NAR) models, such as the Bayesian approach [12] and the modified neural-network approach [13] [14], implemented by ANN.
The paper is organized as follows: Section II presents the data series used as case studies, including well-known chaotic time series. Section III describes the Bayesian enhanced approach, a method that uses fractional Brownian motion to obtain an optimal network model. Section IV presents the performance of the proposed prediction algorithm, detailing the experimental setup, results and analysis, and Section V provides discussion and concluding remarks.

A. Overview on fBm
The fractional Brownian motion (fBm), which provides a suitable generalization of the Brownian motion, is one of the simplest stochastic processes exhibiting long-range dependence [15] [16]. It has been used as a modeling tool. The stochastic integral representation of fractional Brownian motion is as follows [17]. The process
$$B^{(H)}(t) = \frac{1}{\Gamma(H + 1/2)} \left\{ \int_{-\infty}^{0} \left[ (t - s)^{H - 1/2} - (-s)^{H - 1/2} \right] dB(s) + \int_{0}^{t} (t - s)^{H - 1/2} \, dB(s) \right\},$$
where B(t) is a standard Brownian motion and Γ refers to the gamma function, is a fBm with 0 < H < 1. The constant 1/Γ(H + 1/2) is dropped in the following computation for the sake of simplicity. According to the definition, a fractional Brownian motion $(B^{(H)}(t))_{t>0}$ of Hurst parameter H is a continuous and centered Gaussian process with covariance function
$$E\left[ B^{(H)}(t) \, B^{(H)}(s) \right] = \frac{1}{2} \left( |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right).$$
Therefore, $B^{(H)}(t)$ is a fBm of Hurst index H. The fBm is divided into three different families corresponding to 0 < H < 1/2, H = 1/2, and 1/2 < H < 1, respectively. The basic feature of fBm is that the span of interdependence between its increments can be infinite [18]. As the Hurst parameter H governs the fractal dimension of the fractional Brownian motion, its regularity and the long-memory behavior of its increments, the estimation of H is an important but difficult task which has led to a very vast literature [19].
In this work, the H index is measured by the wavelet method [20] [21].
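As an illustration of how H controls the roughness of a path, the following minimal sketch draws an fBm sample exactly from its covariance, assuming the standard form 0.5(|t|^{2H} + |s|^{2H} - |t - s|^{2H}). The function names (`fbm_cov`, `cholesky`, `simulate_fbm`), the grid of 64 points and the seed are illustrative choices and are not taken from the original work; the wavelet estimation of H is not shown.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance of fBm at times s and t: 0.5*(|t|^2H + |s|^2H - |t-s|^2H)."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(t) ** h2 + abs(s) ** h2 - abs(t - s) ** h2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def simulate_fbm(n, hurst, seed=0):
    """Sample fBm at times 1/n, 2/n, ..., 1 by colouring white Gaussian noise."""
    rng = random.Random(seed)
    times = [(i + 1) / n for i in range(n)]
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


path_rough = simulate_fbm(64, 0.2)   # H < 1/2: anti-persistent, rough path
path_smooth = simulate_fbm(64, 0.8)  # H > 1/2: persistent, long-memory path
```

The Cholesky route costs O(n^3) and is only practical for short series; it is used here because it follows directly from the covariance function.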

B. Overview on Benchmark Time Series
The standard non-parametric approaches presented in this article are based on stochastic techniques that assume a nonlinear relationship among data and reproduce the benchmark chaotic time series and rainfall data only in a statistical sense. Although there are many situations in which accurate forecasting is impossible, there are many others where predictions can provide useful information to improve our decisions and gain from effective action. Weather forecasts, made several times a day in hundreds of thousands of locations around the world, are an example, as proposed in this work. The rainfall dataset used is from Despeñaderos, in the province of Cordoba, Argentina (-31.824703; -64.289692), and covers the years 2000 to 2014, as shown in Fig. 1. The rest of the benchmark time series are presented in the following subsections.

C. The Mackey-Glass Chaotic Time Series
The dataset is assembled by sampling the Mackey-Glass (MG) equation [22], with the parameters a, b, c, and τ set as shown in Table I.
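A minimal sketch of how such a series could be sampled is given below. It assumes the commonly used form dx/dt = a x(t - τ) / (1 + x(t - τ)^c) - b x(t), and the classical values a = 0.2, b = 0.1, c = 10, τ = 17, which are not necessarily those of Table I. The function name `mackey_glass`, the Euler scheme, the constant initial history and the sampling interval are likewise illustrative assumptions.

```python
def mackey_glass(n_samples, a=0.2, b=0.1, c=10.0, tau=17.0, x0=1.2, dt=0.1):
    """Euler integration of dx/dt = a*x(t-tau)/(1 + x(t-tau)**c) - b*x(t).

    Returns n_samples values, sampled once per unit of time.
    """
    delay_steps = int(round(tau / dt))
    steps_per_sample = int(round(1.0 / dt))
    history = [x0] * (delay_steps + 1)  # constant initial history on [-tau, 0]
    samples = []
    for step in range(n_samples * steps_per_sample):
        x_now = history[-1]
        x_delayed = history[-1 - delay_steps]
        dx = a * x_delayed / (1.0 + x_delayed ** c) - b * x_now
        history.append(x_now + dt * dx)
        if (step + 1) % steps_per_sample == 0:
            samples.append(history[-1])
    return samples


mg_series = mackey_glass(120)  # hypothetical length: 120 sampled values
```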

D. The Logistic Chaotic Time Series
The logistic series (LOG) is defined by:
$$x_{n+1} = a \, x_n \, (1 - x_n) \tag{4}$$
When a = 4, the iterates of (4) form a chaotic time series [23].
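The iteration can be sketched in a few lines, assuming the usual logistic recurrence x_{n+1} = a x_n (1 - x_n). The function name `logistic_series` and the initial value 0.3 are illustrative and are not taken from Table II.

```python
def logistic_series(n, a=4.0, x0=0.3):
    """Return n iterates of x_{k+1} = a * x_k * (1 - x_k), starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(a * xs[-1] * (1.0 - xs[-1]))
    return xs


log_series = logistic_series(120)
```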

E. The Henon Chaotic Time Series
The Henon chaotic time series can be constructed by following (5); it presents many aspects of the dynamical behavior of more complicated chaotic systems [24].
When generating data for our experiments, a and b are set as shown in Table III. The same parameters are used in [25].
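For illustration, the sketch below iterates the classical two-dimensional form of the Henon map, x_{n+1} = 1 - a x_n^2 + y_n and y_{n+1} = b x_n, with the classical values a = 1.4 and b = 0.3. Both the form and the values are assumptions that may differ from (5) and Table III; the function name `henon_series` and the starting point are hypothetical.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Return the x-coordinate of n iterates of the classical Henon map."""
    xs = []
    x, y = x0, y0
    for _ in range(n):
        # Both right-hand sides use the previous (x, y) pair.
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
    return xs


hen_series = henon_series(120)
```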

F. The Lorenz Chaotic Time Series
Lorenz found three ordinary differential equations which closely approximate a model for thermal convection [26]. These equations have also become a popular benchmark for testing non-linear predictors. The data is derived from the Lorenz system, which is given by the three differential equations
$$\dot{x} = a \, (y - x), \qquad \dot{y} = b \, x - y - x \, z, \qquad \dot{z} = x \, y - c \, z.$$
A typical choice for the parameter values is a = 10, b = 28, and c = 8/3. In this case, the system is chaotic. The data set is constructed by using the fourth-order Runge-Kutta method with the initial values shown in Table IV for the LOR01 and LOR03 series. The step size is chosen as 0.01. This set of parameters is commonly used in generating the Lorenz system because it exhibits deterministic chaos.
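The construction can be sketched as follows, assuming the standard Lorenz equations with a = 10, b = 28, c = 8/3 and a fourth-order Runge-Kutta step of 0.01. The initial value (1, 1, 1) is a placeholder and not the value of Table IV, and the function names are illustrative.

```python
def lorenz_rhs(state, a=10.0, b=28.0, c=8.0 / 3.0):
    """Right-hand side of the Lorenz system for state = (x, y, z)."""
    x, y, z = state
    return (a * (y - x), x * (b - z) - y, x * y - c * z)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))


def lorenz_series(n, initial=(1.0, 1.0, 1.0), h=0.01):
    """Return the x-component after each of n RK4 steps."""
    state = initial
    xs = []
    for _ in range(n):
        state = rk4_step(lorenz_rhs, state, h)
        xs.append(state[0])
    return xs


lor_series = lorenz_series(1000)
```

The same `rk4_step` helper could be reused for any other system of ordinary differential equations by supplying a different right-hand side.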

G. The Rössler Chaotic Time Series
In this example, the data is derived from the Rössler system [27], which is given by three differential equations. The data set is constructed by using the fourth-order Runge-Kutta method from tabulated initial values.

H. The Ikeda Chaotic Time Series
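This subsection introduces the Ikeda map [28], which is used to generate the IK time series. In its commonly used form, the Ikeda map is given as follows:
$$x_{n+1} = 1 + \mu \left[ x_n \cos(t_n) - y_n \sin(t_n) \right], \qquad y_{n+1} = \mu \left[ x_n \sin(t_n) + y_n \cos(t_n) \right], \qquad t_n = 0.4 - \frac{6}{1 + x_n^2 + y_n^2}.$$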

I. The Quadratic Chaotic Time Series
The quadratic map is a one-dimensional recurrence governed by a parameter μ. If this mapping is iterated with μ = 4, starting from a random number in the interval between 0 and 1, dramatically different behavior is obtained depending on the initial value of x. Initial values of x which are quite close together can have dramatically different iterates.
This unpredictability, or sensitive dependence on initial conditions, is a property familiar in systems displaying chaotic behavior [29] over a range of values of the parameter μ, including the values chosen here.

III. BAYESIAN ENHANCED APPROACH
This section presents a new method for time series forecasting by focusing on three aspects: the formalization of one-step forecasting problems as supervised learning tasks, the discussion of modeling with Bayes inference techniques as an effective tool for dealing with temporal data, and the key role of the forecasting strategy when multiple-step-ahead forecasting is used.
The increasing availability of large amounts of historical data and the need to perform accurate forecasting of future behavior in several scientific and applied domains demand the definition of robust and efficient techniques able to infer from observations the stochastic dependency between past and future. The forecasting domain has historically been influenced by linear statistical methods such as ARIMA models. More recently, machine learning models have drawn attention and have established themselves as serious contenders to classical statistical models in the forecasting community.
In this research, the Bayes assumption is used to update a prior distribution into a posterior distribution by incorporating the information provided as a likelihood function from the fractional Brownian model, supplied by the neural network weights fitted to the observed data. Point and interval forecasts are then generated by combining all the information and sources of uncertainty into a predictive distribution for the future values.
A weight vector w defines a mapping from an input vector x to a predicted output vector ŷ given by
$$\hat{y} = f(x, w).$$
Assuming a fractional Brownian model, this mapping induces a conditional probability distribution for the output given the input vector $x_l$. The regression problem involves the corresponding neural network function y(x, w) and a data set consisting of N pairs of input vectors $x_l$ and targets $t_n$ (n = 1, ..., N).
To complete the Bayesian enhanced approach for this work [30], prior information for the network is required. The Beta distribution is chosen for this purpose. The Beta density function is a very versatile way to represent outcomes like proportions or probabilities. It is defined on the continuum between 0 and 1. Two parameters, α and β, work together to determine whether the distribution has a mode in the interior of the unit interval and whether it is symmetrical. This is a probability model which describes the knowledge gained after observing a set of data. It is proposed to use fractional Brownian motion, where H is the Hurst parameter, assuming that the expected scale of the weights is given by w, set by hand. The Beta prior distribution for H is
$$p(H) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \, \Gamma(\beta)} \, H^{\alpha - 1} (1 - H)^{\beta - 1}, \qquad 0 < H < 1.$$
The full probability model is derived from the product of the likelihood and the prior, and the posterior distribution of H follows from Bayes' rule as proportional to that product. The Metropolis-Hastings algorithm was used for computation, with a starting value of 0.1. The number of iterations was set to 10,000. The Monte Carlo error was used to examine convergence. This was carried out considering that the network function $f(x_{n+1}, w)$ is approximately linear with respect to w in the vicinity of this mode; in fact, the predictive distribution for $y_{n+1}$ will be another multivariate Gaussian.
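The sampling step can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that the likelihood is that of unit-variance fractional Gaussian noise (the increments of fBm), evaluated with the Durbin-Levinson recursion, that the proposal is a Gaussian random walk with standard deviation 0.05, and that α = β = 2. All function names and the stand-in data are hypothetical. The starting value 0.1 and the default of 10,000 iterations follow the text, while the demonstration run uses fewer iterations to keep the pure-Python example fast.

```python
import math
import random


def fgn_autocov(k, hurst):
    """Autocovariance at lag k of unit-variance fractional Gaussian noise."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(k + 1) ** h2 - 2.0 * abs(k) ** h2 + abs(k - 1) ** h2)


def fgn_loglik(x, hurst):
    """Exact Gaussian log-likelihood via the Durbin-Levinson recursion."""
    n = len(x)
    gamma = [fgn_autocov(k, hurst) for k in range(n)]
    v = gamma[0]   # one-step prediction error variance
    phi = []       # coefficients of the best linear predictor
    loglik = -0.5 * (math.log(2.0 * math.pi * v) + x[0] ** 2 / v)
    for t in range(1, n):
        k = (gamma[t] - sum(phi[j] * gamma[t - 1 - j] for j in range(t - 1))) / v
        phi = [phi[j] - k * phi[t - 2 - j] for j in range(t - 1)] + [k]
        v *= 1.0 - k * k
        pred = sum(phi[j] * x[t - 1 - j] for j in range(t))
        loglik += -0.5 * (math.log(2.0 * math.pi * v) + (x[t] - pred) ** 2 / v)
    return loglik


def log_posterior(x, hurst, alpha, beta):
    """Unnormalized log posterior: fGn log-likelihood plus Beta log prior."""
    if not 0.0 < hurst < 1.0:
        return -math.inf
    log_prior = (alpha - 1.0) * math.log(hurst) + (beta - 1.0) * math.log(1.0 - hurst)
    return fgn_loglik(x, hurst) + log_prior


def metropolis_hastings(x, n_iter=10000, start=0.1, step=0.05,
                        alpha=2.0, beta=2.0, seed=0):
    """Random-walk Metropolis-Hastings chain for the Hurst parameter H."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(x, current, alpha, beta)
    samples = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(x, proposal, alpha, beta)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


data_rng = random.Random(1)
increments = [data_rng.gauss(0.0, 1.0) for _ in range(64)]  # stand-in data
chain = metropolis_hastings(increments, n_iter=400)
posterior_mean = sum(chain[200:]) / len(chain[200:])  # first half as burn-in
```

A symmetric proposal is chosen so that the acceptance ratio reduces to the ratio of posterior densities; proposals outside (0, 1) receive zero posterior mass and are always rejected.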

IV. PREDICTION RESULTS
The simulation results for different order approximations and time periods are presented in Table VIII. The performance is measured by the Symmetric Mean Absolute Percent Error (SMAPE) and the Root Mean Square Error (RMSE), which are used in most metric evaluations [31]. In these measures, t is the observation time, n is the size of the test set, s is each time series, and $X_t$ and $F_t$ are the actual and the forecasted time series values at time t, respectively. The SMAPE (in percent) and the RMSE of each series s measure the error between the actual value $X_t$ and its corresponding forecast value $F_t$ across all observations t of the test set of size n. The Monte Carlo method was used to forecast the next 18 values of the benchmark chaotic and rainfall time series. Such outcomes are shown in Fig. 3 to Fig. 8. Here, previous algorithms [10] [11] [13] [14] are used for comparison with the Bayesian enhanced approach.
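Both indices can be computed as in the sketch below. The SMAPE variant shown, with the mean of the absolute actual and forecast values in the denominator, is an assumption, since several definitions are in use and the one in [31] may differ. Terms where both values are zero, which can occur in months without rainfall, are counted as zero error by choice. The function names `smape` and `rmse` are illustrative.

```python
import math


def smape(actual, forecast):
    """Symmetric mean absolute percent error over a test set, in percent."""
    total = 0.0
    for x, f in zip(actual, forecast):
        denom = (abs(x) + abs(f)) / 2.0
        if denom > 0.0:  # a pair of zeros contributes no error
            total += abs(x - f) / denom
    return 100.0 * total / len(actual)


def rmse(actual, forecast):
    """Root mean square error over a test set."""
    squared = sum((x - f) ** 2 for x, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


example_smape = smape([100.0, 200.0], [110.0, 180.0])
example_rmse = rmse([100.0, 200.0], [110.0, 180.0])
```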
Comparisons are performed between the Bayesian and NAR models using long-term time series: in this case, 15 years of monthly rainfall data (2000-2014) served as the historical data to forecast 2015, together with the benchmark chaotic time series proposed in the literature. The results in Table VIII show that the Bayesian enhanced approach (BEA) was somewhat superior for the lengthy time series, with a SMAPE and RMSE about one half those of the Bayesian approach (BA) and the neural network modified predictor filter (NN-Mod).
The results show that the performance of the BEA is better than that of BA and NN-Mod in terms of SMAPE and RMSE, due to the existence of outliers. With this lengthy series, BEA could adequately detect the underlying relationship among the correlated variables. The simulation results of the BEA method are compared with those of BA and NN-Mod in Table VIII for the benchmark time series. The similarity of the trend of the prediction performance between them is clear; BEA is slightly better, particularly on the rainfall time series, with reference to the BA and NN-Mod approaches. Although the comparison was performed on ANN-based filters, the experimental results confirm that the enhanced Bayesian method can predict chaotic time series more effectively in terms of the SMAPE and RMSE indices when compared with other existing forecasting methods in the literature. However, the wish to preserve the stochastic dependencies constrains all the horizons to be forecasted with the same model structure. Since this constraint could reduce the flexibility of the forecasting approach, a variant of the BEA approach remains an open question. Fig. 10 and Fig. 11 show the evolution of the SMAPE and RMSE indices for the BEA, BA and NN-Mod filters, which use the H parameter to adjust heuristically either the structure of the net or the parameters of the learning rule.
A new approach for time series forecasting, Bayesian enhanced by fractional Brownian motion with application to rainfall series, is presented. Building effective predictors from historical data demands computational and statistical methods for inferring dependencies between past and long-term future values of observed series, as well as appropriate strategies to deal with longer horizons [32]. This work showed and discussed the BEA supervised learning technique to deal with long-term forecasting problems. In particular, the fBm model assumption is stressed in Bayes inference by local learning approximators in dealing with important issues in forecasting, like nonlinearity and nonstationarity. The main results show a good performance, in terms of the SMAPE and RMSE indices, of the predictor system based on the Bayesian enhanced approach, particularly on rainfall time series from a geographical observation point, such as Despeñaderos, Cordoba, Argentina.
Future research should be concerned with the extension of these techniques to some recent directions in big data and the application to spatiotemporal tasks.These results encouraged us to continue working on new machine learning algorithms using novel forecasting methods.

Fig. 9. The SMAPE index applied over the 15 time series

TABLE I .
PARAMETERS TO GENERATE MG TIME SERIES

TABLE II .
PARAMETERS TO GENERATE LOG TIME SERIES

TABLE III .
PARAMETERS TO GENERATE HEN TIME SERIES

TABLE IV .

TABLE V .
PARAMETERS TO GENERATE ROS TIME SERIES
The initial values for the ROS series are listed in Table V, and the step size is chosen as 0.01.

TABLE VI .
PARAMETERS TO GENERATE IK TIME SERIES

TABLE VII .
PARAMETERS TO GENERATE QUA TIME SERIES

For the selected time series, the first 102 values are used for training and the remaining 18 values are kept as validation and test data. The long-term behavior changes thoroughly when the initial conditions are changed, which yields the stochastic dependence of the deterministic time series according to its roughness, assessed by the H parameter. Then, an extra 18 testing data points are used to measure the prediction performance.

TABLE VIII .
RESULTS OF THE FORECASTING APPROACHES

This paper reports the results of a comparison of three different forecasting techniques for a class of high roughness long-term time series. The series were selected according to the long- or short-term stochastic dependence of the time series, assessed by the Hurst parameter H, to give a forecast horizon of 18. The rainfall forecasts obtained with those algorithms are compared with those of NAR ANN predictors, namely BA and NN-Mod, for a case study in the southwestern province of Cordoba. The study analyzed and compared the relative advantages and limitations of each time-series predictor filter technique used for rainfall and chaotic time series prediction. How feed-forward networks can successfully approximate the quantitative changes in the dynamics of the time series data due to changes in the parameter values of the exogenous variables remains open for study.