Hybrid Forecasting Scheme for Financial Time-Series Data using Neural Network and Statistical Methods

Time-series prediction is currently an active research area within temporal data mining. Forecasting financial time series (FTS) is regarded as one of the most challenging tasks because the data are non-linear, non-stationary and noisy, with a high degree of uncertainty and hidden relations. Several single models proposed under both statistical and data mining approaches have been unable to deal with these issues. The main objective of this study is to propose a hybrid model that uses additive and linear regression methods to combine linear and non-linear models. Three models are investigated, namely ARIMA, EXP and ANN. First, these models are fed with an exchange rate data set (SDG-EURO). Then, the numerical outcome of each model is examined against benchmark models and a set of hybrid models from the related literature. The results show the superiority of the hybrid model over all other investigated models, with a MAPE of 0.82% as the accuracy measure. Based on these results, we conclude that further experiments are desirable to estimate the weights for an accurate combination method, and that more models need to be surveyed in the area of series prediction.
Keywords: Financial time series; hybrid model; additive combination; regression combination; exchange rate


I. INTRODUCTION
The financial domain is one of the most widely used settings for economic research, which makes financial safety and security an important concern [1]. The currency exchange rate is defined as the price of a country's currency expressed in terms of a foreign currency [2], [3]. In modern FTS prediction, exchange rate forecasting has been recognized as one of the most difficult applications [4]. Thus, a number of models have been designed to support stakeholders with intelligent, precise predictions.
In addition, researchers have proposed various conventional prediction models. Even so, traditional statistical models such as ARIMA, ARFIMA [5], ARMA, ARCH, GARCH, EXP and AR are unable to capture the complexity and behavior of the exchange rate [6]. Many researchers have introduced advanced non-linear machine learning techniques, including Artificial Neural Network (ANN) models [7], the SVM algorithm and SVR methods [8], and data mining models such as the KNN algorithm [9]. The exponential smoothing model (EXP) is a linear model used to predict the characteristics of linear time series, and it is applied substantially in administration and finance prediction [10], [11]. However, some series include both linear and non-linear components. The EXP model depends on previously observed periods; one major drawback of this approach is that it is unable to predict the characteristics of non-linear time series, and as a linear model it is often inefficient in predicting complex data. Accordingly, it is necessary to consider non-linear models to address the limitations of the EXP model [12].
The ARIMA model, a linear approach developed by Box and Jenkins, has been used successfully in time-series forecasting. However, its main disadvantage is that it is less efficient in fitting complicated and non-linear time series.
Recently, ANN models have become known for their ability to identify the non-linear characteristics present in time-series data, especially in FTS forecasting [7], [13]. Multi-layer ANNs have been applied to exchange rate prediction with sensible results. However, this model has disadvantages in the case of complicated time series, that is, series that contain linear and non-linear patterns at the same time [7], [14]. Consequently, it is not appropriate to rely on non-linear models alone to predict complicated time series, because these models might not account for the linear characteristics present in the series [15].
In summary, FTS forecasting is difficult because of inherent characteristics such as non-linearity, non-stationarity, noise, a high degree of uncertainty and hidden relationships. Single machine learning models and conventional statistical techniques have failed to capture the non-stationary property of such series and to describe their moving tendency accurately. Thus, many hybrid model designs and new algorithms have been developed in the literature to reduce the influence of noise and enhance prediction performance.

II. RELATED WORKS
This section reviews the main directions of recent work on time-series forecasting problems. This study is motivated by the general evidence that combined forecasting models obtain better results than single models. Zhang [16] made an early investigation of a hybridization of ARIMA [17] and ANN models. In this combined method, the linear correlation structure of the time series is modeled by ARIMA, and the remaining residuals, which contain the non-linear part, are modeled by ANN. Evaluating the proposed model on three real-life data sets, Zhang [16] showed that the hybrid scheme provided reasonably better accuracy and outperformed each component model. Further extensions in [7], [18] enhanced this methodology by proposing a similar but slightly modified, stronger combination technique.
www.ijacsa.thesai.org
Several techniques have been proposed to combine different time-series models with ARIMA. For example, Aladag et al. [19] proposed a hybrid model using the non-linear ERNN and linear ARIMA models and evaluated it on the Canadian lynx data set; the study reached good prediction accuracy using mean square error (MSE) as the evaluation measure. Furthermore, Javedani et al. [20] proposed an ARFIMA-FTS hybrid model, validated on the common TAIEX and DJIA data sets together with exchange rate data for nine main currencies versus the USD. Based on the reported results, they concluded that more effective hybridization methods should be applied in financial time series forecasting, which underlines the importance of this research field.
Recently, ANN models have been recognized for their ability to detect the non-linear characteristics present in FTS data; hence, ANN models have been widely combined with additional methods in the field of time-series forecasting [21]. For example, Adhikari et al. established a non-linear weighted ensemble method that considers both the individual forecasts and the correlation between pairs of forecasts. Their scheme offered improved forecasting accuracy for three general time-series data sets [22]. Similarly, Adhikari combined ARIMA with FANN, EANN and SVM to predict eight well-known time-series data sets for stock exchange price prediction; this study achieved significantly better accuracy than each single component model. Moreover, a variety of neural networks can be used as non-linear algorithms [23]. Khashei et al. proposed a method that combines ARIMA and the PNN algorithm [24]. Experimental results on three well-known data sets (British pound/US dollar exchange rate, Wolf's sunspot and lynx data) indicate that the hybrid model significantly outperformed the individual ARIMA and ANN models, as well as Zhang's hybrid (ANN/ARIMA) model, on both error measures (MSE and MAE), so the proposed method can be an effective technique for building an accurate hybrid model [25].
There are also several studies on integrating EXP models with ANN. Lai et al. hybridized EXP and ANN for financial time series prediction to take full advantage of both linear and non-linear modeling in the hybrid model [11]. Furthermore, Yan Chan et al. [26] presented a novel ANN training method that employs the hybrid EXP-LM for short-term traffic flow forecasting. The EXP model was used to eliminate the lumpiness from traffic flow data before applying LM for training. The results indicate that, overall, the test errors obtained by EXP-LM are smaller than those obtained by the other established algorithms. Hua et al. presented a novel hybrid model of FLANN based on KR for modeling and forecasting the exchange rate of the US dollar against the British pound, Indian rupee and Japanese yen. They processed the exchange rate data sets with KR to smooth the noise. The smoothed data sets were then non-linearly expanded using sine and cosine expansions before being fitted to the FLANN model. The experimental results showed that the FLANN-KR hybrid model outperformed the compared models in different prediction aspects [27].
As shown above, the most important finding of this review is that a single model cannot meet the forecasting accuracy required in practice, and no single model is appropriate for every condition. There is a scalability issue in using ANN machine learning algorithms to deal with non-linear characteristics and the ARIMA time-series model to forecast the linear parts. The literature on hybridizing the EXP and ARIMA models with other non-linear methods is scarce. Furthermore, there is little literature addressing the limitation of the ARIMA model in weighting older observed values.
Therefore, this study proposes a hybrid model that relies on a weighted moving average for the linear features, integrated with ANN, and that takes into account the features of financial asset value time series: non-linearity, non-stationarity, trend, random noise and a high degree of uncertainty.
Finally, the rest of this paper is organized as follows: Section III briefly describes the individual models; Section IV presents the proposed combination methods; Section V describes the evaluation measures; Section VI covers the data source of this study, the experimental results and the discussion; and Section VII presents the conclusion and possible future work.

III. OVERVIEW OF BENCHMARK MODELS
This study applies three models, described as follows.

A. ARIMA Model
ARIMA models were introduced by Box and Jenkins. This methodology refers to the procedures for identifying, fitting and checking ARIMA models with time-series data; forecasts then follow directly from the fitted model formula [28]. Equation (2) illustrates the ARIMA model as follows:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t \qquad (2)$$

where $Y_t$ denotes the dependent variable at time $t$, $Y_{t-i}$ is the response variable at time lag $t-i$, $\phi_i$ are the coefficients to be estimated with $i \in \{1, 2, \dots, p\}$, and $\varepsilon_t$ denotes the error term at time $t$ [28], [29].
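As a minimal sketch of the autoregressive relation described above, in which $Y_t$ is a weighted sum of its $p$ lagged values plus an error term, the following Python snippet computes a one-step forecast and shows how first-order differencing (the "I" in ARIMA) could be undone afterwards. The function names, the coefficient values and the sample rates are illustrative assumptions; they are not the fitted parameters of this study, and the moving-average terms of a full ARIMA model are not covered.

```python
def ar_forecast(history, phi):
    """One-step forecast: phi[0]*Y_{t-1} + ... + phi[p-1]*Y_{t-p} (error term omitted)."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    return sum(phi[i] * history[-1 - i] for i in range(p))


def difference(series):
    """First-order differencing, commonly used to remove a trend before fitting."""
    return [b - a for a, b in zip(series, series[1:])]


def forecast_level(series, phi):
    """Forecast the next level by forecasting the next difference and adding it back."""
    return series[-1] + ar_forecast(difference(series), phi)


# Hypothetical coefficients and data, for illustration only.
rates = [6.80, 6.82, 6.85, 6.89]
print(forecast_level(rates, phi=[0.5, 0.25]))
```

In practice the coefficients would be estimated from the data by a statistical package rather than chosen by hand.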

B. Exponential Smoothing Model (EXP)
The EXP model is a technique based on a weighted moving average of predicted values. This method gives less weight to older data. Equation (1) illustrates the EXP model as follows:

$$F_{t+1} = \alpha y_t + (1 - \alpha) F_t \qquad (1)$$

where $F_{t+1}$ is the prediction for the next period, $\alpha$ is a fixed exponential smoothing constant between zero and one, $y_t$ is the new observation of the time series $y$, and $F_t$ is the previous smoothed prediction for period $t$, computed using $F_{t-1}$. The model relies on an optimal value of $\alpha$ [12].
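The smoothing recursion can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in the study: the function name, the choice of the first observation as the starting forecast, the sample rates and the value of the smoothing constant are all assumptions.

```python
def exp_smoothing_forecasts(series, alpha, initial=None):
    """Return [F_1, ..., F_{n+1}] for observations [y_1, ..., y_n].

    Each step applies F_{t+1} = alpha * y_t + (1 - alpha) * F_t.
    F_1 is set to `initial`, assumed here to default to the first observation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between zero and one")
    if not series:
        raise ValueError("series must not be empty")
    forecasts = [series[0] if initial is None else initial]
    for y in series:
        forecasts.append(alpha * y + (1.0 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical exchange-rate values and smoothing constant.
rates = [6.80, 6.82, 6.85, 6.89]
forecasts = exp_smoothing_forecasts(rates, alpha=0.3)
print("forecast for the next period:", forecasts[-1])
```

A value of the smoothing constant close to one makes the forecast follow the latest observation, while a value close to zero keeps it near the earlier smoothed level.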

C. Artificial Neural Network Model (ANN)
The ANN model is a mathematical technique designed to perform a variety of tasks. Neural networks have been studied for many years, but research became prominent from the 1980s onward. At present, applications of neural networks have emerged in several areas, for example in modeling, classification and prediction [30].
ANNs are characterized by qualities that help them reach distinctive solutions in applications that aim to identify linear and non-linear models [13]. Fig. 1 illustrates a typical multi-layer network, in which the input nodes are used to feed in the time-series data, the output node is used to calculate the forecasts, and the hidden nodes, each associated with an appropriate transfer function, process the data received from the input nodes [31].
In this network, one vector holds the weights between the $n$ hidden nodes and the output node, and a second vector holds the weights between the $m$ input nodes and the hidden nodes.
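To make the role of the two weight sets concrete, the following Python sketch computes the output of a single-hidden-layer network with $m$ inputs, $n$ hidden nodes and one output node. The logistic transfer function, the bias terms, the function names and all weight values are assumptions made for illustration; in a real application the weights would be learned from the training data.

```python
import math


def logistic(x):
    """Logistic transfer function (an assumed choice for the hidden nodes)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forecast(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of an m-n-1 network.

    hidden_weights[j] holds the m weights from the inputs to hidden node j;
    output_weights[j] is the weight from hidden node j to the output node.
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        activation = bias + sum(w * x for w, x in zip(weights, inputs))
        hidden.append(logistic(activation))
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


# Hypothetical 1-5-1 network: one lagged rate in, five hidden nodes, one forecast out.
hidden_weights = [[0.2], [-0.1], [0.05], [0.3], [-0.25]]
hidden_biases = [0.0, 0.1, -0.1, 0.2, 0.0]
output_weights = [1.5, 2.0, 1.0, 2.5, 3.0]
print(mlp_forecast([6.88], hidden_weights, hidden_biases, output_weights, output_bias=0.5))
```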

IV. PROPOSED COMBINATION METHODS
In this section, we consider two methods for combining the individual forecasts produced by the EXP, ARIMA and ANN models, in order to find the best model for the time-series forecasting problem.
The two combining methods are the additive combination method and the linear regression combination method. Brief details of these methods are given below:

A. Additive Combination Method
This method consists of two parts, a linear part and a non-linear part, as in (5). The first experiment compares two combined models, namely EXP-ANN and ARIMA-ANN:

$$y_t = F_1 + F_2 \qquad (5)$$

where $F_1$ represents the linear part and $F_2$ represents the non-linear part of the time series.
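The additive idea can be sketched as follows, under the assumption (borrowed from Zhang's scheme reviewed in Section II) that the non-linear model is fitted to the residuals left by the linear model. The function names are hypothetical, and a simple mean of recent residuals stands in for the ANN purely to keep the example self-contained.

```python
def residuals(actual, linear_fit):
    """Part of the series that the linear model (F1) did not explain."""
    return [y - f1 for y, f1 in zip(actual, linear_fit)]


def mean_residual_model(past_residuals, window=3):
    """Placeholder for the non-linear model: mean of the last `window` residuals."""
    recent = past_residuals[-window:]
    return sum(recent) / len(recent)


def additive_forecast(linear_forecast, nonlinear_forecast):
    """Combined forecast F1 + F2."""
    return linear_forecast + nonlinear_forecast


# Hypothetical values: observed rates, in-sample linear fit, next linear forecast.
actual = [6.80, 6.82, 6.85, 6.89]
linear_fit = [6.79, 6.81, 6.83, 6.86]
next_linear = 6.90

f2 = mean_residual_model(residuals(actual, linear_fit))
print(additive_forecast(next_linear, f2))
```

Replacing `mean_residual_model` with a trained network would give an EXP-ANN or ARIMA-ANN style hybrid, depending on which linear model produces `linear_fit`.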

B. Linear Regression Combination Method
In the second method, all three models (ARIMA, ANN and EXP) are combined into one hybrid model, as shown in Fig. 3. The three models are fed the same input values, and the output of each is used as an independent predictor for the hybrid model. To estimate the contribution weight of each predictor, we apply linear regression to them; the combined forecast is then the weighted sum of the three model outputs, with the fitted regression coefficients as weights. In brief, the proposed hybrid methods consist of: 1) using the ARIMA and EXP time-series models to analyze the linear characteristics; 2) using the machine learning MLP-ANN model to deal with the non-linear characteristics; and 3) applying a combination method for hybridization (additive or regression). Consequently, the predictions derived from all models are summed separately. Hence, the combined scenarios draw on the strengths of the ARIMA, EXP and ANN models.
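A regression-based combination of this kind can be sketched with ordinary least squares. The snippet below is an illustrative assumption rather than the procedure used in the study: it fits weights without an intercept, solves the normal equations by Gaussian elimination, and uses made-up forecasts. The names `regression_weights` and `combine` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def regression_weights(predictions, actual):
    """Least-squares weights; each row of `predictions` is [arima, ann, exp]."""
    k = len(predictions[0])
    xtx = [[sum(r[i] * r[j] for r in predictions) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(predictions, actual)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def combine(row, weights):
    """Hybrid forecast: weighted sum of the individual model forecasts."""
    return sum(w * f for w, f in zip(weights, row))


# Made-up forecasts from three models and the matching observed values.
predictions = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
actual = [0.2, 0.3, 0.5, 1.0]
weights = regression_weights(predictions, actual)
print(weights, combine([6.86, 6.90, 6.88], weights))
```

Forecasts from models fitted to the same series tend to be highly correlated, so the normal equations can be ill-conditioned on real data; a numerical library with a dedicated least-squares routine would be the more robust choice there.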

V. PERFORMANCE MEASURES
Two kinds of measures are used to evaluate the performance of the proposed forecasting model:

A. Statistical Measures
Several statistical measures are used to assess model accuracy, where the model with the lowest error is preferred [32]. These measures are listed in Table 1. Based on the observed results, we used the mean absolute percentage error (MAPE) as the main benchmark [33].
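For reference, the error measures named in this paper (MSE, RMSE, MAE and MAPE) can be computed as in the following Python sketch. It uses the standard textbook definitions, which are assumed here and may differ in detail from those in Table 1; the function name and the sample values are hypothetical.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so actual values must be non-zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Hypothetical actual and predicted exchange rates.
print(forecast_errors([6.80, 6.82, 6.85], [6.79, 6.84, 6.83]))
```

Because MAPE is scale-free, it allows results on different series to be compared, which is one common reason for preferring it as a headline measure.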

B. Similarity Fitting Test
Furthermore, an empirical Cumulative Distribution Function (CDF) plot is used as a visual measure of the difference between the fitted distribution of each model's output and the normal distribution of the data. The empirical CDF plot compares the deviation between the theoretical and the empirical values [34].
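The empirical CDF itself is straightforward to compute. The Python sketch below builds the step function from a sample and reads off a percentile using the nearest-rank rule. Both function names and the sample values are assumptions for illustration, and the nearest-rank rule is only one of several percentile conventions; a statistics package that fits a normal curve to the data would generally report slightly different percentile values.

```python
def empirical_cdf(sample):
    """Return sorted (value, cumulative proportion) pairs for the sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def nearest_rank_percentile(sample, q):
    """Smallest observed value whose empirical CDF is at least q percent (integer q)."""
    if not sample or not 0 < q <= 100:
        raise ValueError("need a non-empty sample and 0 < q <= 100")
    ordered = sorted(sample)
    rank = (q * len(ordered) + 99) // 100  # integer ceiling of q * n / 100
    return ordered[rank - 1]


# Hypothetical predicted exchange rates from one model.
predicted = [6.71, 6.95, 6.80, 6.88, 6.76, 7.02, 6.91, 6.84, 6.99, 6.87]
print(empirical_cdf(predicted)[:3])
print("90th percentile:", nearest_rank_percentile(predicted, 90))
```

Plotting the pairs returned by `empirical_cdf` for the actual data and for each model on the same axes gives the kind of visual comparison described above.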

VI. EXPERIMENTAL RESULTS AND DISCUSSION

A. Exchange Rate Data Set
To meet the objectives of this study, we investigate the daily exchange rate of the Euro against the Sudanese pound (SDG) in the Sudanese market. The data were collected from the Bank of Sudan and cover the period from 3 July 2016 to 1 December 2016. The time axis is expressed in months.

B. Benchmark Models Results
This section presents the linear models (EXP, ARIMA) and the non-linear ANN model and their ability to capture the prediction pattern in the historical exchange rate data. These models are applied both separately and in combination to demonstrate their predictive ability on real exchange rate data. In addition, this paper presents a new hybrid model based on the ANN, EXP and ARIMA methods, constructed to predict the next-day closing price of the SDG against the EURO. To establish the validity of the proposed method, the results of the single models are compared with those of the proposed hybrid models.
After fitting the individual models, Fig. 4(a) shows the actual testing data of the SDG-EURO daily closing exchange rate and the values predicted by the single models (EXP, ARIMA and ANN). Five test runs were performed on the residuals to determine whether each model is adequate for the data and to make the forecasting results more stable. The simple EXP model, ARIMA(0, 1, 1) and MLP 1-5-1 were selected. Table 3 summarizes the values predicted by the selected models fitted to the historical data. It displays statistical measures based on the one-step-ahead forecast errors. As can be observed from Fig. 4(a), all models generated good predictions, and the forecast values are close to the actual values. Among the single models, the ANN model is the best for forecasting the SDG-EURO data, with better fit and higher forecasting accuracy.

C. Additive Combination Results
After fitting the additive combination technique, two hybrid models were generated. Fig. 4(b) shows the actual SDG-EURO closing price of the testing data set and the values predicted by the hybrid models (ANN-EXP, ANN-ARIMA). From Table 3 it can similarly be observed that the forecast values from all models are close to the actual values. Table 2 summarizes the performance errors of each hybrid model fitted to the historical data. As Fig. 4(b) shows, ANN-ARIMA does not perform well when forecasting the SDG-EURO data: the MAPE increased from 1.46% for ARIMA to 1.57% for ANN-ARIMA. This may be caused by the weak forecasting stability of ANN; although ARIMA can optimize its parameters, the resulting improvement in stability is weak. By contrast, the MAPE decreased from 1.76% for EXP to 1.59% for ANN-EXP. Still, the forecasting ability of ANN-ARIMA is better than that of ANN-EXP (1.57% versus 1.59% MAPE), because ANN-ARIMA deals well with data such as the SDG-EURO time series.

D. Linear Regression Combination Results
After fitting the regression combination method over all models (ANN+EXP+ARIMA), one hybrid model was generated. Fig. 4(c) shows the actual SDG-EURO values from the data set and the values predicted by the hybrid model. The weights of the composite model were estimated by linear regression, and the regression equation of the preferred model is given below:
0.35*( ) 0.40*( ) 0.95*( )
The correlation coefficient (r) between the variables in the hybrid equation equals 0.83, which measures the efficiency of the composite model and indicates that these variables are positively correlated. From the evaluation measures in Table 2, the regression combination method for the proposed hybrid model (ANN+EXP+ARIMA), which is based on weighting, improves the forecasting accuracy, reaching a MAPE of 0.82%. The hybrid model keeps the MAPE within 2%; the forecasting quality and results are shown in Fig. 4(c) and Table 2. The figure indicates that the hybrid model fitted to the SDG-EURO data performs well when measured by different evaluation metrics. A smaller MAE means higher forecasting accuracy, a lower RMSE indicates a better fit to the daily exchange rate, and MAPE is an index of the forecasting ability of the model. At present, for the SDG-EURO data, the best standard is about 0.97%. From the average MAE over five experiments, ANN has the smallest value, indicating the best forecasting accuracy.
Moreover, the smallest RMSE not only means that the hybrid model fits the SDG-EURO time series well, but also indicates that the forecasting results from the model are consistent. Compared with the single forecasting models, the hybrid model is the most suitable for forecasting the SDG-EURO time-series data, with better fit and higher forecasting capacity.

E. Forecasting Analysis and Comparisons
To compare the performance of the different models, the benchmark models (ANN, ARIMA and EXP) were first fitted individually to forecast the exchange rate. The six models (EXP, ARIMA, ANN, ANN-EXP, ANN-ARIMA and ANN+EXP+ARIMA) are compared according to five evaluation criteria (MSE, RMSE, MAPE, MAE and SD), as reported in Table 2.
The relative errors of all models are shown in Fig. 4(a) and (d). From Table 3, it can also be observed that the predicted values from all models are close to the actual values. Table 2 summarizes the performance errors of each hybrid model fitted to the historical data. The empirical analysis confirms that the MAPEs of all hybrid models are within 2%, which indicates that the hybrid forecasting models perform well. In detail, the hybrid model (ANN+EXP+ARIMA) based on the weighting combination method proposed in this paper keeps the MAPE below 2%; thus, the relative errors of the hybrid model are much smaller than those of the other models. This observation suggests that the weighted combination method can reduce the noise contained in the time series and enhance accuracy, and that it has a strong ability to fit non-linear data. Table 3 also shows the convergence of the predicted values to the actual values for the hybrid model, which confirms that the hybrid model is a convenient and efficient model for predicting the currency exchange rate. Moreover, each method was run five times, and the standard deviation was calculated. The standard deviations of the SDG-EURO results are relatively small for all models, which indicates that the models do not behave randomly.

F. Similarity Fitting Test and Analysis
Fig. 5 and Table 4 present the estimated nth percentiles for all models. We use the 90th percentile as the benchmark for all tests. An empirical CDF graph is created to compare the fitted distributions for each model and to estimate the 90th percentile of each set of predictions. We want to assess how the two combination methods affect the distribution of the SDG-EURO predictions. The ANN-EXP and ANN-ARIMA models did not appear to reduce the SDG-EURO predicted values, as shown by the rightward shift in the fitted line and the larger mean predicted values (6.891 and 6.880, compared with 6.879 for the actual data). The ANN-EXP and ANN-ARIMA models also seemed to decrease the variability of the predicted values, as evidenced by the steeper slope of the fitted line and the smaller standard deviations (0.1380 and 0.05175, compared with 0.1973). Finally, the hybrid model appears to reduce the SDG-EURO predicted values, as evidenced by the leftward shift in the fitted line and the smaller mean predicted value (6.877, compared with 6.879 for the actual data). The hybrid model also appears to reduce the variability of the predicted values, as evidenced by the steeper slope of the fitted line and the smaller standard deviation (0.1702, compared with 0.1973). However, appropriate tests would have to be conducted to confirm these observations. Overall, the weighted method is more efficient than the additive method: the hybrid model gives predicted values with a mean of 6.877 and a standard deviation of 0.1702.

G. Comparison of Hybrid Model Performance with Literature
Finally, the performance of the best hybrid model in this study is compared with several of the aforementioned models in the literature, such as [15], [16], [20], [35], [36], as shown in Table 5. Among the compared error values for all models, the proposed model (ANN+EXP+ARIMA) achieves the lowest MAPE, 0.82%. Therefore, the proposed hybrid model outperforms the models investigated in the literature. The superior performance of the hybrid model (ANN+EXP+ARIMA) reflects its ability to capture each trend and regularity within the original time series, which enhances financial series prediction with a high accuracy rate. Besides, in contrast to the conventional ANN and ARIMA models, EXP has strong generalization, robustness, fault tolerance and convergence abilities.

VII. CONCLUSION AND FUTURE WORKS
Forecasting financial data is a major challenge for time-series analysts, researchers and everyone working in data mining. Although numerous time-series models are available, research on improving the effectiveness of time-series prediction has never stopped. To overcome the deficiencies of commonly used models and to yield more accurate results, this study proposed two combination methods that jointly use the ANN machine learning model and the EXP and ARIMA statistical models to capture both the linear and non-linear characteristics present in time-series data. The proposed method was applied to the SDG-EURO exchange rate case study. The experimental results show that the proposed hybrid model (ANN+EXP+ARIMA) significantly outperforms the additive method for financial modeling and prediction. It is a valuable tool for forecasting tasks, particularly when higher forecasting accuracy is required. This supports the validity of the proposed forecasting methodology. We can conclude with the following findings from this study:
1) Methodological contribution: this study proposed an improved hybrid method, applied it to exchange rate forecasting and compared it with the most closely related works reviewed in Section II.
2) The proposed model tries out innovative combination methods, tested experimentally in the financial field for concerned parties, and obtains suitable results. In particular, our review of previous studies indicates that the practical application framework of the proposed model is to identify objectively the weight of each model and then to combine the linear and non-linear models to build a forecasting model.
3) This study fills knowledge gaps by highlighting the importance and significance of ANN, EXP and ARIMA as predictors, providing the rationale for the proposed model. Thus, this study makes a methodological contribution from a theoretical point of view.
4) Novel contribution to researchers: the proposed model demonstrated its strength with promising results in the financial application field of exchange rates. The proposed model makes a novel contribution to solving the weighting problem of time-series models.
5) Scalability of the evaluation measures: both statistical tests to estimate errors and a similarity goodness-of-fit test by visual observation with the empirical CDF were used to show that the proposed model outperforms the other listed models.
6) The weighted method was shown to be the best combiner among the suggested combination methods, so it is the best hybrid method.
Future work should explore a distinct hybrid combination paradigm with different single models. Moreover, to check the strength of the model, this study should be extended by testing with different data sets. We also suggest further experiments to estimate the weights of the combination methods.

Fig. 4. Actual CER closing price index and its predicted values from (a) all single models, (b) the additive combination models, (c) the hybrid model and (d) a comparison between all hybrid models.
for aforementioned models. The following terminology applies: if $y_1, \dots, y_n$ represents a time series, then $\hat{y}_i$ represents the $i$th predicted value, where $i \le n$, for $i \le n$, the

TABLE III. SUMMARY OF PREDICTED AND ACTUAL VALUES OF ALL MODELS

TABLE IV. ESTIMATED NTH PERCENTILES FOR EACH MODEL