A Recurrent Neural Network and a Discrete Wavelet Transform to Predict the Saudi Stock Price Trends

Stock markets are complex, dynamic and chaotic environments, which makes the prediction of stock prices very difficult. In this work, we attempt to predict Saudi stock price trends from the earlier price history by combining a discrete wavelet transform (DWT) and a recurrent neural network (RNN). The DWT is used to remove noise from data gathered from the Saudi stock market for a chosen sample of companies. A purpose-designed RNN is then trained with the Back Propagation Through Time (BPTT) method to predict the closing prices of the sampled companies for the next seven days. The results are compared with those of a traditional prediction algorithm, the auto regressive integrated moving average (ARIMA). The comparison shows that the proposed method (DWT+RNN) predicts the daily closing price more accurately than ARIMA according to the mean squared error (MSE), mean absolute error (MAE) and root mean squared error (RMSE) criteria.
Keywords—Recurrent Neural Network (RNN); Discrete Wavelet Transform (DWT); deep learning; prediction; stock market


I. INTRODUCTION
Can time series analysis be used to estimate stock trends? The short answer is yes: time series analysis can be used to estimate stock trends [1]. The caveat is that prediction with 100% accuracy is not possible; to remain profitable, the aim is to be correct more than 50% of the time. Time series prediction is important, yet it has often been overlooked in the field of machine learning. It is noteworthy because many prediction problems involve a time component. These problems are overlooked because the time component makes time series more complex to deal with [2].
Time series analysis involves developing models that describe or explain an observed time series in order to understand its underlying causes. In this domain, the question "why" is asked of a time series dataset. It often involves making assumptions about the form of the data and breaking the time series down into its constituent elements. Time series analysis provides a group of methods that help to understand a dataset, and the decomposition of a time series into four constituent elements is perhaps the most helpful of these [3]. An example of the time series components for Alinma Bank is shown in Fig. 1:
• Level or Observed. The baseline value of the series if it were a straight line.
• Trend. The optional and often linear increasing or decreasing behaviour of the series over time.
• Seasonality. The optional repeating patterns or cycles of behaviour over time.
• Residual or Noise. The optional variability in the observations that cannot be explained by the model.
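The four components listed above can be illustrated with a short sketch. It assumes an additive model (observed = trend + seasonality + residual), a centred moving average as the trend estimate and an odd seasonal period; the function name `decompose_additive` and the synthetic prices are hypothetical and are not taken from the study.

```python
import numpy as np


def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts (additive model).

    Assumes an odd `period` so that the moving-average window is centred.
    """
    series = np.asarray(series, dtype=float)
    if period % 2 == 0:
        raise ValueError("this sketch assumes an odd period")
    n = series.size
    half = period // 2

    # Trend: centred moving average over one full period (edges stay NaN).
    trend = np.full(n, np.nan)
    trend[half:n - half] = np.convolve(series, np.ones(period) / period, mode="valid")

    # Seasonality: average detrended value at each position in the cycle,
    # shifted so that the seasonal effects sum to zero over one period.
    detrended = series - trend
    cycle = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    cycle -= cycle.mean()
    seasonal = cycle[np.arange(n) % period]

    # Residual: whatever the trend and seasonality do not explain.
    residual = series - trend - seasonal
    return trend, seasonal, residual


# Synthetic closing prices: level 100, linear trend, 5-day repeating pattern.
days = np.arange(60)
pattern = np.array([1.0, -0.5, 0.0, 0.5, -1.0])
prices = 100.0 + 0.5 * days + pattern[days % 5]
trend, seasonal, residual = decompose_additive(prices, period=5)
```

On this noise-free synthetic series the residual is numerically zero away from the edges; on real closing prices it would hold the part of the signal that the trend and seasonal terms cannot describe.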
Every time series has a level. Most also include noise or residual, while the trend and seasonality are optional. Stock exchanges play a key role in the development of the main economic sectors, and the daily closing price of a stock forms a time series of this kind. Stock exchanges thus have a considerable impact on people and nations worldwide. Predicting the stock market has sparked interest amongst numerous experts from different domains [4]. To track stock market movement, various algorithms and methods have been used, such as Deep Neural Networks and Support Vector Machines [5,6]. By assessing historical information, these techniques can be applied to determine stock market trends. However, predicting the stock market is not an easy task because of the non-stationary, non-linear, volatile and chaotic nature of the data [7,8]. With the rise in trading, many experts have attempted to develop methods and techniques to assess and forecast future stock prices.
This study evaluates the application of the recurrent neural network to stock market prediction using real-time and experimental data from the Saudi stock market. Its main contributions are as follows:
• The development of an innovative stock price estimation model that uses the RNN and DWT techniques.
• The use of real-time data sourced from the Saudi stock market to analyse and train the proposed model.
The paper is structured as follows: Section II presents an overview of works on stock market estimation. Section III explains the proposed model and its attributes, while Section IV presents and discusses the experimental results. Section V offers the conclusion and recommendations for future work.
www.ijacsa.thesai.org

II. RELATED WORK
This section provides an overview of related studies that focused on estimating and predicting stock market trends. Studies aimed at predicting the stock market's behaviour have been presented for decades. Nevertheless, the task has turned out to be much more difficult than anticipated because the stock market is chaotic, complex and dynamic [9]. Various data sources and parameters with a low signal-to-noise ratio (SNR) need to be considered [10]. Such parameters only increase the complexity of predicting stock market trends. Although numerous scholars have developed and proposed many different estimation models to forecast stock market prices, these models did not give accurate results [11].
Economic time series models derived from economic theories were the key foundation for prediction in the 20th century. However, these theories cannot be applied directly to forecast market prices that are subject to external influences. With the development of the multi-layer concept, artificial neural networks (ANN) have been adopted as a prediction tool. Researchers have also employed various deep learning models to predict series of market prices. A paper has recently been published on estimating stock market prices with ANN. Various other studies used different techniques to predict stock market trends. Tools originally used for technology forecasting were brought into economic forecasting with considerable success. Lee, Lee, and Oh (2005) employed the Lotka-Volterra model, which was originally derived from the relationship between predator and prey and is used to predict the diffusion of rival technologies, to predict the Korean stock market [12]. Yu and Huarng (2008) used bivariate neural networks (BNN), a BNN-based fuzzy time series model and a BNN-based fuzzy time series model with substitutes to forecast fuzzy time series. The best results were obtained with the BNN-based fuzzy time series model with substitutes [13]. Zhu, Wang, Xu, and Li (2008) used basic and improved neural network models to show that trading volume can improve the prediction performance of neural networks [14]. Zhang and Wu (2009) applied a back propagation neural network (BPNN) and bacterial chemotaxis optimisation (BCO) to the S&P 500 index and found that the proposed fusion model (IBCO-BP) gave superior prediction accuracy, lower computational complexity and a shorter training time [15]. Majhi, Panda, and Sahoo (2009) assessed the cascaded functional link artificial neural network (CFLANN) and the LMS model and found the CFLANN model to work best, followed by the FLANN and LMS models [16]. Hamzacebi, Akay, and Kutay (2009) compared ANN and ARIMA and observed that ANN with direct prediction gave superior results; they also suggested further studies before generalising the findings [17]. Liao and Wang (2010) used a stochastic time effective neural network model to present prediction results for worldwide stock indices [18]. Patel et al. (2015) combined machine learning models to forecast stock market trends. The research emphasises forecasting the future values of stock indices. Indices from Indian stock exchanges, such as CNX Nifty and S&P Bombay Stock Exchange (BSE) Sensex, were used to evaluate the experiment. The following conclusions have been deduced from the literature survey.
Current prediction models use past stock market data to predict specific features of stock prices. The data need to be prepared before forecasting, after obtaining descriptive factors by observing the moving average over distinct block lengths, which provided balanced results for 20-100 days; current systems do not perform these functions [3,19]. Qiu and Song (2016) employed two basic types of input variables to forecast the trend of the daily stock market index. Their study uses an optimised artificial neural network (ANN) model to predict the trend of the next day's price of the Japanese stock market index, which improves the accuracy of trend prediction for the index. Genetic algorithms (GA) were employed to optimise the ANN model [20].
Chen W. et al. (2017) applied a model based on Recurrent Neural Networks (RNN) with Gated Recurrent Units (GRU) to predict stock volatility in the Chinese stock markets [21]. Wei Bao, Jun Yue and Yulei Rao (2017) put forward a novel deep learning framework that combines wavelet transforms (WT), stacked autoencoders (SAEs) and long short-term memory (LSTM) to forecast stock prices. SAEs for hierarchically extracting deep features are introduced into stock price forecasting for the first time [22].
Pang, X. et al. (2018) employed a novel neural network approach to obtain better stock market predictions. Data were obtained from the live stock market for real-time and off-line analysis, and the results of analytics and visualisations were used to demonstrate the Internet of Multimedia of Things for stock market analysis. Based on the word vector in deep learning, the notion of a "stock vector" was demonstrated. The input is no longer a single index or a single stock, but multi-stock, high-dimensional historical data [23]. Fischer, T. and C. Krauss (2018) used long short-term memory (LSTM) networks, a state-of-the-art technique for sequence learning. These networks are not often used for predicting financial time series, yet they are well suited to this field [24].

III. PROPOSED MODEL
In this study, a two-stage model is proposed for estimating the stock market price, as depicted in Fig. 2. The first stage uses the DWT to decompose the stock price time series and remove noise, while in the second stage an RNN is used to produce an output that is seven steps ahead. The methods used in each stage are described in more detail below.

A. Discrete Wavelet Transform
The DWT finds application in numerous fields, such as financial time series and signal processing, because of its strong feature extraction capacity. The main characteristic of the wavelet transform is that, unlike the Fourier transform, it can analyse the frequency components of a financial time series together with time. Consequently, wavelets are helpful in dealing with highly irregular financial time series. This work uses the Haar function as the wavelet basis function because it not only decomposes the financial time series into the time and frequency domains but also decreases the processing time considerably [25,29,30]. The wavelet transform with the Haar function as its basis has a time complexity of O(n), where n is the length of the time series. The wavelet function of the continuous wavelet transform (CWT) can be represented as:
$$\phi_{a,\tau}(t) = \frac{1}{\sqrt{a}}\,\phi\left(\frac{t-\tau}{a}\right) \qquad (1)$$
where a is the scale factor, τ is the translation factor and ϕ(t) is the basis wavelet.
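As an illustration of Haar-based denoising, the following minimal sketch applies one level of the orthonormal Haar DWT, shrinks the detail coefficients and inverts the transform. The single decomposition level, the soft-thresholding rule, the threshold value and the function names are all assumptions made for the example; the text does not state which settings were used in the study.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed even and odd samples."""
    out = np.empty(2 * approx.size)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out


def haar_denoise(signal, threshold):
    """Soft-threshold the detail coefficients (assumed rule) and reconstruct."""
    approx, detail = haar_dwt(signal)
    shrunk = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    return haar_idwt(approx, shrunk)


# Hypothetical closing prices; the threshold exceeds every detail coefficient.
noisy = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
smooth = haar_denoise(noisy, threshold=10.0)
```

Each pass touches every sample once, which matches the O(n) cost noted above. With a threshold larger than every detail coefficient, each pair of neighbouring prices is replaced by its mean; a threshold of zero returns the input unchanged.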

B. Recurrent Neural Network
A recurrent neural network (RNN) is a class of artificial neural network in which the connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behaviour for a time sequence. In contrast to feedforward neural networks, RNNs can use their internal state (memory) to process sequences of inputs [9,26].
In general, two operations are associated with the hidden and output layers. First, the net operation sums the inputs coming from all sources. Second, the out operation achieves a nonlinearity by applying a function such as the sigmoid, tanh(net) or SoftMax(net) to the result [27]:
$$s_t = f(U x_t + W s_{t-1}) \qquad (2)$$
$$o_t = \mathrm{SoftMax}(V s_t) \qquad (3)$$
where $x_t$ denotes the input at time step $t$ and $s_t$ the hidden state. The function $f$ represents a nonlinearity such as tanh or SoftMax, as presented in Eqs. 2 and 3. The state $s_{-1}$, which is required to compute the first hidden state, is initialised to zeros. $U$ represents the weights applied to the input, $W$ denotes the weights applied to the state from the previous step, and $V$ refers to the weights applied to the output of the hidden layer.
The calculation procedure used to compute the prediction scores is commonly called the "forward pass". These values are then corrected via the backward, back-propagation pass. This work uses the Back Propagation Through Time (BPTT) algorithm because it updates the weights by accumulating the input values over the time steps during RNN training and efficiently frames sequence prediction problems for RNNs [26]. In the proposed DRNN model, which was developed to forecast stock prices, each sequence is a sequential collection of daily data for a single stock within a fixed time frame (N days). The daily data provide the sequence learning features, such as the day's closing price over the N days, which help to explain the different stock performances.
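The construction of fixed-length input sequences can be sketched as follows. The sketch assumes that each sample is a window of N consecutive closing prices and that the target is the next seven closing prices, in line with the seven-step-ahead output of the model; the function name `make_windows` and the example values are hypothetical.

```python
import numpy as np


def make_windows(closes, n_days, horizon=7):
    """Build (input, target) pairs from a 1-D series of closing prices.

    Each input holds `n_days` consecutive closes; each target holds the
    `horizon` closes that immediately follow that window.
    """
    closes = np.asarray(closes, dtype=float)
    n_samples = closes.size - n_days - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    inputs = np.stack([closes[i:i + n_days] for i in range(n_samples)])
    targets = np.stack(
        [closes[i + n_days:i + n_days + horizon] for i in range(n_samples)]
    )
    return inputs, targets


# Hypothetical series of 20 closing prices with N = 10 days per sequence.
closes = np.arange(20, dtype=float)
inputs, targets = make_windows(closes, n_days=10, horizon=7)
```

Consecutive windows overlap by N-1 days, so a series of length L yields L-N-6 samples when the horizon is seven.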
The earning rate was used to determine the sequence performance. The rate was determined by averaging the closing prices of the stock for the three days after the sequence and comparing the average with the closing price on the final day of the sequence. The proposed model includes a single input layer, as described earlier, that takes the sequence learning features. Given the input sequence $x = (x_1, \ldots, x_T)$, a standard RNN computes the hidden vector sequence $h = (h_1, \ldots, h_T)$, and the output vector sequence $y = (y_1, \ldots, y_T)$ is calculated by the proposed DRNN model:
$$h_t = \mathcal{H}(W_{xh} x_t + W_{hh} h_{t-1} + b_h) \qquad (5)$$
$$y_t = W_{hy} h_t + b_y \qquad (6)$$
In Eqs. 5 and 6, the $W$ terms denote the weight matrices (e.g. $W_{xh}$ is the input-hidden weight matrix), the $b$ terms denote the bias vectors (e.g. $b_h$ is the hidden bias vector), and $\mathcal{H}$ denotes the hidden layer function. Generally, $\mathcal{H}$ is an elementwise application of a sigmoid function, and its result is fed to the output layer [28].
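The forward pass referred to in Eqs. 5 and 6 can be sketched in a few lines of NumPy. The sketch assumes the standard recurrence h_t = H(W_xh x_t + W_hh h_{t-1} + b_h) and y_t = W_hy h_t + b_y, with tanh as the hidden layer function H and a zero initial hidden state; the layer sizes and random weights are illustrative only and do not reflect the trained DRNN.

```python
import numpy as np


def rnn_forward(x_seq, W_xh, W_hh, W_hy, b_h, b_y, hidden_fn=np.tanh):
    """Run a simple RNN over a sequence and return all hidden states and outputs."""
    h = np.zeros(W_hh.shape[0])  # assumed initial hidden state: all zeros
    hidden_states, outputs = [], []
    for x_t in x_seq:
        # Hidden update: combine the current input with the previous state.
        h = hidden_fn(W_xh @ x_t + W_hh @ h + b_h)
        # Output: linear read-out of the current hidden state.
        outputs.append(W_hy @ h + b_y)
        hidden_states.append(h)
    return np.array(hidden_states), np.array(outputs)


# Illustrative sizes: 1 input feature (closing price), 4 hidden units, 1 output.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, n_steps = 1, 4, 1, 10
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_out)
x_seq = rng.normal(size=(n_steps, n_in))
hs, ys = rnn_forward(x_seq, W_xh, W_hh, W_hy, b_h, b_y)
```

Because the same three weight matrices are reused at every time step, BPTT has to sum the gradient contributions from all steps before the weights are updated.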

IV. EXPERIMENTAL SETUP
This section is divided into three parts. The first describes the dataset used in the experiments, the second describes the prediction method and the accuracy measures, and the last presents the results obtained.

A. Data Description
This study uses historical stock data from the Saudi stock market (Tadawul). Tadawul is the body authorised to operate as the Securities Exchange (the Exchange) in the Kingdom of Saudi Arabia (KSA), and it records the daily open, close, low, high and volume values. The Tadawul dataset consisted of 1300 data records per company for 146 stocks, covering the period from 2011/01/01 to 2016/03/31. About 190,000 series were gathered in total, of which 130,000 were used for training while the remainder were used for validation. For the proposed model, the closing price was chosen as the input for 6 separate firms from different sectors. All the firms showed considerable variation in their results. The data gathered from these firms displayed a low rate of error and was regarded as error-free.

B. Prediction Procedure
Several tests were carried out to tune the parameters and obtain accurate results. The first RNN parameter tuned is the number of training epochs. The second is the batch size, which controls how often the network weights are updated. The third is the number of neurons, which affects the network's learning capacity. The Adam optimisation technique is used to train the RNN, since it is an extension of stochastic gradient descent that has been widely adopted in deep learning. Usually, more neurons give a greater capacity to learn the structure of the problem, at the cost of a longer training time. A greater learning capacity also raises the risk of overfitting the training data. Table I lists the tests with their parameters and the average prediction accuracy (MAE, MSE, RMSE) for each case.
Table I shows that test 3 is the best. Table I displays details of the chosen sample, which consists of 10 firms from the Saudi market and the S&P 500 index. The historical stock data were acquired from the website https://macrotrends.dpdcart.com/cart/deliver?purchase_id=12487241&salt=4526fab69067075ba5560b21f1850513b192ef77, and the ARIMA and DRNN models were applied to the given sample. The results of the S&P 500 tests are given in Fig. 9 and Table IV.
The sample contained six firms selected arbitrarily. The stock closing price was regarded as a significant parameter for prediction because it indicates the next day's opening price. For each firm, the Saudi stock market dataset is divided into a training set and a testing set: the first 1306 entries are used for training and the remainder for testing. The S&P 500 dataset is likewise divided into training and testing sets, with the first 1313 entries used for training and the remainder for testing. Models are developed with the training set, and predictions are made on the testing set. The testing set is walked through one time step at a time. A model predicts the time step, and then the actual value from the testing set is made available to the model for predicting the next time step. This mimics the real-world situation in which new stock market observations become available every day and can be used to predict the following day. Finally, all the predictions on the testing set are collected and an error value is computed to summarise the model's skill. The Mean Squared Error (MSE), Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are used; MSE and RMSE punish large errors, and RMSE gives a score in the same units as the predicted data.
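The walk-forward procedure and the three error measures can be sketched as follows. The persistence forecaster (tomorrow's price equals today's) is only a stand-in, assumed here so that the example runs without a trained model; it is not the DRNN or ARIMA model of the study, and the prices are made up.

```python
import numpy as np


def error_metrics(actual, predicted):
    """Return the MSE, MAE and RMSE between two equally long sequences."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    return {"MSE": mse, "MAE": mae, "RMSE": float(np.sqrt(mse))}


def walk_forward(train, test, predict_next):
    """Predict each test step in turn, then reveal its true value to the model."""
    history = list(train)
    predictions = []
    for observed in test:
        predictions.append(predict_next(history))
        history.append(observed)  # the actual value becomes available
    return predictions


def persistence(history):
    """Placeholder forecaster (assumption): repeat the last observed price."""
    return history[-1]


# Made-up closing prices, split into training and testing parts.
train = [10.0, 11.0, 12.0]
test = [13.0, 15.0, 14.0]
predictions = walk_forward(train, test, persistence)
scores = error_metrics(test, predictions)
```

Any model exposing a function from the observed history to the next value can be passed as `predict_next`, which keeps the evaluation loop identical across the models being compared.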

C. Numerical Results
The model was tested on a Windows 10 OS, and all tests were carried out with Python 3.6 and the following libraries. The first is NumPy, whose primary object is the homogeneous multidimensional array: a table of elements (generally numbers), all of the same type. The second is Pandas, a Python package that provides fast, flexible and expressive data structures designed to make working with "relational" or "labelled" data easy and intuitive. The third is Sklearn, which offers simple and efficient tools for data analysis. Then there is Keras, which can run on top of TensorFlow, Microsoft Cognitive Toolkit or Theano. Designed to enable fast experimentation with deep neural networks, it emphasises user-friendliness, modularity and extensibility. Lastly, there is Matplotlib, a plotting library for the Python language and its numerical mathematics extension NumPy. Figs. 3, 4, 5, 6, 7 and 8, as well as Table II, show a sample firm for which the model was tested for the next 7 days. They also display the predicted and actual stock prices together with the prediction accuracy, obtained by computing the MSE, MAE and RMSE, as given in Table III. Moreover, Fig. 9 summarises the accuracy results of the samples under the proposed DRNN model. Finally, the last test applies the ARIMA and DRNN models to the S&P 500 dataset.

V. CONCLUSIONS AND FUTURE WORK
Many researchers have investigated the problem of stock market prediction and have developed two kinds of prediction system. The first addresses the direction of movement of the stock market index and of stock prices, while the second predicts the future value of the stock market index. The predictions of these systems provide financial services that can help users make sound decisions when investing in the stock markets. In this study, a novel hybrid model is proposed that uses the DWT and RNN techniques together with a technical dataset to predict stock prices. The proposed model passes the data obtained from the stock markets through the DWT and then to the RNN, creating a novel and simple optimisation technique. Furthermore, this integration method could be used to formulate improved techniques for reducing risk and assisting investors.
Future studies should consider additional factors to increase the prediction accuracy, such as the Hajj and Umrah periods and the Islamic month of Ramadan, since Ramadan is a globally observed Islamic religious tradition. The effect of these factors on the prediction accuracy for the Saudi stock market should be investigated.
ACKNOWLEDGMENT
I would like to acknowledge the Saudi stock market (Tadawul), which provided all the necessary data for the research. I also wish to thank Mr. Ahmed A. Alsowaygh,

TABLE III. ACCURACY FOR EACH COMPANY

TABLE IV. PREDICTIONS RESULT FOR THE NEXT 7 DAYS (S&P INDEX)
Table IV and Fig. 9 depict the outcomes of this experiment for the next 7 days. The MSE, MAE and RMSE are then computed, as given in Table V.