Trading Saudi Stock Market Shares using Multivariate Recurrent Neural Network with a Long Short-term Memory Layer

This study tests the weak form of the efficient market hypothesis for the Saudi stock market and proposes a recurrent neural network (RNN) that produces a trading signal. To predict the next-day trading signal of several shares in the Saudi stock market, we designed the RNN with a long short-term memory architecture. The network input comprises several time series features that contribute to the classification process. The RNN output is fed to a trading agent that buys or sells shares based on the current share value, the current available balance, and the current number of shares owned. To evaluate the proposed neural network, we used the historical price data of Brent crude oil in combination with other stock features (e.g., the previous-day opening and closing prices of the evaluated share). The results indicate that oil price variations affect the Saudi stock market. Furthermore, the proposed RNN model produces the next-day trading signal with 55% accuracy. For the same period, the proposed RNN trading method achieves an investment gain of 23%, whereas the buy-and-hold method obtains 1.2%.
Keywords: Time series; neural network; long short-term memory; stock price; Tadawul


I. INTRODUCTION
Of the many published works on stock market forecasting, only a few have targeted the Saudi stock market. In this study, we present a recurrent neural network (RNN) that uses the long short-term memory (LSTM) architecture for multivariate time series prediction to generate a trading signal (buy, sell, or do nothing) for several Saudi stocks. The signal is used in combination with a trading algorithm that buys and sells shares based on three factors: the current share value, the current available balance, and the current number of shares owned.
Neural networks have gained much attention in recent years, especially in stock market prediction. The randomness associated with the stock market makes it hard to achieve high confidence when predicting prices using conventional statistical methods. By using neural networks with several features, we can achieve higher prediction accuracy. To study the effect of historical prices on future prices and to develop a trading agent using neural networks, we tested the Saudi stock market for weak-form efficiency. The neural network developed to produce the trading signal is an RNN with an LSTM architecture.
The remainder of the paper is organized as follows. Section II gives a literature review on the works undertaken to predict and forecast the stock market price. Section III tests the weak form of the Saudi stock market efficiency. (The test is useful for understanding the effect of the historical data of a share on future values.) Discussion on the proposed method and a brief neural network introduction is presented in Section IV. Section V evaluates the proposed method and compares it to a known trading method. Finally, we give the conclusions of this study in Section VI.

II. LITERATURE REVIEW
Stock market prediction has recently been an active research topic. Many researchers have developed methods to predict stock prices, but only a few have developed a trading strategy. Reviews of the developed methods are available in [1,2]. For example, Shah et al. classified stock prediction methods into four categories: statistical methods, pattern recognition, machine learning, and sentiment analysis.
The autoregressive integrated moving average (ARIMA) model, one of the well-known statistical methods, uses a class of models to describe a time series based on its historical values. The model is fitted to the historical values of a stock price to predict (forecast) the stock's future price. The model consists of three parts: (1) an autoregressive (AR) model, in which the forecasted value is a linear combination of past lagged values; (2) a moving average (MA) model, which forecasts the future value using past forecast errors; and (3) a differencing operation, in which the series is replaced by the differences between consecutive values. The model is denoted by ARIMA(p, d, q), where p is the order (number of time lags) of the AR model, d is the degree of differencing, and q is the order of the MA model.
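For reference, the three parts of an ARIMA(p, d, q) model are commonly combined as shown below. The symbols y_t (observed series), y'_t (differenced series), c (constant), φ_i and θ_j (AR and MA coefficients), ε_t (forecast error), and B (backshift operator, B y_t = y_{t−1}) are notation assumed here for illustration and do not appear in the original text.

```latex
y'_t = (1 - B)^d \, y_t, \qquad
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i}
         + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t
```

With d = 1, for example, the differenced series is simply y'_t = y_t − y_{t−1}, and the AR and MA terms are fitted to these day-to-day changes rather than to the price itself.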
Pattern recognition is closely related to machine learning but differs in implementation. It focuses on finding patterns in a stock's historical values and then using computer algorithms to predict future values from these patterns. Examples of such patterns include the stock uptrend [3] and open-high-low-close candlestick charts [4].
Machine learning prediction uses historical data and the desired output as training sets to build a mathematical model through an iterative process until an objective function is optimized. Previous studies have used classification and regression to produce trading signals and to predict the closing price of a stock [5,6,7].
Sentiment analysis uses text information, such as news articles or social media feeds about stock markets, and employs machine learning algorithms to predict stock trends from that text [8].
Idress et al. [9] built an ARIMA model to predict the Indian stock market and reported a mean percentage error of approximately 5%.
Meanwhile, to predict Saudi stock prices, Olatunji et al. [10] proposed an artificial neural network (ANN) model and applied it to three major stocks: Alrajhi Bank, Saudi Telecom Company, and Saudi Basic Industries Corporation (SABIC). They used only the previous-day closing price as the model input. The proposed model was used as an investment adviser and achieved a low root mean squared error (RMSE) of 1.8174 and a mean absolute percentage error of 1.6476.
Also, Jarrah and Salim [11] proposed an RNN combined with a discrete wavelet transform (DWT) to predict Saudi stock price trends. The model consisted of two stages. The first stage uses the DWT to decompose the stock price into the frequency and time domains to filter the noise associated with the signal, and the second stage is an RNN that performs the prediction. The model was tested by predicting the closing price of a Saudi stock for the next seven days. The prediction result was then compared with that obtained using the ARIMA model. The proposed model (DWT + RNN) achieved an RMSE of 0.0522 when the RNN used four batches and four neurons.
Alotaibi et al. [12] also used an ANN model to predict the Saudi stock market. Their ANN model consisted of three layers: input, hidden, and output layers. Hua et al. [13] gave an introduction to deep learning with LSTM for time series prediction and proposed random connectivity for LSTM to overcome the computation cost.
Tilakaratne et al. [5] developed a neural network for predicting the trading signals of the Australian All Ordinary Index. Then, they compared an ANN to a probabilistic neural network (PNN), in which they found that the ANN outperformed the PNN.
Based on the studies reviewed above, many of the developed methods use historical information from the share itself without combining it with other factors (e.g., oil prices). Many of these methods also targeted markets other than the Saudi stock market.

III. WEAK FORM OF EFFICIENT MARKET TEST
The weak form of the efficient market hypothesis states that future prices in a weak-form efficient market cannot be predicted using historical information, such as trading volume, closing price, and earnings. In other words, one cannot predict future values using the available historical information. Fama [14] divided the efficient market hypothesis into three forms: weak, semi-strong, and strong.
Previous studies tested the Saudi stock market efficiency in its weak form and concluded the same; however, the presented studies are not up to date [15,16].
To establish whether the stock prices under test can be predicted using historical values, we tested the Saudi stocks used to evaluate the proposed RNN against the weak-form efficiency hypothesis. Weak-form efficiency for an individual stock is tested by checking its returns for randomness. If the stock does not follow a random walk, the weak-form hypothesis is rejected, which implies that the stock price can be predicted using historical data.
Several statistical tests can be used to test data randomness. Here, we used the Kolmogorov-Smirnov test (K-S test). The null hypothesis in the K-S test is that the data (stock returns) under test follow a random walk and that future values cannot be predicted. The alternative hypothesis is that the data under test are not random and can be predicted using historical values.
Here, we used the Alrajhi, Alinma, and SABIC stocks. The historical values span from January 2010 to the end of March 2020. The stocks' closing prices were converted to stock returns, as shown in Eq. (1), where R is the logarithmic stock return, l(i) is the closing price on day i, and l(i − 1) is the closing price on the previous day:

$$R = \ln\left(\frac{l(i)}{l(i-1)}\right) \tag{1}$$
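The conversion from closing prices to logarithmic returns can be sketched in a few lines of Python. The function name `log_returns` and the sample prices are hypothetical and serve only to illustrate the calculation; they are not data from this study.

```python
import math

def log_returns(closing_prices):
    """Return the logarithmic returns ln(l(i) / l(i-1)) for i = 1..n-1."""
    if any(p <= 0 for p in closing_prices):
        raise ValueError("closing prices must be positive")
    # Pair each closing price with the previous day's closing price.
    return [
        math.log(curr / prev)
        for prev, curr in zip(closing_prices, closing_prices[1:])
    ]

# Hypothetical closing prices for four consecutive trading days.
prices = [10.0, 10.5, 10.5, 9.975]
returns = log_returns(prices)
print(returns)
```

A series of n closing prices yields n − 1 returns. An unchanged price gives a return of exactly zero, which matters later because a zero return is treated as a sell signal.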

A. Kolmogorov-Smirnov Test
The K-S test is a nonparametric test for data randomness. The null hypothesis of the test assumes that the cumulative distribution function (CDF) of the data under test is equal to the hypothesized CDF. The CDF of the data was computed and compared with the hypothesized CDF using Eq. (2), where D_n is the maximum amount by which the hypothesized CDF F_n(x) differs from the calculated CDF G_n(x):

$$D_n = \max_x \left| F_n(x) - G_n(x) \right| \tag{2}$$

When the two CDFs agree closely, the test fails to reject the null hypothesis; under the null hypothesis, the test statistic converges to zero as n goes to infinity. Detailed mathematical background on the K-S test is provided in [17].
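The K-S statistic can be computed directly from its definition, as the following minimal sketch shows. Two things here are assumptions rather than details taken from this study: the hypothesized CDF is taken to be a normal distribution with the sample mean and standard deviation, and the 5% decision uses the common large-sample critical value 1.36/sqrt(n). The sample returns are hypothetical.

```python
import math
import statistics

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution (ASSUMED hypothesized CDF)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, hypothesized_cdf):
    """Largest absolute gap between the hypothesized and empirical CDFs."""
    xs = sorted(sample)
    n = len(xs)
    d_n = 0.0
    for k, x in enumerate(xs):
        f = hypothesized_cdf(x)
        # The empirical CDF jumps from k/n to (k+1)/n at x, so check both sides.
        d_n = max(d_n, abs(f - k / n), abs((k + 1) / n - f))
    return d_n

def ks_reject_at_5_percent(sample, hypothesized_cdf):
    """Compare D_n with the asymptotic 5% critical value (an approximation)."""
    d_n = ks_statistic(sample, hypothesized_cdf)
    critical = 1.36 / math.sqrt(len(sample))
    return d_n > critical, d_n, critical

# Hypothetical daily log returns, for illustration only.
returns = [0.012, -0.004, 0.000, 0.021, -0.015, 0.003, -0.008, 0.007]
mu = statistics.mean(returns)
sigma = statistics.stdev(returns)
rejected, d_n, critical = ks_reject_at_5_percent(
    returns, lambda x: normal_cdf(x, mu, sigma)
)
print(rejected, d_n, critical)
```

Because the mean and standard deviation are estimated from the same sample, the standard critical value is only approximate in this setting. A statistical package that reports an exact p-value is preferable for real data.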

B. Market Weak form Test Results
We performed the test on the three stocks used to evaluate the proposed RNN. Table I shows the results of the K-S test performed at a significance level of 0.05. The p-value is the probability value of the test; smaller values (typically <0.05) indicate a strong rejection of the null hypothesis. The test statistic is a random variable calculated from the data under test and is used to decide whether to reject the null hypothesis, whereas the z-value is the critical value. The test decision is made by comparing the p-value with the significance level: the null hypothesis is rejected if the p-value is less than the significance level (i.e., the data under test are not random).
Based on the tests performed, the Alinma, Alrajhi, and SABIC stock returns did not follow a random walk and were not independent of past values. This indicates that the proposed stock prediction method and the trading agent can use historical values to predict trading signals.

IV. METHODOLOGY
Neural networks are a set of algorithms used to recognize underlying relationships in data sets. The process of a neural network is similar to the operation of a human brain. Here, we used an RNN with an LSTM architecture to produce a trading signal.
The input to the neural network is called a feature, which is a measurable characteristic of the observed data or a characteristic with an indirect effect on it. This section provides a brief introduction to neural networks to familiarize the reader with the basic concepts used in the proposed method. A detailed background is given in [18].

A. RNN
RNNs are a class of neural networks best suited to sequential data sets, such as time series. An RNN has a one-to-one correspondence between its internal layers and the positions in the time series [18]. In principle, an RNN can simulate any algorithm given sufficient data. These networks are based on the work of Rumelhart et al. [19], who described a method for training a network through backpropagation. Unlike feedforward neural networks, RNNs use their internal memory to process a sequence of data, such as stock market prices. Moreover, the network inputs (e.g., oil prices and the index price) in an RNN are interrelated.

On the other hand, an RNN suffers from the exploding and vanishing gradient problems. The gradient is the vector of derivatives of the error with respect to the network weights, calculated during the training process. It is used to update the network weights to achieve a small error, such as the error in the predicted stock value compared with the actual value. The gradients in an RNN accumulate across time steps during the update process, which can cause them to explode (i.e., grow without bound) or to vanish (i.e., shrink toward zero).

Fig. 1 shows the basic building block of an RNN. The input to the block is a vector time series x_t; in our case, it contains the stock price and associated features. h_t is the output of the block, which is fed to the subsequent iteration at time t + 1, and h_{t-1} is the output of the previous block. Both h_t and h_{t-1} are called hidden layer vectors. w_h and w_x are the weights for the hidden connection and the input vector, respectively. The weights are chosen by network training, which compares the predicted output with the actual value and adjusts the weights to achieve the smallest error. F is an activation function within the block. Activation functions are mathematical functions that determine the block output from its input. The most commonly used activation function in RNNs is the tanh function. b_t is a bias added to the block input. Equation (3) shows the math behind RNNs:

$$h_t = F\left(w_h h_{t-1} + w_x x_t + b_t\right) \tag{3}$$
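The recurrence performed by a basic RNN block can be illustrated with the following sketch, which assumes tanh as the activation function F and represents the weights as small matrices (lists of rows). The weight values, the two-feature input, and the function name `rnn_step` are hypothetical; the network in this study was not necessarily built this way.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One RNN step: h_t = tanh(w_h * h_prev + w_x * x_t + b)."""
    h_t = []
    for j in range(len(b)):
        z = b[j]
        z += sum(w_x[j][k] * x_t[k] for k in range(len(x_t)))
        z += sum(w_h[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(z))
    return h_t

# Hypothetical weights for 2 input features and 2 hidden units.
w_x = [[0.5, -0.2], [0.1, 0.3]]
w_h = [[0.0, 0.1], [0.2, 0.0]]
b = [0.0, 0.0]

# Feed a short, made-up sequence of normalized feature vectors.
h = [0.0, 0.0]
for x_t in [[0.1, 0.4], [0.2, 0.3], [0.15, 0.35]]:
    h = rnn_step(x_t, h, w_x, w_h, b)  # the same weights are reused at every step
print(h)
```

The same weights are applied at every time step, so the hidden vector h carries information from earlier inputs forward. This reuse is also why gradients are multiplied repeatedly through time during training.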

B. LSTM
Proposed by Hochreiter and Schmidhuber [20], LSTM is a type of RNN architecture designed to solve the exploding and vanishing gradient problems that occur in a standard RNN. LSTM uses a constant error carousel (CEC) to overcome the problems caused by error backflow. The CEC controls the error flow through units called gates, which are implemented in the memory block of the LSTM. There are three gates (the input gate, the output gate, and the forget gate), each with its own function. The input gate controls the flow of new sequence values into the cell. The output gate controls how the value inside the cell is used to compute the output of the LSTM unit. The forget gate controls how long a value remains inside the memory cell.

Fig. 2 shows the building block of an LSTM unit, where C_t is the cell state and x_t is the input vector to the cell. f_t is the output of the sigmoid function of the forget gate and determines how much of the previous cell state is passed on. i_t is the output of the sigmoid function of the input gate; it is multiplied by the output of the tanh function of the input gate, and the result updates the cell state with new values. O_t is multiplied by the tanh of the cell state to choose which part of the cell state is output to the adjacent cell. Fig. 3 shows three hidden units for a vector input in an LSTM network. This number can be more than three, depending on the design. Equations (4)-(8) are the compact forms of the forward pass of an LSTM unit with a forget gate, as developed in [21]. In the equations, W, U, and b denote the weights and biases determined by network training.

Each layer produces a single output, called h_t, which is connected to a neuron at the final layer. The neuron multiplies each input by a weight and sums the results to produce an output ŷ of length n, where n is the number of classes produced (Fig. 4).
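The gate interactions described above can be sketched as a single forward step of an LSTM unit. The sketch follows the commonly used formulation of an LSTM with a forget gate; it is an assumption that this matches Eqs. (4)-(8) term by term, because those equations are not reproduced in this text. The dictionary keys, the function names, and the zero-valued parameters are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def preactivation(gate_params, x_t, h_prev):
    """Compute W x_t + U h_prev + b for one gate."""
    W, U, b = gate_params
    return [
        sum(w * x for w, x in zip(W[j], x_t))
        + sum(u * h for u, h in zip(U[j], h_prev))
        + b[j]
        for j in range(len(b))
    ]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'c' to (W, U, b)."""
    f_t = [sigmoid(z) for z in preactivation(params["f"], x_t, h_prev)]    # forget gate
    i_t = [sigmoid(z) for z in preactivation(params["i"], x_t, h_prev)]    # input gate
    o_t = [sigmoid(z) for z in preactivation(params["o"], x_t, h_prev)]    # output gate
    g_t = [math.tanh(z) for z in preactivation(params["c"], x_t, h_prev)]  # candidate values
    # New cell state: keep part of the old state and add the gated candidates.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Hidden output: gated tanh of the new cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hypothetical one-feature, one-unit example with all parameters set to zero.
zero_gate = ([[0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in "fioc"}
h_t, c_t = lstm_step([1.0], [0.0], [2.0], params)
print(h_t, c_t)
```

With all parameters at zero, every gate outputs 0.5 and the candidate value is 0, so the cell state is simply halved. This makes the role of the forget gate easy to check by hand.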
The output ŷ is connected to a softmax layer, which converts the input vector ŷ of n elements into a normalized probability distribution over n classes. The class with the highest probability is the network output. The network produces two classes of trading signals: buy and sell. An in-depth discussion on pattern recognition and classification is given in [22].
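The softmax step and the selection of the most probable class can be sketched as follows. The order of the classes in `SIGNALS` and the example scores are hypothetical; only the two class names, buy and sell, come from the text.

```python
import math

SIGNALS = ["buy", "sell"]  # ASSUMED class order, for illustration only

def softmax(scores):
    """Convert a score vector into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the most probable trading signal and all class probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SIGNALS[best], probs

signal, probs = classify([2.0, 1.0])
print(signal, probs)
```

The probability attached to the chosen class can also serve as a confidence score for the signal.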

C. Trading
1) Input features: The proposed design was constructed using LSTM layers connected in series. The RNN input comprised a set of time series data representing the features associated with the stock and the oil closing price. The network setup consisted of the training method, the number of hidden elements (LSTM units), and the number of training iterations. Fig. 5 shows the history of the three stock prices, in Saudi Riyals, used in this study. The data were divided into two sets: the first set was used to train the classifier, and the second was used to evaluate it. Fig. 6 shows the history of the oil prices used as an input to the proposed network. Table II lists the options used in constructing the network. Table III lists the features used for the buy and sell classification network. Several methods can be used for feature selection. However, in this study, we used a trial-and-error method to find the best feature combination because some feature selection methods fail when chart technical indicators are used with the stock price.

2) Network training: To obtain the required weights and biases in the hidden network layers, we must train the neural network. A data set must be prepared to perform the training and evaluation processes of the RNN. The required data were divided into two sets: a training set and an evaluation set. The data set comprised the historical values of the proposed features from March 21, 2012, to April 24, 2020, and the required response (trading signal) for that interval. The required responses were obtained from the stock returns: a buy signal was generated from a positive return, and a sell signal was generated from a zero or negative return. The data were normalized using Eq. (9).
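The preparation of the required responses and the normalization of a feature can be sketched as follows. The labeling rule (buy for a positive return, sell otherwise) is taken from the text. The normalization shown is a z-score, which is an assumption because Eq. (9) is not reproduced here; the function names and sample values are hypothetical, and the alignment between each label and the day of its input features is left open.

```python
import statistics

def trading_labels(returns):
    """Buy for a positive return; sell for a zero or negative return."""
    return ["buy" if r > 0 else "sell" for r in returns]

def standardize(values):
    """ASSUMPTION: z-score normalization; Eq. (9) may define a different scheme."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - mu) / sigma for v in values]

# Hypothetical returns and closing prices, for illustration only.
labels = trading_labels([0.01, 0.0, -0.02])
normalized_close = standardize([1.0, 2.0, 3.0])
print(labels, normalized_close)
```

In practice, the normalization statistics would typically be computed on the training set only and then reused for the evaluation set, so that no information from the evaluation period reaches the network during training.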
Each training run compares the generated responses with the required ones. An error is produced if a response is different, and the weights are updated in each training iteration. Adaptive moment estimation (ADAM), developed by Diederik Kingma and Jimmy Ba [23], was used as the solver to optimize the weights and biases of the neural network. The process undertaken to train the LSTM network is as follows.
• Initialize the LSTM network weights and biases randomly.
• Input the historical data to the network as a normalized time series.
• Compare the trading signal output with the required signal (buy and sell signal).
• Update the weights and biases using the ADAM solver and the computed error.
• Repeat the training process until the classification accuracy is higher than that in the previous run or stop when the required number of iterations has been satisfied.

The evaluation data set was used to test the network after network training. This process is called the classification process.
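The weight update performed by the ADAM solver in the steps above can be illustrated on a toy problem. The sketch applies the update rule published by Kingma and Ba to a simple quadratic loss instead of an LSTM; the learning rate, the number of steps, the loss function, and the function names are hypothetical choices made for illustration, not the settings used in this study.

```python
import math

def adam_minimize(grad, theta, steps=500, lr=0.05,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function with ADAM, given a function returning its gradient."""
    theta = list(theta)
    m = [0.0] * len(theta)  # running mean of the gradients
    v = [0.0] * len(theta)  # running mean of the squared gradients
    for t in range(1, steps + 1):
        g = grad(theta)
        for k in range(len(theta)):
            m[k] = beta1 * m[k] + (1 - beta1) * g[k]
            v[k] = beta2 * v[k] + (1 - beta2) * g[k] ** 2
            m_hat = m[k] / (1 - beta1 ** t)  # bias-corrected first moment
            v_hat = v[k] / (1 - beta2 ** t)  # bias-corrected second moment
            theta[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

def toy_grad(theta):
    """Gradient of the toy loss (theta0 - 3)^2 + (theta1 + 1)^2."""
    return [2 * (theta[0] - 3.0), 2 * (theta[1] + 1.0)]

solution = adam_minimize(toy_grad, [0.0, 0.0])
print(solution)  # close to the minimizer [3, -1]
```

In network training, the gradient comes from backpropagating the classification error through the LSTM layers, and the parameter vector holds all weights and biases.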

3) Trading agent:
The output of the neural network classification is connected to a trading agent. The trading agent strategy involves buying or selling a predefined number of shares in a trading session based on the number of shares and the money currently owned. Fig. 7 depicts the trading process. The agent relies on the initial investment budget and the number of shares to be bought or sold per trading session. These values are fixed in the current version of the trading agent.

The proposed neural network and trading agent were evaluated using three stocks from the Saudi stock market (i.e., Alinma Bank, Alrajhi Bank, and SABIC). The evaluation data set comprised historical values from June 2018 to August 2019. The performance of the proposed agent was compared with that of the buy-and-hold trading strategy. Table IV shows the accuracy of the trading signal, the trading agent initial values, and the investment gain. The trading gain was affected by the initial values used, which were optimized to achieve the highest gain.

Figs. 8 and 9 show the output of the trading agent for the Alinma and Alrajhi stocks, respectively. The trading agent was effective for both the Alinma and Alrajhi shares, as shown by the output: in several instances, the agent bought shares in an upward trend and sold them at a local maximum.

Fig. 10 shows the trading signal for the SABIC shares. The agent predicted the correct trading signals when trading the SABIC shares, but the gain was lower than that of the other two shares because of the fixed number of shares that can be bought per trading session. This low gain could be improved if the number of shares were dynamic and linked to the classification layer output score.
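The trading agent and the buy-and-hold baseline described in this section can be sketched as follows. This is a simplified interpretation, not a reproduction of the process in Fig. 7: it assumes that each signal is executed at the price of the same session, that a fixed number of shares is traded per session, and that any shares still held are valued at the last price. The prices, signals, budget, and function names are hypothetical.

```python
def run_trading_agent(prices, signals, initial_balance, shares_per_session):
    """Trade a fixed number of shares per session; return the fractional gain."""
    balance = initial_balance
    shares = 0
    for price, signal in zip(prices, signals):
        cost = price * shares_per_session
        if signal == "buy" and balance >= cost:
            # Buy only if the available balance covers the fixed lot.
            balance -= cost
            shares += shares_per_session
        elif signal == "sell" and shares >= shares_per_session:
            # Sell only if enough shares are currently owned.
            balance += cost
            shares -= shares_per_session
    final_value = balance + shares * prices[-1]
    return (final_value - initial_balance) / initial_balance

def buy_and_hold_gain(prices, initial_balance):
    """Buy as many whole shares as possible on day one and hold to the end."""
    shares = int(initial_balance // prices[0])
    balance = initial_balance - shares * prices[0]
    final_value = balance + shares * prices[-1]
    return (final_value - initial_balance) / initial_balance

# Hypothetical prices and signals, for illustration only.
prices = [10.0, 12.0, 11.0, 13.0]
signals = ["buy", "sell", "buy", "sell"]
agent_gain = run_trading_agent(prices, signals, 100.0, 5)
baseline_gain = buy_and_hold_gain(prices, 100.0)
print(agent_gain, baseline_gain)
```

In this small example, the agent is limited by its fixed lot of five shares and gains less than buy-and-hold even though every signal is correct. This mirrors the observation above that a fixed number of shares per session can cap the achievable gain.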

VI. DISCUSSION AND CONCLUSION
To predict the Saudi stock trading signals, we proposed the usage of a multivariate RNN with an LSTM architecture. The model used historical stock information, such as closing prices, the volume of trades, the number of trades, and current-day opening prices, together with the oil price. The model result was satisfactory compared with that obtained using the buy-and-hold trading method.
In future studies, more factors should be considered, such as the Fibonacci retracement, and a feature selection method should be developed to select the best features among those presented. Other financial trading methods may also be considered to train a neural network and develop a trading agent instead of relying on the prediction of future returns.