On the Use of Arabic Tweets to Predict Stock Market Changes in the Arab World

Social media users nowadays express their opinions and feelings about many events occurring in their lives. For certain users, some of the most important events are those related to the financial markets. Over the past decade, an interesting research field has emerged to study the possible relationship between fluctuations in the financial markets and online social media. In this research, we present a comprehensive study to identify the relation between Arabic financial-related tweets and the change in stock markets using a set of the most active Arab stock indices. The results show that there is a Granger causality relation between the volume and sentiment of Arabic tweets and the change in some of the stock markets.
Keywords: Twitter; Sentiment Analysis; Granger Causality; Pearson Correlation; Arab Stock Market


I. INTRODUCTION
Social media provides us with a valuable source of public opinion on different topics from different domains. One of the most interesting domains is the financial domain. Twitter is nowadays used as a platform that enables its users to read and write a large number of messages called tweets. Tweets are public by default, which allows researchers to fetch them and use them in their research [1,2].
The financial domain has especially interesting characteristics since it deals directly with money. Stock market prices and stock change prediction are of concern to many social media users who are interested in the financial market.
Several researchers are studying the relationship between tweets and financial markets. They focus on tweeting activity as well as tweets' contents (mainly tweets written in English). The task of studying the relationship between the sentiments conveyed in tweets (written in English) and stock market change is a very interesting one with many obvious applications [3,4].
The same task can be much more difficult when the language under consideration is not English. The reason is simple. The task requires a mature set of tools and resources to efficiently analyze and accurately extract the sentiments conveyed in a large number of tweets, in what is known as Sentiment Analysis (SA). The field of English SA is the one that most likely fits this description. The same cannot be said about performing SA on other languages. Arabic, for example, is a very important language, but the field of Arabic SA is still in its early stages, which means that the tools and resources for Arabic SA are still not deep enough and have not yet been tested thoroughly.
Another issue specific to studying the problem at hand outside the developed countries is the lack of proper data. For example, the Arab world has no commonly used stock market index such as the Dow Jones Industrial Average (DJIA). Instead of having a common stock index that is used across the Arab world (which would be very useful for a study like ours), each country has its own index.
In this research, we study the relationship between the sentiments conveyed in Arabic tweets and the changes in stock market prices. The aim is to see whether there are causal relationships between them. This allows us to answer the question of whether Arabic Twitter content can be used to predict changes in the stock markets across the Arab world. To the best of our knowledge, there exists only one published paper addressing this problem for this specific part of the world [5]. However, that paper is restricted to only one country, Saudi Arabia, whereas our work is not limited to any specific Arab country. Also, [5] has many shortcomings. For example, the kind of analysis the authors performed is rather simplistic and the dataset they used is very small. The rest of this paper is organized as follows. Section II presents the related work. Section III presents the methodology used. Section IV presents and discusses the results of this research. Section V presents a conclusion of our work.

II. BACKGROUND AND RELATED WORKS
In this section, we present a set of related works that investigate the relationship between Twitter data and financial markets. Before presenting the related works, we give a brief introduction to the tools and methods we employ to achieve our goal, which include Sentiment Analysis, Granger Causality, and Pearson Correlation.

A. Background
Sentiment Analysis (SA) is a type of natural language processing (NLP) problem that is concerned with extracting the sentiment conveyed in a piece of text. That is, SA is concerned with showing whether the author of the text feels positively or negatively about certain topics [6,7,8]. SA for the Arabic language is challenging for many reasons, such as the triglossic nature of the language, the limited literature on Arabic NLP, and the scarcity of dependable resources, including publicly available datasets that are collected and annotated for SA purposes and sentiment lexicons that support the Arabic language [9,10].
Granger Causality studies the relation between two time series in terms of cause and effect. It relies on the assumption that any event that occurs at a certain time must have had its causes occurring at an earlier time. Given two time series A and B, if A Granger-causes B, then the previous values of A can be used to predict the value of B [11,12].
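The idea can be illustrated with a minimal sketch, which is not the test implementation used in this study. It compares a restricted regression (B explained by its own past values) with an unrestricted one (past values of A added) and reports only the F-statistic, not a probability value. The function names, the synthetic series, and the use of plain least squares are assumptions made for illustration.

```python
import random


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def _rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = _solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(cause, effect, lag):
    """F-statistic for the hypothesis 'cause does not Granger-cause effect'."""
    n = len(effect)
    targets = effect[lag:]
    # Restricted model: intercept plus the effect's own past values.
    restricted = [[1.0] + [effect[t - k] for k in range(1, lag + 1)]
                  for t in range(lag, n)]
    # Unrestricted model: past values of the candidate cause are added.
    unrestricted = [row + [cause[t - k] for k in range(1, lag + 1)]
                    for row, t in zip(restricted, range(lag, n))]
    rss_r, rss_u = _rss(restricted, targets), _rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic example: b follows a with a one-day delay plus small noise.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
b = [0.0] + [0.9 * a[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
f_a_to_b = granger_f(a, b, lag=1)
f_b_to_a = granger_f(b, a, lag=1)
```

In this synthetic case the statistic for "a causes b" is far larger than the one for "b causes a", which matches how the series were constructed. A full test would additionally convert the F-statistic into a probability value.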
Pearson Correlation measures the linear dependency between two variables by measuring the strength and direction of the relationship between them. The correlation coefficient lies in the range [-1, 1]: its magnitude gives the strength of the correlation and its sign states whether the correlation is positive or negative [13].

B. Related Works
As mentioned earlier, using Twitter data to predict stock market changes is a relatively new field. Ranco et al. [14] studied the relation between Twitter users' posts and the financial market. They studied the sentiment of tweets regarding the 30 companies whose stocks form the Dow Jones Industrial Average (DJIA) index over a period of 15 months. Their results showed that the Pearson correlation and the Granger causality between the tweets' sentiments and the financial data time series are relatively low. Moreover, their results showed a significant dependency between the tweets' sentiment and abnormal returns, where the sentiment polarity of the tweets implies the direction of the abnormal returns.
Zang and Skiena [15] performed a comprehensive study to show the relation between the sentiment presented in company-related news and the companies' stock volumes and financial returns. Their results showed a significant correlation between the news sentiment and stock market indicators.
Another study, conducted by Souza et al. [16], presented a case study of the relation between the sentiments conveyed in tweets about a set of retail companies and the stock returns of these companies. Their results showed that the tweets' sentiments have a statistically significant relation with the stock returns and that there is a strong Granger causality between the number of tweets and the stock returns.
Bollen, Mao, and Zeng [3] studied the relation between the mood states derived from Twitter posts and Dow Jones Industrial Average (DJIA) values. They analyzed the content of the tweets by feeding the text into two sentiment tools, OpinionFinder (which gives positive and negative moods) and Google-Profile of Mood States (GPOMS) (which classifies mood into six classes: Calm, Alert, Sure, Vital, Kind, and Happy). These mood states are used to study the relation with the stock price changes. The results showed a significant Granger causality between the sentiment moods and the values of the DJIA. Moreover, they used a Self-Organizing Fuzzy Neural Network to predict the values of the DJIA using the sentiment moods. A similar study was conducted by Mittal and Goel [4] but with different content. They used only four mood classes (Calm, Happy, Alert, and Kind).
Zheuldev et al. [17] studied the effect of the sentiments presented in Twitter content on the future stock prices of the S&P 500 index. They studied both tweet volume and the sentiment presented in tweets at hourly resolution, and related them to the hourly price returns of 28 financial companies collected over a period of three months. Their results showed that the sentiment presented in tweets is more statistically significant in leading the financial market than message volume.

Mao et al. [18] studied the relation between the daily number of tweets that mention the S&P 500 stocks and the closing stock prices. Their results showed that the daily number of tweets is significantly correlated with the daily closing stock prices.
Finally, one of the works most related to ours is that of AL-Rubaiee et al. [5]. However, that paper is restricted to only one country, Saudi Arabia, whereas our work is not limited to any specific Arab country. Also, [5] has many shortcomings. For example, the kind of analysis the authors performed is rather simplistic (one-to-one model), whereas we perform causality analysis and statistical correlation. Finally, the dataset used in [5] is very small (fewer than 2K tweets, most of which are used for training) compared to our dataset of about 1.5M tweets. However, the last point is justified by the fact that the authors of [5] relied on manual annotation, whereas we rely on an automatic lexicon-based tool that required no training on our side.

III. METHODOLOGY
This section discusses the details of the methodology used in this work. The first step is to obtain both the Twitter data and the financial data. Then, the sentiment and volume of the tweets are computed. Next, we run the Granger causality test to identify the causality relationship between the tweets and the financial market. After finding the causality relationship, we run the Pearson correlation to determine the type and strength of the correlations. The following sections present the details of these steps.

A. Obtaining Twitter Data
The Twitter data are obtained using the tweepy API by passing a set of keywords identified as related to the financial market. We obtained this list of keywords from a financial expert who is familiar with social media and with Twitter posts. Table I presents the list of Arabic keywords along with their English translations. The tweets were collected from March 6, 2016 to April 16, 2016. The total number of collected tweets is 1,500,223, covering all days in the period of interest. As suggested in the literature, it is interesting in a problem like ours to consider the working days only. So, we filter out the tweets posted on weekends and are left with 1,137,543 tweets. We consider these numbers large enough to support the analysis.
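The weekend filtering and daily volume counting described above can be sketched as follows. This is an illustration only: the tweet record layout (a dictionary with a `created_at` timestamp), the function names, and the choice of Friday and Saturday as weekend days are assumptions, and the actual weekend differs between countries.

```python
from collections import Counter
from datetime import datetime

# Assumption: Friday (4) and Saturday (5) are the weekend days.
# datetime.weekday() numbers the days from Monday = 0.
WEEKEND_DAYS = {4, 5}


def working_day_tweets(tweets):
    """Keep only tweets whose timestamp falls on a working day."""
    return [t for t in tweets if t["created_at"].weekday() not in WEEKEND_DAYS]


def daily_volume(tweets):
    """Count tweets per calendar day."""
    return Counter(t["created_at"].date() for t in tweets)


# Hypothetical records; the text field is left elided.
sample = [
    {"created_at": datetime(2016, 3, 6, 9, 30), "text": "..."},   # Sunday
    {"created_at": datetime(2016, 3, 11, 14, 0), "text": "..."},  # Friday
    {"created_at": datetime(2016, 3, 13, 8, 15), "text": "..."},  # Sunday
]
kept = working_day_tweets(sample)
volume = daily_volume(kept)
```

Keeping the weekend set in one constant makes it easy to rerun the filter with a different weekend definition per market.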

B. Obtaining Financial Data
The financial data are obtained from the most active Arabic Stock Market Indices. Specifically, we focus on the following indices: Saudi Arabia (TASI), Abu Dhabi (ADI), Qatar (QSI), Dubai (DFMGI), Kuwait (KWSE), and Egypt (EGX30). For each of these indices, we obtain the daily historical data. The historical data include the open, close, high, low, volume, and return values for a given day.
www.ijacsa.thesai.org

C. Data Processing
In order to perform the causality testing, we first need to aggregate the Twitter data on a daily basis and compute the volume and sentiment of the tweets. We then correlate tweet volume and sentiment with stock market returns. The stock market returns are available in the historical data of the Arabic Stock Market Indices under the name " " (change), which can be computed as the Daily Stock market Return (DSR) shown in Equation (1):
\[ \mathrm{DSR}_d = \frac{C_d - C_{d-1}}{C_{d-1}} \tag{1} \]
where $C_d$ is the closing price for a given stock at day $d$.
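A minimal sketch of the daily return computation follows, assuming the usual simple-return form (C_d - C_{d-1}) / C_{d-1} over consecutive closing prices. The function name and the sample prices are made up for illustration.

```python
def daily_stock_returns(closes):
    """Return (C_d - C_{d-1}) / C_{d-1} for each day after the first."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]


closes = [100.0, 102.0, 99.96]  # made-up closing prices
returns = daily_stock_returns(closes)
```

The result has one entry fewer than the price list, since the first day has no previous close to compare against.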
The stock market returns are available for working days only. In this research, we handle the issue of weekends in two different ways. The first one is to completely ignore them. This requires excluding tweets posted during the weekends. The second way is to include weekends in the study. This requires an additional processing step to approximate the stock return values for the weekend days. We follow the approximation process adopted by [14], which defines the Approximated Stock Returns (ASR) as shown in Equation (2).
This approximation simply takes the average of two days, one before and one after that day. Figure 2 shows the daily stock market returns of the Arabic Stock Market Indices. Figure 3 shows the same distribution but with approximated stock market return values for the weekends.
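The weekend approximation can be sketched as below. This is one possible reading of the averaging rule, not necessarily the exact form of Equation (2): every day without a return value receives the mean of the nearest available return before it and the nearest available return after it. The function name and the `None` convention for closed days are assumptions.

```python
def approximate_missing_returns(returns):
    """Fill None entries with the mean of the nearest known values on each side."""
    filled = list(returns)
    for i, value in enumerate(returns):
        if value is not None:
            continue
        # Nearest known return before day i, searching backwards.
        before = next((returns[j] for j in range(i - 1, -1, -1)
                       if returns[j] is not None), None)
        # Nearest known return after day i, searching forwards.
        after = next((returns[j] for j in range(i + 1, len(returns))
                      if returns[j] is not None), None)
        if before is not None and after is not None:
            filled[i] = (before + after) / 2
    return filled


# Thursday, Friday (closed), Saturday (closed), Sunday; values are made up.
week = [0.01, None, None, 0.03]
filled_week = approximate_missing_returns(week)
```

Under this reading both weekend days receive the same value; gaps at the very start or end of the series are left unfilled because only one neighbor exists.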
As shown in Figures 1, 2, and 3, one can see a clear relation between the daily distribution of the tweets and the stock return values for the Egypt stock market index (EGX30). The neutral tweet count and the total tweet count show a higher degree of correlation compared with the positive and negative tweet counts. The correlation between Twitter data and the stock return values for EGX30 is positive, which means that, as the volume of tweets increases, the stock return values increase. Other stock market indices show a negative correlation where

D. Applying SentiStrength
The well-known SentiStrength [19] tool for SA is used to detect and measure the strength of the sentiments expressed in the tweets. SentiStrength was originally developed for English and was later adapted for other languages, including Arabic [20]. To use SentiStrength for the tweets we collected, the default SentiStrength data files must be modified for the Arabic language. The output of SentiStrength is a sentiment score that falls in one of two ranges: [1, 5] for positive sentiment and [-5, -1] for negative sentiment. The overall score is positive if the positive scores are greater than the negative ones. Similarly, the overall score is negative if the negative scores are greater than the positive ones. When the positive and negative scores are equal, the overall score is neutral.
The researchers in [19,20] conducted a comparative study between different SA tools that support Arabic. Their results showed that SentiStrength is the best tool that can be used to measure the sentiment presented in Arabic text.
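The overall labeling rule can be sketched as a small function. It does not call SentiStrength itself; it only combines a positive score and a negative score that are assumed to have been produced already. Comparing the scores by magnitude and the function name are assumptions made for illustration.

```python
def overall_sentiment(positive, negative):
    """Combine a positive score in [1, 5] and a negative score in [-5, -1]."""
    if not (1 <= positive <= 5 and -5 <= negative <= -1):
        raise ValueError("scores outside the expected ranges")
    # Assumption: the two scores are compared by magnitude.
    if positive > -negative:
        return "positive"
    if positive < -negative:
        return "negative"
    return "neutral"
```

For example, a tweet scored 3 and -1 would be labeled positive, while a tweet scored 2 and -2 would be labeled neutral.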

E. Computing tweets sentiments and volumes
To compute the sentiments presented in tweets, we use the SentiStrength tool [19]. After computing the sentiment scores, we perform a set of statistical operations that result in the measures used in this research to perform the causality and correlation analysis. The measure values are computed on a daily basis. The following is the list of measures used in this research: Total Tweets, Positive Tweets Count, Negative Tweets Count, Neutral Tweets Count, Net Values, and Sentiment Polarity.
The Sentiment Polarity measure was proposed by [14]. It is defined in Equation (3).
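A minimal sketch of how the daily measures could be aggregated from per-tweet labels is given below. Two definitions are assumptions rather than statements of the formulas used here: the net value is taken as positives minus negatives, and the sentiment polarity as that difference divided by the number of positive and negative tweets. The function and key names are hypothetical.

```python
from collections import Counter


def daily_measures(labels):
    """Aggregate one day's tweet labels ('positive', 'negative', 'neutral')."""
    counts = Counter(labels)
    pos, neg, neu = counts["positive"], counts["negative"], counts["neutral"]
    # Assumption: net value is the positive count minus the negative count.
    net = pos - neg
    # Assumption: polarity normalises the net value by the number of
    # opinionated tweets; 0.0 is returned when there are none.
    polarity = net / (pos + neg) if pos + neg else 0.0
    return {
        "total": pos + neg + neu,
        "positive": pos,
        "negative": neg,
        "neutral": neu,
        "net": net,
        "polarity": polarity,
    }


# Made-up labels for a single day.
day = ["positive"] * 3 + ["negative"] + ["neutral"] * 6
measures = daily_measures(day)
```

Running this per calendar day yields one time series per measure, which is the form needed for the causality and correlation tests.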

F. Applying Granger Causality
Granger causality is applied separately on the two sets of financial data we have (the one without weekend values and the one with approximated weekend values). Recall that when applying causality analysis on the data including the weekends, we need to approximate the stock market values for the weekend days, since the stock market values are not available for the weekends. We perform the approximation based on the approach adopted by [14,4], using the average of the two days before and after the weekend.

G. Computing Pearson Correlation
After identifying causality relationships between the tweets' sentiments and volumes and the stock return values, we run the Pearson correlation test to determine the direction and strength of the correlations between the sentiment measures used and the daily stock return values.
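The correlation step can be sketched with a direct implementation of the Pearson coefficient, the covariance of the two series divided by the product of their standard deviations. The function name and the sample series are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up daily tweet counts and two made-up return series.
tweet_counts = [1.0, 2.0, 3.0, 4.0]
rising_returns = [2.0, 4.0, 6.0, 8.0]
falling_returns = [8.0, 6.0, 4.0, 2.0]
```

A value near 1 for the first pair and near -1 for the second pair illustrates the positive and negative directions discussed in the results.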

IV. RESULTS AND DISCUSSION
In this section, we present and analyze the results of the Granger causality testing process between the sentiment measures we compute from the tweets we collect and the financial data. This section also discusses the results of applying the Pearson correlation test.
The result of the Granger causality test between two variables X and Y is presented as the probability associated with the null hypothesis that variable X does not Granger-cause variable Y. Lower probability values represent stronger evidence against this hypothesis. When the probability value is less than a certain threshold, such as 0.05 or 0.1, the null hypothesis is rejected in favor of the alternative that X Granger-causes Y. The Granger causality test presents the results for a specific lag, which indicates the number of past days included in the causality testing process.
In the following, we discuss the results presented in the tables. In these tables, the colored cells refer to the significance level used: yellow is used for significance level 0.1 and green is used for significance level 0.05.
Table II presents the results of the Granger causality test between tweet sentiment and volume and the stock change values for the Arab stock market indices used in this research. The table shows that for the Saudi Arabia (TASI) index, there is Granger causality at significance level 0.1 for the measures negative count at lag 5, neutral count at lags 2 and 3, and net value at lag 5, and Granger causality with the total number of tweets at significance level 0.05 at lag 5. For the Abu Dhabi (ADI) index, there is Granger causality at significance level 0.1 for the measure positive count at lag 4, and Granger causality at significance level 0.05 with the total positive count at lags 2 and 3. For the Qatar (QSI) index, there is Granger causality at significance level 0.1 for the measure positive count at lags 2 and 3. For the Dubai (DFMGI) index, there is Granger causality at significance level 0.1 for the measures positive and total counts at lag 2, neutral count at lag 1, and negative count at lag 5. For the Egypt (EGX30) index, there is Granger causality at significance level 0.1 for the measures positive count at lag 4, neutral count at lags 2 and 3, and net value at lag 1, and Granger causality with the total number of tweets, sentiment polarity, and neutral count at significance level 0.05 at lag 1. The results show that the Kuwait (KWSE) index was not useful in studying the causality relationship.
Table III presents the Pearson correlation results for the measures that pass the Granger causality test. The table shows a negative correlation between the measures and the stock indices for most of the indices. The only exception is the Egypt index, which has a positive correlation with the measures.
Table IV presents the results of the Granger causality test between tweet sentiment and volume and the stock change values for the Arab stock market indices used in this research while including weekends and holidays. The table shows that for the Abu Dhabi (ADI) index, there is Granger causality at significance level 0.05 for the measure negative count at lags 3, 4, and 5. For the Qatar (QSI) index, there is Granger causality at significance level 0.1 for the measure negative count at lag 5. For the Dubai (DFMGI) index, there is Granger causality at significance level 0.1 for the measure positive count at lag 2. For the Egypt (EGX30) index, there is Granger causality at significance level 0.1 for the total measure at lag 3, and Granger causality at significance level 0.05 with the total number of tweets at lags 1, 2, and 4, positive count at lags 4 and 5, and neutral count at all lags. The results show that the Kuwait (KWSE) and the Saudi Arabia (TASI) indices were not useful in studying the causality relationship.
Table V presents the Pearson correlation results for the measures that pass the Granger causality test. The table shows a negative correlation between the measures and the stock indices for most of the indices, except for the Egypt index, which has a positive correlation with the measures.
Our results conform with those of [14], where it was shown that the sentiment polarity is not able to capture the causality relation for all indices. Our results also conform with those of [15,16], where it was shown that tweet sentiment and volume affect the stock price changes. The same can be said for [3,4], even though they used different sentiment behaviors that are based on the emotions presented in the tweets' content. Finally, our results conform with some of the results presented in [17,18].

V. CONCLUSION
In this work, we study the relation between Twitter financial data and the stock price changes of a set of Arab stock market indices. The study is conducted with two sets of financial data. The first one considers only working days when studying the causality relationship between tweet sentiment and volume and the stock price changes. The results of this phase show a Granger causality relationship between tweet sentiment and volume and stock change for most of the indices used in this research. The second one also includes weekends and holidays, and including them has a negative effect on the causality relationship. The results of this study show that the best stock market index that can be used to study the relation between Twitter sentiment and volume and stock market change is the Egyptian index (EGX30).

Figure 1: Twitter data volume and sentiment daily distributions. The column bars in the figure are for the weekend days. The figure shows that the average numbers of total, positive, negative, and neutral tweets are 37,506, 4,240, 7,106, and 26,159, respectively.

TABLE I: Arabic financial keyword list

TABLE II: Granger causality test results for working days only