Investigating Epidemic Growth of COVID-19 in Saudi Arabia based on Time Series Models

Predictive mathematical models for simulating the spread of the COVID-19 pandemic are a fundamental approach to understanding the infection growth curve of the epidemic and planning effective control strategies. Time series predictive models are among the most important mathematical models for studying the pandemic growth curve. In this study, three time series models (the Susceptible-Infected-Recovered-Death (SIRD) model, the Susceptible-Exposed-Infected-Recovered-Death (SEIRD) model, and the Susceptible-Exposed-Infected-Quarantine-Recovered-Death-Insusceptible (SEIQRDP) model) have been investigated and simulated on a real dataset to study the spread of the COVID-19 outbreak in Saudi Arabia. The simulation results and evaluation metrics showed that the SIRD and SEIQRDP models provided the minimum error between reported data and fitted data, so these two models are used for predicting the end of the pandemic in Saudi Arabia. According to the prediction computations of the SEIQRDP model, the COVID-19 growth curve will stabilize with zero detected active cases on 2 February 2021. According to the SIRD model, the outbreak will stabilize, with active cases still detected, after July 2021.


I. INTRODUCTION
The rapid and continuous spread of the COVID-19 pandemic throughout the world still represents a major dilemma for all countries. According to situation report 205 issued by the World Health Organization (WHO) on 12 August 2020, there are more than 20M infected cases of COVID-19 and more than 730,000 deaths globally [1]. The Kingdom of Saudi Arabia (KSA) is one of the largest countries in the Arab world and has recorded more than 293,000 infected cases and 3,000 deaths due to the COVID-19 pandemic [2]. While stopping the spread of the infection has become an extremely big challenge according to the WHO, countries have implemented strict measures to control the infection growth. In KSA, the government applied countermeasures such as issuing a social distancing app, "Tabaud", to notify citizens if they came into contact with an individual infected with COVID-19 [3]. KSA also suspended Umrah and set COVID-19 protocols and restrictions for a limited Hajj this year. Moreover, the Ministry of Interior issued a package of provisions and penalties for violators of the measures and protocols taken to suppress the spread of the pandemic.
Although KSA was among the first countries to take precautionary and preventive countermeasures to curb the COVID-19 outbreak, using all its capabilities to suppress its spread, the countermeasures taken up to the time of writing this paper have not brought the growth of infected cases in the Kingdom to zero. The Corona Tracker report published on 21 August 2020 stated that KSA has 303,973 confirmed cases and 3,548 deaths [4].
Harnessing predictive mathematical models for pandemics [5][6][7][8] is necessary and fundamental to trace the epidemic and to plan effective control procedures [9][10]. Predictive mathematical models have long provided quantitative information about outbreaks as well as useful recommendations and guidelines for pandemic control and decision making. Hence, investigating epidemic diseases mathematically has become a necessary and important issue [11].
www.ijacsa.thesai.org
In this study, we use three time series models, the Susceptible-Infected-Recovered-Death (SIRD) model, the Susceptible-Exposed-Infected-Recovered-Death (SEIRD) model, and the generalized Susceptible-Exposed-Infected-Quarantine-Recovered-Death-Insusceptible (SEIQRDP) model, for predicting COVID-19 spread in KSA. The three models have been simulated on a real dataset obtained from [12], and their performance has been investigated and tested across four periods of time on this dataset. The three models were first tested for fitting the data of COVID-19 spread in KSA, and the best-fitted models were then selected for predicting the COVID-19 outbreak in KSA. The best-fitted models are chosen by the least Mean Square Error (MSE), Mean Absolute Percentage Error (MAPE), and Mean Absolute Deviation (MAD) between fitted data and reported data. The simulation and experimental results showed that the SEIQRDP model achieved good prediction results regarding the pandemic growth and end.
The rest of this paper is organized as follows. Section II reviews the related literature. Section III discusses the three models, SIRD, SEIRD, and SEIQRDP. The subsequent sections present the simulation and experimental results and the prediction results. Section VI discusses the obtained results. Finally, Section VII presents the conclusion and future work.

II. LITERATURE REVIEW
The literature includes several studies that mathematically examine the infection growth of the COVID-19 outbreak. Time series analysis models are common techniques that have been utilized to effectively model, estimate, and predict the growth of the COVID-19 pandemic [13][14]. Zeynep in [15] applied Auto-Regressive Integrated Moving Average (ARIMA) time series models to the spread of COVID-19 in the three European countries most affected by COVID-19: Italy, Spain, and France. This study showed that ARIMA models are appropriate techniques for forecasting future COVID-19 spread and provide a good understanding of the epidemiological stage of these countries.
Elmousalami et al. in [16] investigated three time series models, moving average (MA), weighted moving average (WMA), and single exponential smoothing (SES), to compare day-level forecasting models on COVID-19 cases (i.e., confirmed, recovered, and deaths). The three models have been simulated on a real dataset, and the results indicated that the SEIRD model is the most effective and accurate technique for predicting confirmed, recovered, and death cases of COVID-19 in Egypt.
Cooper et al. in [17] studied the effectiveness of modeling COVID-19 spread in different countries using the Susceptible-Infected-Removed (SIR) model. The simulation results showed the importance of modeling COVID-19 spread with the SIR model, which can help estimate the impact of the pandemic by offering valuable prediction results for China, South Korea, India, Australia, the USA, Italy, and the state of Texas in the USA.
The SIR model can also be used for predicting the peak and the end of the epidemic and for investigating the effect of asymptomatic infection on the spread of the COVID-19 outbreak, herd immunity variables, and social distancing parameters [18].
An extended version of the SIR model is the Susceptible-Exposed-Infectious-Recovered-Dead (SEIRD) model, which can be used as another time series model for modeling COVID-19 spread [19]. Maguire et al. in [20] used the SEIRD model for modeling COVID-19 spread in Sicily, Italy. The experimental results showed a good fit between reported data and the data estimated using the SEIRD model.
Fractional-order differential equations add further solutions to the study of the COVID-19 outbreak. Fractional versions of many epidemic models have therefore been studied in various works, as in [5] and [21][22][23].

III. TIME SERIES PREDICTION MODELS
This section discusses three selected time series prediction models we used for predicting the epidemic growth of COVID-19 in Saudi Arabia.
In this paper, the generalized SEIQRDP model is used to visualize the epidemic growth of COVID-19 in Saudi Arabia and is compared with two other models, SEIRD and SIRD. The SEIRD and SIRD models have been derived from the generalized SEIQRDP model. The following subsections explain the three models in more detail.

A. Susceptible, Exposed, Infection, Quarantined, Recovery, Death, Insusceptible (SEIQRDP) Model
In the SEIQRDP model (Fig. 1), seven infection states are considered: susceptible cases S(t), exposed cases E(t), infectious cases I(t), quarantined cases Q(t), recovered cases R(t), dead cases D(t), and insusceptible cases P(t).
This model depends on six infection transition equations that can be used for studying the spread of COVID-19, covering the susceptible cases S(t), infectious cases I(t), exposed cases E(t), quarantined cases Q(t), recovered cases R(t), and dead cases D(t). The SEIQRDP system equations are given in (1).
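For orientation, a formulation of the generalized SEIR (SEIQRDP) dynamics that is commonly used in the literature can be written as below. This is a sketch under assumptions rather than a reproduction of (1): the parameter symbols (protection rate \alpha, infection rate \beta, inverse latent period \gamma, quarantine rate \delta, time-dependent recovery rate \lambda(t), time-dependent mortality rate \kappa(t), and total population N) are assumed here and may differ from the notation of the paper. In this form, the insusceptible compartment P(t) has its own equation in addition to the six states listed above.

```latex
\begin{aligned}
\frac{dS(t)}{dt} &= -\beta \frac{S(t)\, I(t)}{N} - \alpha S(t), \\
\frac{dE(t)}{dt} &= \beta \frac{S(t)\, I(t)}{N} - \gamma E(t), \\
\frac{dI(t)}{dt} &= \gamma E(t) - \delta I(t), \\
\frac{dQ(t)}{dt} &= \delta I(t) - \lambda(t)\, Q(t) - \kappa(t)\, Q(t), \\
\frac{dR(t)}{dt} &= \lambda(t)\, Q(t), \\
\frac{dD(t)}{dt} &= \kappa(t)\, Q(t), \\
\frac{dP(t)}{dt} &= \alpha S(t).
\end{aligned}
```

The right-hand sides sum to zero, so the total S + E + I + Q + R + D + P stays equal to N, which is a quick consistency check for any implementation.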

B. Susceptible, Infection, Recovery, Death (SIRD) Model
This model depends on four infection transition equations that can be considered in the mathematical analysis of COVID-19 spread, covering the susceptible cases S(t), infectious cases I(t), recovered cases R(t), and dead cases D(t) (Fig. 2). We modified the generalized SEIQRDP system equations to produce the SIRD system equations in (2).
Similarly, we modified the generalized SEIQRDP system equations to produce the SEIRD system equations in (3).
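The four SIRD compartments can be illustrated with a short numerical sketch. It assumes the textbook SIRD form (dS/dt = -beta S I / N, dI/dt = beta S I / N - gamma I - mu I, dR/dt = gamma I, dD/dt = mu I), which may differ from the system in (2), and integrates it with a classical fourth-order Runge-Kutta step. The function names, the parameter values, and the initial state are hypothetical and chosen only for illustration; they are not fitted to the Saudi Arabia dataset.

```python
def sird_derivatives(state, beta, gamma, mu, population):
    """Right-hand side of the assumed textbook SIRD system."""
    s, i, r, d = state
    new_infections = beta * s * i / population
    return (
        -new_infections,                       # dS/dt
        new_infections - gamma * i - mu * i,   # dI/dt
        gamma * i,                             # dR/dt
        mu * i,                                # dD/dt
    )


def rk4_step(state, dt, beta, gamma, mu, population):
    """Advance the state by one classical Runge-Kutta (RK4) step of size dt."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))

    args = (beta, gamma, mu, population)
    k1 = sird_derivatives(state, *args)
    k2 = sird_derivatives(shifted(state, k1, dt / 2), *args)
    k3 = sird_derivatives(shifted(state, k2, dt / 2), *args)
    k4 = sird_derivatives(shifted(state, k3, dt), *args)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + e)
        for x, a, b, c, e in zip(state, k1, k2, k3, k4)
    )


def simulate_sird(initial_state, beta, gamma, mu, days, steps_per_day=10):
    """Return one (S, I, R, D) tuple per day, including day 0."""
    population = float(sum(initial_state))
    dt = 1.0 / steps_per_day
    state = tuple(float(x) for x in initial_state)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma, mu, population)
        trajectory.append(state)
    return trajectory


# Hypothetical parameters and initial state, for illustration only.
trajectory = simulate_sird(
    initial_state=(999_000, 1_000, 0, 0),
    beta=0.25, gamma=0.08, mu=0.002, days=200,
)
peak_day = max(range(len(trajectory)), key=lambda day: trajectory[day][1])
print("day of peak infections:", peak_day)
```

In a fitting experiment, beta, gamma, and mu would be estimated by minimizing the error between the simulated and reported curves; that estimation step is not shown here.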

IV. SIMULATION AND EXPERIMENTAL RESULTS
The three mathematical models, SIRD, SEIRD, and SEIQRDP, have been applied to the Saudi Arabia dataset collected from [12]. The dataset presents the numbers of active cases (i.e., infected cases that are still infected), recovered cases, and death cases in Saudi Arabia between 2 February 2020 and 10 August 2020. Fig. 4 visualizes the three classes of our dataset. The three models have been tested using three metrics: Mean Absolute Deviation (MAD), Mean Square Error (MSE), and Mean Absolute Percentage Error (MAPE).
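The three metrics can be sketched as follows, assuming their standard forecasting definitions: MAD as the mean of |reported - fitted|, MSE as the mean of (reported - fitted)^2, and MAPE as the mean of |reported - fitted| / |reported| expressed in percent. The function names and the small example series are hypothetical, and skipping zero-valued reported points in MAPE is a choice made for this sketch only.

```python
def _check_lengths(reported, fitted):
    if len(reported) != len(fitted) or len(reported) == 0:
        raise ValueError("series must be non-empty and of equal length")


def mean_absolute_deviation(reported, fitted):
    """Mean of the absolute differences between reported and fitted values."""
    _check_lengths(reported, fitted)
    return sum(abs(r - f) for r, f in zip(reported, fitted)) / len(reported)


def mean_square_error(reported, fitted):
    """Mean of the squared differences between reported and fitted values."""
    _check_lengths(reported, fitted)
    return sum((r - f) ** 2 for r, f in zip(reported, fitted)) / len(reported)


def mean_absolute_percentage_error(reported, fitted):
    """Mean absolute relative error in percent; zero reported values are skipped."""
    _check_lengths(reported, fitted)
    terms = [abs((r - f) / r) for r, f in zip(reported, fitted) if r != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all reported values are zero")
    return 100.0 * sum(terms) / len(terms)


# Hypothetical series, for illustration only (not taken from the dataset).
reported_cases = [100, 200, 400]
fitted_cases = [110, 190, 380]
print(mean_absolute_deviation(reported_cases, fitted_cases))
print(mean_square_error(reported_cases, fitted_cases))
print(mean_absolute_percentage_error(reported_cases, fitted_cases))
```

MSE penalizes large deviations more strongly than MAD, while MAPE is scale-free, which is why the three metrics can rank the same fits differently.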

B. Testing SIRD, SEIRD, and SEIQRDP Models from 1/4/2020 to 10/8/2020
The actual reported data of recovered, active, and death cases (from 1/4/2020 to 10/8/2020) has been compared with the fitted recovered, active, and death cases results of the three models, SIRD, SEIRD, and SEIQRDP. Table II summarizes and compares the testing results, and Fig. 6 visualizes and compares the fitted results of the three models from 1/4/2020 to 10/8/2020.

C. Testing SIRD, SEIRD, and SEIQRDP Models from 1/5/2020 to 10/8/2020
The actual reported data of recovered, active, and death cases (from 1/5/2020 to 10/8/2020) has been compared with the fitted recovered, active, and death cases results of the three models, SIRD, SEIRD, and SEIQRDP. Table III summarizes and compares the testing results, and Fig. 7 visualizes and compares the fitted results of the three models from 1/5/2020 to 10/8/2020.

D. Testing SIRD, SEIRD, and SEIQRDP Models from 15/6/2020 to 10/8/2020
The actual reported data of recovered, active, and death cases (from 15/6/2020 to 10/8/2020) has been compared with the fitted recovered, active, and death cases results of the three models, SIRD, SEIRD, and SEIQRDP. Table IV summarizes and compares the testing results, and Fig. 8 visualizes and compares the fitted results of the three models from 15/6/2020 to 10/8/2020.
In this section, the three models, SIRD, SEIRD, and SEIQRDP, have been simulated again to evaluate their predictions against the observed values within the period from 11/8/2020 to 4/9/2020. For each model, the prediction is based on the period within which that model provided the least Mean Square Error (MSE) of fitting data, as summarized in Table V. Fig. 9 depicts the prediction results of the three models within the period from 11/8/2020 to 4/9/2020: a) SIRD model, b) SEIRD model, and c) SEIQRDP model.
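The selection of the best fitting period per model can be sketched in a few lines. The function name is hypothetical, and the MSE values below are placeholders invented for illustration; they are not the values reported in Table V, and only three of the period start dates are used as labels.

```python
def best_period_per_model(mse_by_model):
    """Map each model to the (period, mse) pair with the smallest MSE."""
    best = {}
    for model, mse_by_period in mse_by_model.items():
        period = min(mse_by_period, key=mse_by_period.get)
        best[model] = (period, mse_by_period[period])
    return best


# Placeholder MSE values, for illustration only (NOT the values of Table V).
example_mse = {
    "SIRD": {"1/4/2020": 4.0, "1/5/2020": 2.5, "15/6/2020": 3.1},
    "SEIRD": {"1/4/2020": 9.0, "1/5/2020": 8.2, "15/6/2020": 7.4},
    "SEIQRDP": {"1/4/2020": 1.9, "1/5/2020": 2.2, "15/6/2020": 2.0},
}

for model, (period, mse) in best_period_per_model(example_mse).items():
    print(f"{model}: fit starting {period} has the least MSE ({mse})")
```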
According to these results, SIRD and SEIQRDP are the models that provided good prediction results within the period from 11/8/2020 to 4/9/2020 compared to the observed data. Hence, we used these two models to predict the end of the COVID-19 outbreak in Saudi Arabia. According to the prediction results of the SIRD model, the SIRD curve stabilizes, with new active cases of COVID-19 still detected, after July 2021, as depicted in Fig. 10. According to the prediction results of the SEIQRDP model, the SEIQRDP curve stabilizes with zero active cases of COVID-19 on 2/2/2021, as depicted in Fig. 11.
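One way to read an end date off a projected curve is to take the first day after which the projected number of active cases stays below a small threshold. The sketch below illustrates this idea under stated assumptions: the function name, the threshold of 0.5 cases (values that round to zero), and the synthetic exponentially decaying curve are all hypothetical and are not the model output shown in Fig. 10 or Fig. 11.

```python
from datetime import date, timedelta


def first_sustained_zero_date(start_date, projected_active, threshold=0.5):
    """Return the first date from which all projected values stay below threshold.

    projected_active[k] is the projected number of active cases k days after
    start_date. Returns None if the curve never settles below the threshold.
    """
    last_above = -1
    for index, value in enumerate(projected_active):
        if value >= threshold:
            last_above = index
    first_below = last_above + 1
    if first_below >= len(projected_active):
        return None
    return start_date + timedelta(days=first_below)


# Synthetic decaying projection, for illustration only (not model output).
synthetic_projection = [1000 * 0.9 ** day for day in range(120)]
end_date = first_sustained_zero_date(date(2020, 8, 11), synthetic_projection)
print("projected end date for the synthetic curve:", end_date)
```

Requiring the curve to stay below the threshold, rather than stopping at the first dip, avoids reporting an end date when the projection rebounds afterwards.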

VI. DISCUSSION
Prior studies have noted the importance of using various mathematical models for predicting the spread of the COVID-19 pandemic. Most of these studies are based on time series models [13][14][15][16][17][18][19][20], and other studies are based on differential equation models [5][21][22][23]. However, very little work was found in the literature on predicting the end of the COVID-19 outbreak in Saudi Arabia.
The most interesting finding was that SIRD and SEIQRDP are the best models for predicting the pandemic growth and end in Saudi Arabia, based on the average Mean Square Error (MSE). The results showed that the SEIQRDP curve stabilizes with zero active cases on 2/2/2021, as depicted previously in Fig. 11. Another important finding was that the SIRD curve stabilizes, with active cases of COVID-19 still detected, after July 2021, as depicted previously in Fig. 10.
However, the SEIRD model did not provide satisfactory prediction results compared to the SIRD and SEIQRDP models.
These results seem to be consistent with other research that found the SIRD model [24][25] and the SEIQRDP model [26] to be effective models for predicting COVID-19 pandemic growth. However, this study is contrary to [19][20], which depend upon the SEIRD model as a good prediction model for estimating the pandemic growth in Italy. This inconsistency may be due to differences in the features of the datasets on which the model is simulated. Hence, these findings may be somewhat limited by the features of the datasets and the simulation environment. The present results are significant in at least two major respects: comparing three prediction models for predicting the pandemic growth curve of the COVID-19 outbreak in Saudi Arabia, and predicting the end of the pandemic based on the best prediction model.
A further study with more focus on differential equation-based models is therefore suggested to study the COVID-19 outbreak growth and predict its end by conducting different simulations with different datasets in different countries.

VII. CONCLUSION
The present study was designed to investigate the application of three time series models for studying COVID-19 growth in Saudi Arabia. The study simulated the mathematical systems of SIRD, SEIRD, and SEIQRDP on a real dataset of Saudi Arabia. The study presented a set of comparative analyses on this dataset to evaluate the effectiveness of the three models in predicting the COVID-19 pandemic growth as well as the end date of the outbreak. The findings of this study show that the SIRD and SEIQRDP models provide good prediction results about the pandemic growth and its end date in Saudi Arabia. According to the prediction computations of the SEIQRDP model, the COVID-19 growth curve will stabilize with zero detected active cases on 2 February 2021. According to the SIRD model, the outbreak will stabilize, with active cases still detected, after July 2021.
This new understanding should help to improve predictions of the impact of using the SIRD and SEIQRDP models for studying the COVID-19 growth curve in different datasets that have various infection dynamics in different countries.
For more prediction accuracy, a further study with more focus on differential equation-based models is needed to study the COVID-19 outbreak growth and predict its end. This can be achieved by conducting different simulations on different datasets in different countries using such models.