2-D Deep Convolutional Neural Network for Predicting the Intensity of Seismic Events

Abstract: Machine learning has advanced rapidly in the last decade, promising to significantly change and improve big data analysis in a variety of fields. Compared with traditional methods, machine learning provides significant advantages in complex problem solving, computing performance, uncertainty propagation and handling, and decision support. In this paper, we present a novel end-to-end strategy for improving the overall accuracy of earthquake detection by simultaneously improving each step of the detection pipeline. In addition, we propose a Conv2D convolutional neural network (CNN) architecture for processing seismic waveforms collected across a geophysical system. The proposed Conv2D method for earthquake detection was compared with various machine-learning approaches and state-of-the-art methods. All methods were trained and tested on real data collected in Kazakhstan from 1906 to 2022. The proposed model outperformed the other models, with accuracy, precision, recall, F-score, and AUC-ROC of 87.4%, 63%, 82.4%, 62.7%, and 83%, respectively. Based on the results, we conclude that the proposed Conv2D model is useful for predicting real-world earthquakes in seismic zones.


I. INTRODUCTION
Over the last decade, the number and magnitude of induced earthquakes have increased dramatically, and several technological innovations [1] for effective disaster management have been developed. Nonetheless, much work remains to be done to mitigate the impact and harm caused by uncontrollable natural disasters. The proposed study is an important step toward effectively implementing modern technologies for accurate disaster detection, which remains a critical and major goal for successful emergency management.
Every year, a large number of seismic events occur around the world as a result of the release of accumulated pressure in the Earth's mantle. Catastrophic earthquakes in hilly areas can cause collapses on high mountainsides [2]. These and other seismic events may cause environmental damage, failures of critical infrastructure, and destruction of housing developments, ultimately resulting in severe economic losses and human casualties. Furthermore, earthquake-caused landslides dam river systems, forming lakes that can produce debris flows and outburst floods that endanger people and property downstream [3,4]. Characterizing and forecasting the geographical distribution of earthquake-induced landslides is therefore recommended for disaster mitigation [5].
Determining the geographic patterns of earthquakes is difficult because the main determinants of earthquake occurrences are their parameters, landscape, soil characteristics, tectonic plates, and human impacts [6]. As a result, it may be difficult to predict where landslides will occur after an earthquake [7]. Over the last 20 years, many prediction models have been developed to pinpoint locations vulnerable to landslides caused by earthquakes, and they can be divided into two categories: (1) models with a physical and numerical foundation [8]; and (2) models for determining susceptibility [9].
The physically based models were developed using the mechanisms of slope failure initiation and runout. The first of these methods, pseudo-static analysis, treats the earthquake force as an additional permanent force in the static equilibrium equations [10]. Pseudo-static modelling is theoretically simple, although selecting a pseudo-static coefficient requires judgment and the method tends to yield conservative results [11]. Following that, the stress-deformation assessment method was proposed as an extension of finite-element simulation that is capable of simulating dynamic slope deformation.
This method, which is based on mathematical calculations and has the potential to resolve physical issues such as complicated geometries, material properties, and boundary conditions, is appropriate for investigating the stability of artificial slopes [12]. Permanent-displacement examination was proposed shortly after its deployment to calculate the displacements of landslides caused by seismic activity. Its sophistication is somewhere between the two methods discussed above. Landslides are modelled as rigid-plastic bodies sliding on an inclined plane in this analysis. The Newmark model and its variants are the most commonly used models in permanent-displacement analysis [13].
Over the last few years, advances in the analysis of earthquake-induced landslide (EQIL) mechanisms have greatly improved the accuracy of physically based models. However, because of the enormous number of parameter values required, physically based models can only be used in a limited number of locations [11].
Later, researchers began investigating susceptibility assessment methods that could capture a possible link between earthquake occurrence and causal factors in order to identify earthquake-prone areas [14]. Landslide susceptibility modelling has grown in popularity over the last decade due to rapid advances in technology, geographic information systems (GIS), and data analysis [15]. These models are divided into two categories: knowledge-driven approaches based on expert knowledge and data-driven models based on historical landslide inventories and associated spatial data on landslide causes [16].
The knowledge-driven models use expert knowledge to explain quantitatively the link between the incidence of landslides and causal factors [17]; the analytic hierarchy process is the most representative of them. For the data-driven models, numerous statistical and machine learning techniques have been developed to predict the likelihood of a landslide. These approaches primarily include multivariate logistic regression and artificial neural networks (ANN) [18]. According to the reported results, data-driven models outperform knowledge-driven models in susceptibility mapping. Data-driven models can also detect earthquake dispersion patterns that the human eye cannot [19].
The majority of the data-driven models discussed above, which are classified as classic machine-learning techniques [20][21], can represent a single layer of linear or nonlinear relationships between causal variables and the incidence of landslides. As a result, when dealing with complex data, such models are prone to overfitting or becoming trapped on a local optimum [22]. However, due to an earthquake inventory constraint that always captures an earthquake as points or polygons, these landslide prediction models rarely take the distinction between landslide source and accumulation into account.
In the last ten years, numerous deep learning algorithms have produced impressive results in computer vision, speech recognition, and intelligent robot control. Geohazard experts have gradually become aware of these algorithms' ability to exploit multiple relationships in massive data [23]. As a result, deep neural networks (DNNs), convolutional neural networks (CNNs), and their derivative models have been successfully used for landslide recognition and forecasting [24].
In seismic event prediction, there are some research gaps. The first issue in earthquake forecasting is a lack of data. Second, despite extensive research in the field of earthquake forecasting, forecasting accuracy remains low.
Third, the majority of studies employ machine learning and traditional methods that are highly dependent on the type of dataset and cannot be extrapolated to other earthquake cases.
In our research, we make the following contributions to address these problems. First, we present a dataset of earthquake cases spanning the years 1906 to 2022. Second, for earthquake forecasting, we propose a Conv2D CNN model. Third, we lay the groundwork for future research by highlighting various important parameters for training deep learning models.
The remainder of this paper is organized as follows. Section II presents related works on machine learning and deep learning in earthquake forecasting. Section III describes the data. Section IV presents the materials and methods, including the proposed model. Section V reports the experimental results. Section VI discusses the findings and future research, and Section VII concludes the paper.

II. RELATED WORKS

A. Machine Learning in Earthquake Forecasting
Artificial intelligence techniques have been widely used to predict earthquakes [25][26]. One study [27] examined how a long short-term memory (LSTM) network could use previous seismic occurrences to predict earthquake penetration rates. Several indicators were used to determine whether seismic activity would occur within the next five minutes, including magnitude, depth, time, place, statistics, and entropy factors. Based on a spatial analysis of magnitude dispersion, an automated clustering-based adaptive neuro-fuzzy inference system for earthquake prediction was proposed [28]. However, these techniques struggle to distil useful rules for earthquake prediction (EQP) tasks [29]. Several shallow machine learning approaches, such as [29][30][31][32], have been proposed as solutions to the earthquake prediction problem. Shi et al. [33] were the first to use an artificial neural network in earthquake prediction, and they also discovered a correlation between earthquake magnitude and epicentral intensity.
Subsequently, a model combining support vector regression with a hybrid neural network was created to predict earthquakes [34]. The important indicators in this study were selected using the maximum-relevance, minimum-redundancy criterion. Earthquake predictions were then made using traditional machine learning methods [35]. Another study [36] used a principal component analysis-based random forest to generate new datasets and reduce data dimension in order to generalize preexisting prediction models. The results show that these generalized techniques outperformed existing methods in terms of average accuracy. Nonetheless, variations in geological features limit their applicability.

B. Deep Learning in Earthquake Forecasting
Deep learning techniques are capable of calculating hundreds of complex indicators on their own. As a result, recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have piqued the interest of many earthquake prediction researchers. For example, the authors of [37] developed static-stress-based criteria for forecasting aftershock locations without assuming fault direction. Their work also provided a more accurate method of predicting aftershock locations and of identifying the physical factors that governed earthquake triggering while the earthquake cycle was still active. Because of the dynamic and unpredictable nature of earthquakes, the authors of [38] used long short-term memory (LSTM) networks to investigate the spatiotemporal association between earthquakes at various sites. They were also able to demonstrate the dependability and efficiency of their approach. However, it is difficult for deep learning (DL)-based earthquake prediction (EQP) techniques to produce predictions from historical data because they require a large amount of training data to ensure accuracy [25].
Several machine-learning techniques have been applied to historical earthquakes to predict impending tectonic events based on earthquake waveforms. These techniques include support vector machines, random forests, k-nearest neighbours, and artificial neural networks [39][40]. In this study, we focus on the most powerful RNN methods for prediction on calm and seismic days, such as LSTM models. An artificial neural network was used in one study [35] to identify earthquake precursors using total electron content (TEC) data, while genetic algorithms were used in another study [41].
When machine learning algorithms are used to identify earthquake precursors, the TEC data serve as the learning pattern. In some machine learning-based studies, an irregularity in the TEC data may appear prior to the earthquake. In Indonesia, particularly Sumatera, efforts have been made to identify earthquake precursors using machine learning methods based on N-Model Artificial Neural Networks [42]. According to the authors of [43], QuakeCast is a one-of-a-kind technique that uses global ionosphere TEC data to forecast earthquakes in the short term. Using a conventional logistic regression model and a deep learning ConvLSTM autoencoder, the technique investigates whether signals in an ionosphere layer TEC dataset predict earthquakes.
Without explicitly modelling specific properties, deep learning was able to forecast earthquakes. As a result, more academics are turning their attention to deep learning techniques. The authors proposed a novel ground vibration monitoring strategy for MEMS-sensed data using a deep learning approach [44]. The following study created a network for magnitude estimation using convolutional and recurrent layers [45]. In subsequent research, ConvNetQuake was developed to identify nearby micro-earthquakes based on signal waveforms. They also show how ConvNetQuake works well with different types of seismic data. Lomax et al. used CNN to quickly describe the earthquake's location, magnitude, and other characteristics [46]. S. Mostafa Mousavi investigated CNN-RNN to predict earthquakes quickly by detecting weak signals [47].
The authors of [48] then estimated the likelihood of earthquakes on the Indian subcontinent using a CNN. The author of [49] investigated another CNN-based earthquake damage assessment model. J. A. Bayona presented two well-known seismic models for evaluating seismic hazards [50]. The experimental results indicate that certain implicit features may make it possible to approach the earthquake forecasting problem from a different perspective. Although deep learning techniques can fully exploit the hidden information in earthquake data, they are not theoretically interpretable.

Table I

In this research, building on previous work in this area, we want to characterize calm and earthquake days in the target station zone using total electron content (TEC) values from the ionosphere layer. The primary goal of this research is to predict earthquakes faster.

III. DATA
It was necessary to collect data for the training sample in order to build a mathematical model. At the same time, the model should have continuous access to new data so that it can make predictions within and for a specific time period.
It is worth noting that data from the Institute of Seismology of the Republic of Kazakhstan were available during the hypothesis' development. However, because this data was only uploaded once and there was no integration with seismology institute endpoints, there was no guarantee that it could be supplied continuously. In this regard, it was decided to supplement it with additional data from the United States Geological Survey (USGS) database, which is accessible via API.
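As an illustration of how the USGS data could be requested programmatically, the following minimal sketch builds a query URL for the public USGS event web service. The bounding box around Kazakhstan, the helper name `build_usgs_query`, and the choice of GeoJSON output are assumptions made for this example and are not taken from the study.

```python
from urllib.parse import urlencode

# Public USGS FDSN event endpoint.
USGS_EVENT_API = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_usgs_query(start, end, bbox, min_magnitude=None):
    """Return a USGS event-query URL for a date range and bounding box.

    `bbox` is (min_lat, max_lat, min_lon, max_lon) in decimal degrees.
    """
    min_lat, max_lat, min_lon, max_lon = bbox
    params = {
        "format": "geojson",
        "starttime": start,
        "endtime": end,
        "minlatitude": min_lat,
        "maxlatitude": max_lat,
        "minlongitude": min_lon,
        "maxlongitude": max_lon,
        "orderby": "time-asc",
    }
    if min_magnitude is not None:
        params["minmagnitude"] = min_magnitude
    return f"{USGS_EVENT_API}?{urlencode(params)}"


# Rough, illustrative bounding box for Kazakhstan and its surroundings
# (an assumption, not the region definition used in the paper).
KAZAKHSTAN_BBOX = (40.0, 56.0, 46.0, 88.0)

url = build_usgs_query("1906-01-01", "2022-12-31", KAZAKHSTAN_BBOX)
print(url)
```

The resulting URL can be fetched on a schedule to refresh the dataset, which addresses the need for a continuously available data source.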
A combination of datasets from earthquake.usgs.gov and from the Institute of Seismology of the Republic of Kazakhstan was used. Both sources provide columns such as place, time, magnitude, and depth. After some transformations, the dataset took the form [year, region, rolling aggregations over the retro data (depth and magnitude), binary target], where a target of 1 means there will be a devastating event and 0 means there will not.
The records span the years 1906 to 2022 and contain 2629 unique earthquake events in Kazakhstan and its surroundings. The final shape of the dataset after all aggregation transformations is 1170 observations and 352 parameters.
The majority of these events occurred outside of Kazakhstan; only approximately 300 events fell within its borders. However, we believe that the external events may have had an impact on Kazakhstani territory, even though they were recorded on the territory of neighboring countries (Fig. 1).
To test the hypothesis, it is advisable to divide the map of Kazakhstan into segments and predict the probability of a target event for each segment. The following segmentation methods were proposed:

1) Lithospheric plates
2) Conditional grid that divides the map of Kazakhstan (polygons)
3) Administrative areas

In the approach adopted here, we divide the map of Kazakhstan into regions and aggregate the indicators by region. Since most of the events took place outside of Kazakhstan, many coordinates could not be assigned to an area. Out of approximately 2.5 thousand events, 300 remained (~11%), which is insufficient for building high-quality analytics. Nevertheless, we managed to build a baseline from this amount of data.
We propose assigning shocks that occurred outside the Republic of Kazakhstan, but close to one of its areas, to the nearest area. To form the target events and the training sample, the main groups are specified by the Area and Year parameters, and the following indicators are aggregated:

1) Minimum value in the group for the "Magnitude" and "Depth" parameters
2) Maximum value in the group for the "Magnitude" and "Depth" parameters
3) Median value in the group for the "Magnitude" and "Depth" parameters
4) Average value in the group for the "Magnitude" and "Depth" parameters
5) Standard deviation in the group for the "Magnitude" and "Depth" parameters
6) Number of events in the group

Afterwards, we filled empty standard deviation values with 0. It is important to note that the sample is not continuous over the years: there are gaps between years in which no events occurred or in which events were possibly not recorded. For this reason, the following years are absent from the sample: 1980, 1981, 1982, 1983, 1984, 1985, and 1986.
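The grouping and aggregation described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the record layout (dicts with `area`, `year`, `magnitude`, and `depth` keys), the function name `aggregate_by_area_year`, and the toy values are hypothetical and do not come from the study's dataset.

```python
from collections import defaultdict
from statistics import mean, median, stdev


def aggregate_by_area_year(events):
    """Group events by (area, year) and compute the six listed indicators.

    `events` is an iterable of dicts with keys: area, year, magnitude, depth.
    Returns a dict mapping (area, year) to a dict of aggregated features.
    """
    groups = defaultdict(list)
    for ev in events:
        groups[(ev["area"], ev["year"])].append(ev)

    rows = {}
    for key, evs in sorted(groups.items()):
        row = {"n_events": len(evs)}
        for field in ("magnitude", "depth"):
            values = [ev[field] for ev in evs]
            row[f"{field}_min"] = min(values)
            row[f"{field}_max"] = max(values)
            row[f"{field}_median"] = median(values)
            row[f"{field}_mean"] = mean(values)
            # The sample standard deviation is undefined for a single
            # event, so the empty value is filled with 0.
            row[f"{field}_std"] = stdev(values) if len(values) > 1 else 0.0
        rows[key] = row
    return rows


# Toy records with made-up values, for illustration only.
toy_events = [
    {"area": "A", "year": 2000, "magnitude": 4.0, "depth": 10.0},
    {"area": "A", "year": 2000, "magnitude": 5.0, "depth": 20.0},
    {"area": "B", "year": 2001, "magnitude": 6.0, "depth": 15.0},
]
for key, row in aggregate_by_area_year(toy_events).items():
    print(key, row)
```

Groups with a single event receive a standard deviation of 0, which mirrors the filling step described in the text.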

IV. MATERIALS AND METHODS
In this section, we describe the materials and methods used in this study. We first describe the proposed system architecture and the feature extraction procedure. We then present the proposed earthquake forecasting model. Finally, we give the evaluation parameters used to compare the proposed model with other machine learning models for the given problem.

A. Proposed System Architecture
In this research, we aimed to forecast earthquake magnitude using deep learning techniques. Fig. 3 shows a flowchart of our approach to earthquake prediction. First, we obtain earthquake waveform data and preprocess it. The preprocessing consists of four parts: data cleaning, data integration, data transformation, and data reduction (dimension reduction). After preprocessing, we train a deep learning model for earthquake prediction. The architecture of the proposed deep learning model is presented in Section IV-C. The final stage is the prediction and evaluation of the proposed deep learning model.

B. Feature Extraction
The timeline is one of the key considerations in training an earthquake prediction model. Fig. 4 depicts the timeline for feature generation. We place ourselves at the start of the current year with retrospective information about previous earthquake events (aggregated depth and magnitude parameters), and the prediction target is the maximum magnitude during that year. While smaller earthquakes can and do occur at all depths down to around 700 km, the largest earthquakes occur at shallower depths in the earth's crust. The crust is the earth's outermost layer and ranges in thickness from 7 to 30 km. It contains numerous fault networks that can cause earthquakes and is the planet's coldest and most brittle region. These earthquakes are caused by frictional sliding on faults as a result of tectonic stress accumulation.
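The timeline logic can be illustrated with a short sketch that builds one training row per year from retrospective data only. The window length, the magnitude threshold that defines a "devastating" event, the feature names, and the function name `build_samples` are all assumptions chosen for this example; the study does not state them.

```python
def build_samples(yearly_max, window=3, threshold=6.0):
    """Build (year, features, target) rows from a {year: max magnitude} map.

    Features for year t use only years t-window .. t-1, so no information
    from the predicted year leaks into its features. Years missing from
    the map are treated as years without recorded events.
    """
    samples = []
    for year in sorted(yearly_max):
        history = [
            yearly_max[y] for y in range(year - window, year) if y in yearly_max
        ]
        if not history:
            continue  # no retrospective data available for this year
        features = {
            "roll_max": max(history),
            "roll_mean": sum(history) / len(history),
            "roll_count": len(history),
        }
        # Binary target: 1 if the year's maximum magnitude reaches the
        # (assumed) threshold for a devastating event, else 0.
        target = int(yearly_max[year] >= threshold)
        samples.append((year, features, target))
    return samples


# Toy yearly maxima with made-up values; 2003 is deliberately missing.
toy_yearly_max = {2000: 5.0, 2001: 6.5, 2002: 4.0, 2004: 7.0}
for row in build_samples(toy_yearly_max, window=2):
    print(row)
```

Restricting the features to earlier years is what makes the evaluation honest: the model sees the same information it would have at the start of the year being predicted.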

C. Proposed Model
The proposed deep learning prediction architecture employs a convolutional neural network, as illustrated in Fig. 6. The first stage consists of a Conv2D layer with 128 filters and a kernel size of (3×3), a Rectified Linear Unit (ReLU) activation layer, and a MaxPooling2D layer with a pool size of (3×3). The first stage receives the output of the autoencoder model as its input. The second stage, like the first, employs a Conv2D layer, here with 64 filters. From the third to the eighth stage, only a Conv2D layer with a kernel size of (3×3) and ReLU activation are used.
The ninth stage flattens the output of the eighth stage. Following that, in stages 10 and 11, we use fully connected layers with 50 and 10 neurons, respectively. The proposed regression model covers two scenarios. In the first, we use a single output neuron to estimate the magnitude. In the second, we use three neurons to estimate both magnitude and position.
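To make the stage layout easier to follow, the sketch below lists the stages as plain Python data and traces the tensor shape through them. Several details are assumptions because the text does not give them: the input shape of (81, 81, 1), 64 filters in stages 3 to 8, "same" padding for the convolutions, and non-overlapping (3×3) pooling in stages 1 and 2. It is a reading aid, not the authors' implementation.

```python
def build_stages(n_outputs=1, tail_filters=64):
    """List the described stages as (kind, settings) tuples."""
    stages = [
        ("conv_relu_pool", {"filters": 128, "kernel": 3, "pool": 3}),  # stage 1
        ("conv_relu_pool", {"filters": 64, "kernel": 3, "pool": 3}),   # stage 2
    ]
    # Stages 3-8: Conv2D (3x3) + ReLU only; the filter count is assumed.
    stages += [("conv_relu", {"filters": tail_filters, "kernel": 3})] * 6
    stages += [
        ("flatten", {}),                   # stage 9
        ("dense", {"units": 50}),          # stage 10
        ("dense", {"units": 10}),          # stage 11
        ("dense", {"units": n_outputs}),   # output: 1 or 3 neurons
    ]
    return stages


def output_shapes(input_shape, stages):
    """Trace shapes, assuming 'same' convolutions and non-overlapping pooling."""
    shape = tuple(input_shape)
    shapes = []
    for kind, cfg in stages:
        if kind in ("conv_relu", "conv_relu_pool"):
            h, w, _ = shape
            c = cfg["filters"]  # 'same' padding keeps height and width
            if kind == "conv_relu_pool":
                h, w = h // cfg["pool"], w // cfg["pool"]
            shape = (h, w, c)
        elif kind == "flatten":
            h, w, c = shape
            shape = (h * w * c,)
        elif kind == "dense":
            shape = (cfg["units"],)
        shapes.append(shape)
    return shapes


# Hypothetical input size, chosen so that both pooling steps divide evenly.
for stage, shape in zip(build_stages(), output_shapes((81, 81, 1), build_stages())):
    print(stage[0], shape)
```

Passing `n_outputs=3` to `build_stages` gives the second scenario, in which magnitude and position are estimated together.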
In the proposed Conv2D convolutional neural network approach, each convolutional layer extracts the important feature maps from an image represented as a matrix of pixels. Each feature map of the proposed approach is identified by Equations (1) and (2). The max-pooling layer, described by Equation (3), is subsequently used to downsample the feature maps, which reduces the network dimension and allows the proposed neural network to be deepened. In Equation (3), l and d are the dimensions of the max-pooling window.
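As a concrete illustration of the two operations discussed here, the following sketch computes one convolutional feature map with ReLU activation and then applies max pooling with an l × d window. It uses the common textbook definitions (valid cross-correlation, non-overlapping pooling windows), which are assumptions: the exact formulation used in the study may differ. The input values are made up.

```python
def conv2d_relu(image, kernel, bias=0.0):
    """Valid 2-D cross-correlation followed by ReLU, on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = bias
            for m in range(kh):
                for n in range(kw):
                    s += kernel[m][n] * image[i + m][j + n]
            row.append(max(0.0, s))  # ReLU keeps only positive responses
        out.append(row)
    return out


def max_pool(feature_map, l, d):
    """Non-overlapping max pooling with an l x d window."""
    out = []
    for i in range(0, len(feature_map) - l + 1, l):
        row = []
        for j in range(0, len(feature_map[0]) - d + 1, d):
            window = [
                feature_map[i + m][j + n] for m in range(l) for n in range(d)
            ]
            row.append(max(window))
        out.append(row)
    return out


# Toy 4x4 "image" and 2x2 kernel with made-up values.
toy_image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 2],
    [1, 0, 1, 4],
]
toy_kernel = [[1, 0], [0, -1]]

feature_map = conv2d_relu(toy_image, toy_kernel)
print(feature_map)
print(max_pool(toy_image, 2, 2))
```

Pooling a 4 × 4 input with a 2 × 2 window yields a 2 × 2 output, which shows how the layer reduces the spatial dimension while keeping the strongest response in each window.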

D. Evaluation Method
The prediction outcomes are analyzed using the metrics of accuracy, precision, recall, and F1-score [43][44][45][46]. Accuracy is the proportion of correct predictions made by a model across all classes, calculated by dividing the number of correct predictions by the total number of predictions made. It is especially useful when all of the classes are equally important. The formula for accuracy is shown in Equation (4).

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

Here, TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives. The sum of these four counts gives the total number of cases.
Precision measures how reliable our positive detections are in comparison with the ground truth: of all the events we predicted as positive, how many were actually positive? Formula (5) describes precision [47].

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{5}$$

Recall, or sensitivity, measures how completely our positive predictions cover the ground truth: out of all the positive cases in the ground truth, how many did we predict correctly? Formula (6) describes recall [47].

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{6}$$

The F-measure is the harmonic mean of precision and recall. This statistic decreases as either precision or recall approaches zero. The formula for the F-measure evaluation parameter is shown in Equation (7).

$$F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7}$$
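The four evaluation metrics can be computed directly from the confusion-matrix counts, as in the short sketch below. The function name `classification_metrics` and the example counts are illustrative assumptions; returning 0 when a denominator is zero is a convention chosen here, not something specified in the text.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F-measure from counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall; 0 if both are 0.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up counts for illustration only.
print(classification_metrics(tp=8, tn=80, fp=2, fn=10))
```

With imbalanced classes such as rare devastating events, accuracy can stay high while recall is low, which is why the four metrics are reported together.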

V. EXPERIMENT RESULTS
This section presents the experimental results of the proposed model for earthquake detection and compares them with classical machine learning methods and state-of-the-art models. The confusion matrix for the given problem is shown in Fig. 4. The results show that the proposed deep model has a high rate of correct predictions.

Fig. 7. Earthquake timeline as a feature

Fig. 7 depicts the results of earthquake forecasting for three machine learning algorithms after ten training epochs. The light gradient boosting machine (LightGBM) outperforms the other machine learning methods in terms of accuracy and ROC-AUC. The neural network has the lowest score on every evaluation parameter. Random Forest performs well on some parameters, such as ROC-AUC and recall.

Fig. 9 shows the receiver operating characteristic (ROC) curve of the proposed model for the earthquake forecasting problem after 10 training epochs. The horizontal axis is the false positive rate and the vertical axis is the true positive rate. The area under the curve (AUC-ROC) is greater than 0.5, the level of a random classifier, indicating that the proposed deep model is practically acceptable and feasible in real life. The proposed deep convolutional neural network achieves a higher AUC-ROC value than the other applied models in ten epochs. The obtained results show that the proposed model can be applied to real cases (see Fig. 10).
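For reference, the AUC-ROC value can be computed without plotting the curve, using its equivalent interpretation as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not results from the study.

```python
def roc_auc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted scores with made-up values.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

A model that assigns the same score to every sample obtains 0.5 under this definition, which is the random-classifier baseline against which the reported value is compared.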

VI. DISCUSSION AND FUTURE RESEARCH
As shown in the figures above, the results appear promising, though it is worth noting that the model is not stable due to a lack of training data. The hypothesis is that earthquake events occur in cycles, and that retro data based solely on magnitude and depth predictors can predict the future appearance of destructive earthquakes. Machine learning algorithms are based solely on statistics, and they should not be regarded as magical black boxes. The model has no idea what an earthquake is. One hypothesis can be stated as follows: "In order for computers to understand earthquake concepts, we should pass fundamental features that describe physical concepts of earthquake nature." Many suggestions are aimed at improving the model. For example, we propose testing the hypothesis with new features such as:
- The impact of lithospheric plate movements
Other suggestions include:
- Scaling down the grid
- Making predictions in quarters and months rather than years
- Creating a project with a trigger so that local emergency services can respond

VII. CONCLUSION
The proposed approach demonstrates that the presented deep learning model outperforms approaches using traditional machine learning methods and cutting-edge deep models that employ traditional features. The provided time series-based method has the potential to improve the accuracy of earthquake forecasting. On the applied dataset of earthquakes for the years 1906 to 2022 with magnitudes between 4 and 7, performance increases by 8.5% when using the proposed deep learning architecture with the provided features, indicating that the proposed approach is reasonably successful on fairly large datasets. Recent studies have shown that massive data analytics and machine learning can improve earthquake prediction accuracy. The proposed deep learning model performed well in earthquake forecasting, with accuracy, precision, recall, F-score, and AUC-ROC of 87.4%, 63%, 82.4%, 62.7%, and 83%, respectively. Specifically, the proposed deep learning architecture captured spatial and temporal characteristics, allowing earthquakes to be predicted to some extent.