Air Quality Prediction (PM2.5 and PM10) at the Upper Hunter Town-Muswellbrook using the Long-Short-Term Memory Method

Air quality is crucial for the environment and for citizens' quality of life. In the present study, a software application is therefore developed to predict air quality on the basis of particulate matter 2.5 (PM2.5) and particulate matter 10 (PM10) in Upper Hunter, Australia, which is considered to be one of the cities with the lowest air quality levels worldwide. For this purpose, the long short-term memory (LSTM) method is applied to data collected by the NSW Department of Planning, Industry and Environment from 30 September 2012 to 30 September 2019, in order to predict the behavior of the mentioned particulate matter during October 2019. The average and maximum values predicted by the software are compared with the actual values, and the predicted results are shown to be quite close to reality. Finally, the results obtained in this study may serve as a basis for local authorities to proceed with the necessary protocols and measures in case an alarming prediction occurs.
Keywords: Air quality; long short-term memory (LSTM); particulate matter 2.5 (PM2.5); particulate matter 10 (PM10)


I. INTRODUCTION
The effects of air pollution on health have been a focus of study in past decades [1]. Since approximately the late 1980s, epidemiological studies have shown a relationship between air pollution levels and cardiovascular mortality, as well as hospital admissions and emergency room visits [2], in both developed and developing countries [3]; this has led to the recognition of air pollution as an influential and modifiable determinant of cardiovascular disease in urban communities [4]. Furthermore, according to the World Health Organization, ambient air pollution was estimated, as of 2018, to be responsible for about 4.2 million premature deaths worldwide annually [5]. This is why more citizens now recognize the importance of air quality to their health [6]: it affects not only the population's quality of life, but also its productivity, or school absenteeism in the case of young people, and therefore the nation's GDP. As Xiang et al. [7], quoted in [6], pointed out, high-resolution air quality data in the urban context are essential for the management of cities. Therefore, the present study aims to propose software that predicts air quality from data previously collected by the NSW Department of Planning, Industry and Environment [8].
For this purpose, Long Short Term Memory (LSTM), a particular kind of Recurrent Neural Network (RNN) [9], will be used, since its effectiveness in air quality prediction has been demonstrated in various studies such as [9], [10]. Unlike a standard RNN, it is also capable of learning long-term dependencies [11], [12], which it achieves by adding a memory cell to the hidden layer to control the memory information of the time series data [13]. LSTM takes the form of a chain of repeating blocks for learning time series information, each with three basic gates: the input gate, the output gate, and the forget gate [14], [15]. Its steps are explained further in the next section.
The present study relies on data collected in Upper Hunter, Australia, since it is recognized as one of the cities most negatively affected in terms of air quality; predicting the behavior of air particles is therefore one of its most urgent needs in addressing this problem. Air pollution in Australia has been estimated to be responsible for more deaths than road accidents [16]; likewise, Australia is considered one of the countries with the highest levels of asthma in the world due to this air quality [17], [18]. As a result, residents of Muswellbrook, Upper Hunter, have recently complained publicly that air pollution has become part of their daily lives, worsening every day and affecting public health [19]-[21].
Ultimately, the purpose of the study is to propose software capable of predicting air quality through the behavior of PM2.5 and PM10 over a period of 30 days. After that period, the predictions are compared with the actual air quality levels to evaluate the accuracy of the software.
The remainder of this paper is organized as follows. Section II presents the methodology and the steps it involves. Section III describes the case study to which the research was applied and explains the object of study. Section IV, Results and Discussion, shows the processed data in graphs comparing the predicted values with the actual measured values. Finally, Section V gives the conclusions, indicating the advantages of the method and proposals for future improvements.
www.ijacsa.thesai.org

II. METHODOLOGY
Long Short Term Memory, usually known as LSTM, was first introduced in 1997 by Hochreiter and Schmidhuber [22]. Its main function is to remember information for long periods of time [11], using its internal memory to process sequences of inputs by recording old and current data [23]. One of its advantages is that, for problems with long time lags, LSTM can handle noise, distributed representations and continuous values, as required in the present study. Compared to finite-state automata [24] or hidden Markov models [25], LSTM does not demand the prior selection of a limited set of states; in principle, it can deal with unlimited state numbers. Furthermore, as opposed to conventional methods, LSTM can quickly learn to distinguish between two or more widely separated occurrences of a specific element in an input sequence, without relying on appropriate short-time-lag training examples [22].
The following steps will be carried out [11]:
Step 1: At this stage, it is decided which information will be removed from the cell state. This decision is made by a sigmoid layer ($\sigma$) called the "forget gate layer", as shown in Fig. 1 [11]. This gate outputs values between 0 and 1, where 1 represents keeping the value and 0 represents removing it completely. In essence, this means that the value to be predicted is linked to the previous values.
Step 2: At this stage, it is decided which new information will be stored in the cell state. This is divided into two parts. First, a sigmoid layer called the "input gate layer" decides which values will be updated. Then, a tanh layer generates a vector of new candidate values that could be added to the cell state. Finally, both parts are combined to update the cell state. Fig. 2 [11] gives further details.
Step 3: At this stage the old cell state, $C_{t-1}$, is updated to the new cell state, $C_t$. The previous steps have already decided how to proceed; all that remains is to execute them. The old state is multiplied by $f_t$, the output of the forget gate, which discards the information previously marked to be forgotten. Next, $i_t * \tilde{C}_t$ is added, where $i_t$ is the output of the input gate and $\tilde{C}_t$ is the vector of new candidate values; the candidates are thus scaled by how much each state value was decided to be updated. The result is $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$. The representation can be seen in Fig. 3 [11].
Step 4: The final step is to decide what to output. The output is based on the cell state, but it is a filtered version of it. To do this, a sigmoid layer decides which parts of the cell state will be output. Next, the cell state is passed through tanh, which pushes the values to be between -1 and 1, and multiplied by the output of the sigmoid gate, so that only the selected parts are output. For a better understanding, see Fig. 4 [11].
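Steps 1 to 4 can be illustrated with a single LSTM time step written in NumPy. This is a minimal sketch, not the code used in the study: the function names (`sigmoid`, `lstm_step`), the stacking of the four gates into one weight matrix `W`, the hidden size, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function; squashes values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """Run one LSTM time step following Steps 1-4.

    W maps the concatenation [h_prev, x_t] to the four gate pre-activations,
    stacked as (forget, input, candidate, output); b is the matching bias.
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0:n])                # Step 1: forget gate, values in (0, 1)
    i_t = sigmoid(z[n:2 * n])            # Step 2: input gate
    c_tilde = np.tanh(z[2 * n:3 * n])    # Step 2: candidate values in (-1, 1)
    c_t = f_t * c_prev + i_t * c_tilde   # Step 3: update the cell state
    o_t = sigmoid(z[3 * n:4 * n])        # Step 4: output gate
    h_t = o_t * np.tanh(c_t)             # Step 4: filtered cell state
    return h_t, c_t


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
hidden, features = 4, 1
W = rng.normal(scale=0.1, size=(4 * hidden, hidden + features))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for value in [0.2, 0.5, 0.1]:            # a toy, already-scaled series
    h, c = lstm_step(np.array([value]), h, c, W, b)
```

In a library such as Keras, these operations are carried out internally by the LSTM layer and the weights are learned during training rather than drawn at random.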

III. CASE STUDY
The present study focuses on analyzing the data recorded by the NSW Department of Planning, Industry and Environment [8] regarding PM2.5 and PM10 in Upper Hunter from 30 September 2012 to 30 September 2019, with the aim of creating software capable of predicting the behavior of this particulate matter through October 2019. The particles in this study were classified into two groups: particulate matter with a diameter of up to 2.5 µm (solid dust particles and soot, among others), and metal particles whose diameter varies between 2.5 and 10 µm [26]. Both groups are very fine airborne particles measured in micrometers [27]; as a reference, a human hair is about 100 micrometers thick [19].
Nevertheless, there is a distinction between PM2.5 and PM10, as the first group is a stronger threat to public welfare than the second [28]. Studies confirm that, because of their small size, these particles are more likely to penetrate the respiratory system and deposit in the alveoli of the lungs, with the possibility of reaching the bloodstream [29]. This makes PM2.5 one of the main health hazards in large cities around the world [30]. Therefore, according to [31], several studies on the relationship between PM2.5 and the mortality rate have been promoted, such as the one carried out in the United States by [32]. However, this does not imply that PM10 is harmless to health, since it also greatly affects the eyes, nose and throat of citizens who are exposed to high levels of the air and/or dust in which these particles are transported [33].
Regarding the data obtained, approximately 5000 records were collected, from which the days that did not report values were removed. Unusually high values were retained, as were low values, exactly as originally measured; thus, no further data processing was carried out.
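The cleaning rule described above (drop days without a reported value, leave everything else untouched) can be sketched as follows. The record layout, the function name `drop_missing_days`, and the sample readings are hypothetical; the source does not describe the file format of the monitoring data.

```python
import math


def drop_missing_days(records):
    """Keep only (date, value) pairs whose value was actually reported.

    Reported values are left untouched: no smoothing, clipping,
    or outlier removal is applied.
    """
    return [
        (day, value)
        for day, value in records
        if value is not None and not math.isnan(value)
    ]


# Made-up readings in µg/m³, for illustration only.
raw = [
    ("2019-09-27", 18.4),
    ("2019-09-28", None),          # station reported nothing
    ("2019-09-29", float("nan")),  # gap encoded as NaN
    ("2019-09-30", 131.0),         # unusually high, but kept as measured
]
clean = drop_missing_days(raw)
```

Keeping extreme readings means the model sees pollution peaks as they were recorded, at the cost of also learning from any sensor errors present in the raw data.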
The raw data were used to train a neural network. To do this, the annual trend of the air quality parameters was analyzed. Following the steps described in the methodology, each value entering through the forget gate layer is processed with a look-back of 60; that is, for each new value, the previous 60 data points are used as reference. This process creates a new value that depends on the previous data and is in turn used as input for calculating the next value, and so on successively until all the existing data have been used.
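The 60-value look-back and the reuse of each new value as input for the next one can be sketched as below. This is an illustration under assumptions: the helper names `make_windows` and `forecast` are hypothetical, and the mean of the window stands in for a trained LSTM, which is what would supply `predict_next` in practice.

```python
import numpy as np

LOOKBACK = 60  # number of previous values used for each prediction


def make_windows(series, lookback=LOOKBACK):
    """Split a 1-D series into (samples, lookback, 1) inputs and next-value targets."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X[..., np.newaxis], y


def forecast(history, predict_next, steps, lookback=LOOKBACK):
    """Predict `steps` values ahead, feeding each prediction back as input."""
    window = list(np.asarray(history, dtype=float)[-lookback:])
    predictions = []
    for _ in range(steps):
        nxt = float(predict_next(np.array(window[-lookback:])))
        predictions.append(nxt)
        window.append(nxt)  # the new value becomes part of the next window
    return predictions


# Stand-in predictor (mean of the window); a trained LSTM would be used instead.
series = np.arange(100, dtype=float)
X, y = make_windows(series)
next_month = forecast(series, predict_next=np.mean, steps=30)
```

Because every forecast is built on earlier forecasts, errors can accumulate over the 30-day horizon, which is one reason to compare the predictions with the measured values afterwards.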

IV. RESULTS AND DISCUSSION
The results are presented in Fig. 5, 6, 7 and 8.
The comparison between the actual and predicted average and maximum values of PM10 for October 2019 is shown in Fig. 5 and Fig. 6, respectively.
Similarly, the actual and predicted average and maximum values of PM2.5 for October 2019 can be observed in Fig. 7 and Fig. 8, respectively.
These results were obtained from a model implemented with neural network tools such as Keras [34] and TensorFlow [35]. In this model, 130 iterations were used to relate the data for every 60 values analyzed. However, due to the complexity of the LSTM equations, each training cycle of 130 iterations can require 2 to 3 hours without the assistance of a powerful GPU. In the present case, each prediction took about 45 minutes.
It should be noted that the monitoring data were used as found (raw data), i.e., no treatment was applied even where values were illogical. The predictions in Fig. 5, 6 and 7 captured the trend almost exactly, since very noticeable changes in air quality can be observed during the 30 days; in Fig. 8, by contrast, the variation during the first days was small, which the software treated as an almost constant trend.
The predicted values were found to be quite close to the actual behaviour of the particulate matter. The ability to predict the trend of the variation in air quality is therefore confirmed for both the average and maximum values of PM2.5 and PM10; the more abrupt the variations in air quality, the more accurately the prediction captures these changes.
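The closeness between predicted and actual values is assessed here by visual comparison of the figures. It could also be expressed numerically; the sketch below assumes mean absolute error (MAE) and root mean squared error (RMSE) as metrics, which are not reported in this study, and uses made-up numbers rather than the measured data.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but penalizes large individual errors more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up daily PM10 values in µg/m³, for illustration only.
actual = [20.0, 35.0, 50.0, 30.0]
predicted = [22.0, 33.0, 46.0, 30.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```

Reporting both values for each of the four series (average and maximum PM10, average and maximum PM2.5) would make the accuracy comparable across pollutants and across future versions of the model.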

V. CONCLUSION
Air quality prediction is of great importance for environmental protection. Considering the multivariate data (PM2.5 and PM10 values) collected by the Upper Hunter station from 2012 to 2019, it was possible to verify that the LSTM method is valid for predicting the future behavior of these parameters, which allows protocols or procedures to be developed in case of an alarming prediction.
The present work demonstrates that computer code for predicting air quality can be developed from a raw database. The LSTM-based prediction model makes good use of the time sequence of air quality information and, at the same time, allows prediction accuracy to be improved. Its limitation, however, is that a large amount of historical monitoring data is required to train the prediction models. In addition, the training time is long and depends on the desired prediction quality.
Similarly, the LSTM method can be applied to different needs, as presented in previous research. It can also be used as a management tool for evaluation projects or prevention measures.