Data Retrieval Method based on Physical Meaning and its Application for Prediction of Linear Precipitation Zone with Remote Sensing Satellite Data and Open Data

A data retrieval method based on physical meaning is proposed, together with its application to the prediction of linear precipitation zones with remote sensing satellite data and open data. A linear precipitation zone causes extremely severe storm and flood damage, landslides, and so on. A linear precipitation zone forms when four conditions are met: warm, moist air flows in continuously; a force lifts this air, often a collision with mountain slopes or a cold front; the atmospheric conditions are unstable; and the wind aloft blows in a certain direction. These conditions can be monitored with remote sensing satellite data. The proposed method is intended for prediction of the linear precipitation zone for disaster mitigation. Water vapor data, cloud liquid data, cloud fraction data, and upper atmospheric wind data are derived from satellite-based mission instruments. Through an experiment on the linear precipitation zone that occurred in northern Kyushu, Japan at the beginning of July 2020, the possibility of detecting the linear precipitation zone was confirmed. Flood damage and other disasters that occurred in northern Kyushu in the same period, caused by the detected linear precipitation zone, were also detected with Sentinel-1 SAR data.

Keywords: Linear precipitation zone; remote sensing satellite data; water vapor; cloud liquid; upper atmospheric wind; disaster


I. INTRODUCTION
Big Data Analysis (BDA), or data science, is becoming more important for Artificial Intelligence (AI) research and applications. In BDA, storage and management, data collection, data cleaning, data retrieval, data analysis, data visualization, data integration, and data language are important components. Data retrieval is a technology for extracting knowledge by comprehensively applying data analysis techniques such as statistics, pattern recognition, and artificial intelligence to a large amount of data. It often carries the expectation that heuristic knowledge can be acquired that would be difficult to imagine from the usual way of handling data. Data retrieval and analysis are key issues for remote sensing satellite data analysis. Remote sensing satellite data is essentially big data, and it is not easy to retrieve the most appropriate data set for a variety of research purposes.
Data retrieval should not be confused with data extraction: data retrieval is the process of discovering insights within a database, as opposed to extracting data from web pages into databases. The aim of data retrieval is to make predictions and decisions based on the data at hand. IBM SPSS Modeler is a well-known software tool for data retrieval, followed by Oracle data retrieval. Teradata and Kaggle are also becoming more popular. On the other hand, while data retrieval is about sifting through data in search of previously unrecognized patterns, data analysis is about breaking that data down and assessing the impact of those patterns over time. Analytics is about asking specific questions and finding the answers in big data. Conventional data retrieval methods concentrate on the statistical properties of the data without considering its physical meaning.
In this paper, a data retrieval method that uses not only the statistical properties but also the physical meaning of the data concerned is proposed. One application of the proposed method to the retrieval of remote sensing satellite big data is also shown. There are several remote sensing big data platforms that cover a variety of disciplines. For instance, Microsoft and the United States National Oceanic and Atmospheric Administration (NOAA) have a joint R&D agreement to develop the best way to extract data from internal systems. This will allow Microsoft to host weather, water, and ocean data provided by NOAA scientists on the Azure cloud platform. In order to extract the most appropriate data, a retrieval method is required that works not only discipline by discipline and data type by data type, but also physical meaning by physical meaning. Therefore, the relation among discipline, data type, and physical meaning of the data has to be created, and its use in data retrieval has to be established.
In this paper, the remote sensing big data platform provided by NOAA, JAXA's remote sensing data portal, and ESA's remote sensing satellite data portal are used. The most significant issue is how to choose the most appropriate dataset, namely, data retrieval. The proposed method is based on the relation among discipline, data type, and physical meaning. One application is also presented here: prediction of the linear precipitation zone. All the remote sensing satellite data related to the linear precipitation zone are chosen from the platforms based on physical meaning.
In the following section, the research background together with related research and the proposed prediction method is described, followed by the experiment together with experimental results. After that, concluding remarks and some discussions are described.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 10, 2020, www.ijacsa.thesai.org

II. RESEARCH BACKGROUND
As for data retrieval methods, knowledge discovery in databases is overviewed in [1]. The principles of data retrieval, mainly the theoretical background of each data retrieval method, are introduced in [2]. Concepts and techniques of data retrieval are well reported in [3], an encyclopedic book. Practical machine learning tools and techniques for data retrieval are published as tutorials on how to use different techniques and the free tool Weka [4]. Also, "Basics of Data Retrieval" is published [5]; it is a book for beginners that gives a bird's-eye view of the whole field. On the other hand, "Data Retrieval" gives details on association rule extraction [6].
Meanwhile, the mid-latitude linear precipitation zone is classified according to its internal structure as follows: (1) squall line type, (2) back building type, and (3) back and side building type. The linear precipitation zone is a phenomenon in which cumulonimbus clouds that cause heavy rainfall are arranged in a row. It is characterized by a length of 50 to 200 kilometers and a width of 20 to 50 kilometers, and it lasts for hours. Guerrilla rainstorms, which bring local heavy rain, are generated by a single cumulonimbus cloud and typically end within about an hour in a narrow area of about 10 km square. The linear precipitation zone is like a procession of guerrilla rainstorms. This phenomenon has been known since the 1990s, but it has been attracting attention due to a series of damages in recent years. How do linear precipitation zones occur? According to Hiroshi Tsuguchi, a researcher at the Meteorological Research Institute, Japan Meteorological Agency, there are four basic conditions under which they are likely to occur.
First, warm, moist air must flow in continuously. This becomes the "seed" of the cloud, which is an aggregate of fine water particles. Second, a force is required to lift this air up, often a collision with mountain slopes or cold fronts. Third, the atmospheric conditions must be unstable. Warm and moist air forms clouds at a certain height, but cumulonimbus clouds that can reach altitudes of 10,000 meters or more require cold air over thousands of meters and an environment where convection creates a strong updraft. Finally, the wind aloft must blow in a certain direction. If these conditions are met, the generated cumulonimbus is swept away by the wind, and moist air is immediately supplied behind it, creating new cumulonimbus one after another. This repeats like a belt conveyor: a "back-building (backward formation) phenomenon" occurs and cumulonimbus clouds line up in a straight line, causing heavy rainfall directly below.
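The four conditions above act as a joint requirement, which can be sketched as a simple checklist. The following is a minimal illustration only: the class `FormationConditions`, its field names, and the two helper functions are hypothetical names introduced here, and the text does not say how each condition would be decided from observations.

```python
from dataclasses import dataclass, fields


@dataclass
class FormationConditions:
    """Hypothetical yes/no summary of the four conditions at one place and time."""
    moist_inflow: bool       # warm, moist air keeps flowing in
    lifting: bool            # air is forced upward (mountain slope or front)
    unstable_air: bool       # atmosphere is unstable
    steady_upper_wind: bool  # wind aloft keeps a certain direction


def unmet_conditions(state):
    """Return the names of the conditions that are not satisfied."""
    return [f.name for f in fields(state) if not getattr(state, f.name)]


def is_favorable(state):
    """Back-building is treated as favored only when all four conditions hold."""
    return not unmet_conditions(state)


# Illustrative input: everything is in place except a lifting mechanism.
example = FormationConditions(
    moist_inflow=True, lifting=False, unstable_air=True, steady_upper_wind=True
)
missing = unmet_conditions(example)
```

Reporting which condition is missing, rather than a bare yes or no, makes it easier to see which kind of satellite data would need to be examined next.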
Local heavy rainfall has been increasing in the statistics of the past 40 years, and there is a possibility that global warming is affecting it. On the other hand, the occurrence of linear precipitation zones shows a flat trend, with no change in rainfall amount or precipitation range, and it is unclear how it relates to global warming. Even if one tries to predict the occurrence by simulation analysis of observation data, the underlying real-time data is insufficient. Surface data is relatively rich thanks to the Japan Meteorological Agency's AMeDAS (Regional Meteorological Observation System), but cumulonimbus clouds and linear precipitation zones are three-dimensional. The radar and satellite observation networks that probe the sky are low in density and lack sufficient accuracy, making it difficult to grasp the phenomenon three-dimensionally.
A linear precipitation zone forms when warm, moist air flows in continuously; a force lifts this air, often a collision with mountain slopes or cold fronts; the atmospheric conditions are unstable; and the wind aloft blows in a certain direction. These conditions can be monitored with remote sensing satellite data. The proposed method is intended for prediction of the linear precipitation zone for disaster mitigation. Water vapor data, cloud liquid data, cloud fraction data, and upper atmospheric wind data are derived from satellite-based mission instruments.

III. RELATED RESEARCH WORKS
As for disaster mitigation with remote sensing satellite data, a four-dimensional GIS and its application to disaster monitoring with satellite remote sensing data are proposed [7]. Expectations for remote sensing in disaster management are also discussed [8]. GIS and the application of remote sensing to disaster management and monitoring are proposed [9]. The current status of disaster monitoring with satellites in Japan is reported [10]. Expectations for remote sensing technology in disaster management and response are presented [11]. On the other hand, disaster monitoring with ASTER onboard the Terra satellite is attempted [12]. A clearing house for disaster management is proposed [13]. ICT technology for disaster mitigation is proposed together with a tsunami warning system [14]. Meanwhile, cellular automata for traffic modelling and simulation during evacuation from disaster areas are published [15].
A new approach to prediction of the Sidoarjo hot mudflow disaster area based on probabilistic Cellular Automata (CA) is proposed [16]. A sensor network for landslide monitoring with a laser ranging system, which avoids the influence of rainfall on laser ranging by means of time diversity, and satellite imagery database-based landslide disaster relief are proposed and well reported [17]. Visualization of 5D assimilation data for meteorological forecasting and the related disaster mitigation using the VIS5D software tool is proposed [18]. On the other hand, disaster relief with satellite-based synthetic aperture radar data is conducted [19]. Also, Sentinel-1A SAR data analysis for disaster mitigation in Kyushu is reported [20]. Flooding and oil spill disaster relief using Sentinel remote sensing satellite data is well reported [21]. A convolutional neural network considering physical processes and its application to disaster detection are proposed and validated [22].
IV. PROPOSED DATA RETRIEVAL METHOD
NOAA and NASA have the Pathfinder Program. It identified four long time-series data sets from existing archives for reprocessing: the Advanced Very High Resolution Radiometer (AVHRR) data, the TIROS Operational Vertical Sounder (TOVS) data, the Geostationary Operational Environmental Satellite (GOES) data, and the Special Sensor Microwave/Imager (SSM/I) data. The first project processed by the Pathfinder Program, development of the AVHRR Land and Polar data sets, foreshadows challenges in scale and logistics that will continue to be hurdles for EOS-era data streams.
JAXA has G-Portal. It is a portal system that allows users to search (by satellite, sensor, or physical quantity) and download products acquired by JAXA's Earth observation satellites. From these remote sensing satellite data portal sites, it is possible to choose and download the data for a specific purpose if users know the relation among discipline, data type, and physical meaning of the data. The proposed data retrieval method allows appropriate data to be chosen from the portals by using this relation. In other words, the proposed method provides the relation (knowledge). If users specify keywords, then the method provides the appropriate remote sensing satellite data.

V. ONE OF THE APPLICATIONS OF THE PROPOSED DATA RETRIEVAL METHOD FOR LINEAR PRECIPITATION ZONE DETECTION
When the proposed retrieval method is used with the keyword "Line Precipitation Zone in Kyushu, Japan in July 2020", the remote sensing satellite data of MTSAT, AMSR, Rain Radar, and Sentinel-1 SAR and the open data of meteorological maps and topographic maps are provided, because the proposed method holds relation information among discipline, data type, and physical meaning of the data as well as the subject (the keyword).
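The keyword-to-dataset lookup described above can be sketched as a small relation table. This is a minimal illustration under stated assumptions, not the author's implementation: the physical quantities attached to each dataset, the `SUBJECT_NEEDS` table, the extra "Hypothetical ocean color product" entry, and the function `retrieve` are all introduced here for illustration, and the location and date parts of the keyword are ignored.

```python
# Assumed relation table: dataset name -> (data type, physical quantities).
# The quantity labels are illustrative guesses, not taken from the paper.
RELATION = {
    "MTSAT": ("satellite", {"cloud imagery"}),
    "AMSR": ("satellite", {"water vapor", "cloud liquid", "cloud fraction"}),
    "Rain Radar": ("satellite", {"precipitation"}),
    "Sentinel-1 SAR": ("satellite", {"surface backscatter"}),
    "Meteorological map": ("open", {"atmospheric pressure"}),
    "Topographic map": ("open", {"terrain"}),
    # Unrelated entry, included only to show that filtering happens.
    "Hypothetical ocean color product": ("satellite", {"chlorophyll"}),
}

# Assumed subject table: research subject -> physical quantities it needs.
_LPZ_NEEDS = {
    "cloud imagery", "water vapor", "cloud liquid", "cloud fraction",
    "precipitation", "surface backscatter", "atmospheric pressure", "terrain",
}
SUBJECT_NEEDS = {
    "line precipitation zone": _LPZ_NEEDS,
    "linear precipitation zone": _LPZ_NEEDS,
}


def retrieve(keyword):
    """Return dataset names whose physical quantities match the keyword's subject."""
    text = keyword.lower()
    needed = set()
    for subject, quantities in SUBJECT_NEEDS.items():
        if subject in text:
            needed |= quantities
    # Keep table order; a dataset qualifies if it supplies any needed quantity.
    return [name for name, (_kind, q) in RELATION.items() if q & needed]


result = retrieve("Line Precipitation Zone in Kyushu, Japan in July 2020")
```

With this assumed table, `result` contains the six datasets named in the text and omits the unrelated ocean color entry. A fuller version would also use the place and period in the keyword to select scenes.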

A. Linear Precipitation Zone
Due to warm and moist air flowing toward the low pressure system and the rainy season front from the night of July 3, 2020, severe local rain fell in the Satsuma and Osumi regions of Kagoshima Prefecture from the night of July 3 to the morning of July 4, 2020, and in the southern part of Kumamoto Prefecture during the morning of July 4, 2020.
Due to the severe local rain, the Japan Meteorological Agency announced a heavy rain special warning for Kumamoto and Kagoshima prefectures at 4:50 on July 4, 2020. It is possible that a linear precipitation zone with developed rain clouds formed in these areas. Fig. 1 shows MTSAT (geostationary meteorological satellite) images of the Japanese vicinity, weather (atmospheric pressure) maps of the Japanese vicinity, and rain cloud radar images of central Kyushu, acquired from June 28 to July 12, 2020. From the evening of July 5 to the morning of July 6, 2020, it rained locally in Satsuma and Osumi, Kagoshima Prefecture, with record heavy rain in Kanoya City.
From July 6 to 8, 2020, due to the stagnant front, it rained locally in Nagasaki Prefecture, Saga Prefecture, the Chikugo region of Fukuoka Prefecture, Oita Prefecture, and northern Kumamoto Prefecture. A heavy rain special warning was in effect in Nagasaki, Saga, and Fukuoka prefectures until 11:40 on July 7, 2020. On July 8, 2020, there was heavy rain from the Tokai region to the Koshin region; a heavy rain special warning was announced for Gifu Prefecture at 6:30 and for Nagano Prefecture at 6:43 on the same day. The Kuma River system, which flows through Kumamoto Prefecture, overflowed or breached at 13 locations in Yatsushiro City, Ashikita Town, Kuma Village, Hitoyoshi City, and Sagara Village, and about 1060 hectares were flooded. At Senjuen, a nursing home for the elderly in Kuma Village, 14 people died when the facility was submerged. According to the Geographical Survey Institute's inundation map, the inundation depth in the Kuma-murato district, where Senjuen is located, reached a maximum of 9 meters. In Hitoyoshi City, a wide area of the city was flooded, and in the center of Sakamoto Town, Yatsushiro City, there was heavy damage such as driftwood, earth, and sand flowing into houses. In addition, Ashikita (deaths from a landslide in the Tagawa district, flooding at Sashiki Station) and Tsunagi-cho (deaths from a landslide in the Fukuhama district) were also damaged.
In Omuta City, Fukuoka Prefecture, 252 mm of "rainfall that we have never experienced" was observed in the 3 hours from 3:00 pm on July 6, 2020. A large amount of water flowed through the Suwa River (Sekigawa), but no flooding occurred. However, the amount of rainfall exceeded the capacity of Mikawa Pumping Station (Mikawa-cho, Saitama City), causing inland water flooding. Road flooding also occurred in Arao City, Kumamoto Prefecture. The Chikugo River flooded in parts of Oita Prefecture in its upper reaches and Fukuoka Prefecture in its middle reaches. Flood damage occurred in Kurume City, Fukuoka Prefecture, and inland flooding damage occurred in Omuta City, Fukuoka Prefecture.
In addition, the Oita River in Yufu City, Oita Prefecture, overflowed in Shonai Town Higashichoho (around Onoya Station) and Hazama Town Shitaichi (around Tenjinbashi), and tributaries of the Oita River (Hanaino River, Kurokawa, etc.) flooded in various parts of the city. Because floods, debris flows, and sediment disasters occurred frequently, information on the occurrence of disasters was issued. On the Oita River, the water levels at the Dojiri Observatory in Yufu City and the Fuuchi Ohashi Observatory in Oita City, which are directly controlled by the national government, were the highest ever recorded.
The linear precipitation zone is defined as "a rain area with strong precipitation, extending linearly for about 50-300 km with a width of about 20-50 km, created when an organized group of cumulonimbus clouds, formed by rows of developed rain clouds (cumulonimbus) that occur one after another, passes over or stagnates at almost the same place for several hours" (a forecast term used by the Meteorological Agency in weather forecasts, etc.). It was pointed out in the 1990s that linear precipitation areas are often seen in Japan when heavy rainfall occurs. Tsuguchi and Kato (2014) of the Meteorological Research Institute [23] objectively extracted the cases of heavy rainfall in Japan from April to November of 1995-2009 and statistically analyzed the shape of the precipitation area. As a result, it was revealed that a linear precipitation zone occurred in about two-thirds of the cases, excluding typhoons.
The linear precipitation zone is in substance an aggregate of multiple cumulonimbus clouds and is considered to be a kind of mesoscale convective system. There are also cases with a hierarchical structure of "linear precipitation zone - cumulonimbus group - cumulonimbus".
The linear precipitation zone appears to be a cause of localized heavy rain. According to an analysis of radar observations by the Meteorological Research Institute of the Japan Meteorological Agency, about 64% (168) of the 261 heavy rains other than typhoons that occurred between 1995 and 2009 were attributed to the linear precipitation zone. It occurs all over Japan and is common in Kyushu and Shikoku. Although the generation mechanism has not been fully clarified, four conditions under which it is likely to occur are mentioned: "inflow of warm and moist air that is the source of clouds", "rising of the air due to collision with mountains and cold fronts", "unstable atmospheric conditions in which cumulonimbus clouds easily form", and "wind in a certain direction that carries the cumulonimbus clouds along".
In the heavy rains in the northern part of Kyushu, a linear precipitation zone developed to an altitude of 18,000 meters around Asakura City in Fukuoka Prefecture and Hita City in Oita Prefecture. In Asakura City, the hourly rainfall exceeded 100 mm, and the 24-hour rainfall reached 545.5 mm, the highest ever, as the heavy rain continued.
In recent years, heavy rainfall due to the linear precipitation zone has been occurring in various places. When Tsuguchi et al. analyzed radar observation data for 261 heavy rains other than typhoons that occurred between 1995 and 2009, 168 cases, about two-thirds, were due to the linear precipitation zone.
It often leads to large-scale disasters. The 1982 heavy rainfall in Nagasaki recorded the highest hourly rainfall in Japan's observation history, 187 mm; the Tokai heavy rain in 2000 flooded about 70,000 houses; and the Hiroshima heavy rain in 2014 and the heavy rains in Kanto and Tohoku in 2015 followed. The linear precipitation zone is also believed to be the cause of the 1957 Isahaya heavy rain, which resulted in 722 dead and missing.

B. Phenomena of the Linear Precipitation Zone
One of the causes of the heavy rain this time is that the rainy season front stagnated from western Japan to eastern Japan for a long time. This is probably because the sea surface temperature of the Indian Ocean was higher than normal. Because of the high sea surface temperature in the Indian Ocean, updrafts occur more easily there, and the risen air tends to descend over the ocean east of the Philippines. For this reason, the Pacific High appears to have extended toward the southwest rather than the north, making it difficult for the Baiu front to move north and easier for it to stay over the Japanese archipelago.
The other cause is that a large amount of warm and moist air, the source of rain clouds, flowed in continuously. This is probably due to the meandering of the westerlies. On the west side of Japan (near the Yellow Sea), the westerlies meandered to the south, forming a trough where the air pressure drops, which made it easy for moist air around the Pacific High to flow into the Japanese archipelago.
Therefore, not only rain cloud radar images but also water vapor, cloud fraction, and cloud liquid data have to be investigated. Fig. 2 shows rain cloud radar images of Kumamoto, Kyushu, Japan acquired on July 6, 2020 (the most severe rainfall). A linear precipitation zone appears over northern Kyushu, Japan. Heavy rain continued for more than 10 hours on July 6, 2020.
The basic idea of the proposed method is based on the satellite-based water vapor, cloud fraction, and cloud liquid data derived from the mission instrument data of a microwave radiometer (Advanced Microwave Scanning Radiometer: AMSR). The moisture supply to the Baiu front is estimated from the water vapor data, the cloud amount from the cloud fraction data, and the cloud liquid amount from the cloud water content data. Then a cause of the linear precipitation zone can be estimated. Figs. 3, 4, and 5 show the weekly averages of water vapor, cloud water content, and cloud fraction, respectively, of the Japanese vicinity derived from AMSR from June 26 to July 11, 2020. From the southwest (South China Sea), there is a strong moisture supply (more than 60 mm) to the Baiu front situated in the Japanese vicinity, as shown in Fig. 3(b) and (c). As shown in Fig. 6, the Baiu front is situated along the Japanese islands. Therefore, the southern part of the Baiu front had severe heavy rain, as shown in Fig. 1(i), for instance. Just before the linear precipitation zone was formed (June 26 to July 3, 2020), there was a strong moisture supply. This is one of the causes of the severe heavy rainfall in the Japanese vicinity from July 4 to July 11, 2020.

C. Causes of the Linear Precipitation Zone
On the other hand, cloud water content was more than 1000 g/m² from July 4 to July 11, 2020, as shown in Fig. 4(a). This is another cause of the heavy rainfall. The cloud patterns correspond to those of the cloud fraction in Fig. 5. The continuous heavy moisture supply and high cloud water content for more than two weeks (from June 26 to July 11) are causes of the linear precipitation zone and the severe heavy rainfall during the period.
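The two indicators discussed above, water vapor above 60 mm and cloud water content above 1000 g/m², can be combined into a simple grid mask. The sketch below is only an illustration: the text treats the two quantities as separate causes and does not state a combined rule, so requiring both at once is an assumption made here, and the function names and the small grids are hypothetical toy values rather than AMSR data.

```python
WATER_VAPOR_MM = 60.0      # "more than 60 mm" moisture supply, from the text
CLOUD_WATER_G_M2 = 1000.0  # "more than 1000 g/m^2" cloud water content, from the text


def flag_cells(water_vapor, cloud_water):
    """Return a grid of booleans, True where both quantities exceed their thresholds."""
    return [
        [wv > WATER_VAPOR_MM and cw > CLOUD_WATER_G_M2
         for wv, cw in zip(wv_row, cw_row)]
        for wv_row, cw_row in zip(water_vapor, cloud_water)
    ]


def flagged_fraction(mask):
    """Fraction of grid cells that are flagged."""
    cells = [cell for row in mask for cell in row]
    return sum(cells) / len(cells)


# Toy 2x2 grids (hypothetical values): water vapor in mm, cloud water in g/m^2.
water_vapor = [[45.0, 62.0], [65.0, 70.0]]
cloud_water = [[300.0, 1200.0], [900.0, 1500.0]]

mask = flag_cells(water_vapor, cloud_water)
fraction = flagged_fraction(mask)
```

Here only the right-hand column exceeds both thresholds, so half of the toy grid is flagged. Applied to weekly averages, such a mask would show where both conditions coincide.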
The rainy season front was stagnant, and local heavy rain continued on July 7 in northern Kyushu. The Chikugo River, a first-class river that flows across Oita and Fukuoka prefectures, flooded, and major flooding also occurred in Omuta City.
When the situation in Omuta City was checked from the air at around 8:00 am on July 7, the entire area of Kamiyashiki-cho, Omuta City had reportedly already been flooded over a wide area. Fire engines and multiple passenger cars were stuck in water that had risen to the waist level of an adult.

D. Flood Area Detection for the Linear Precipitation Zone
Flood areas due to the linear precipitation zone that occurred from July 4 to July 10, 2020 were detected in northern Kyushu, Japan with Sentinel-1 SAR data. Fig. 7 shows the detected flood areas. Fig. 7(a) shows Sentinel-1 SAR imagery of Omuta and its surroundings acquired on July 4, 2020, while Fig. 7(b) shows that acquired on July 10, 2020. By comparing the two, the flood areas can be found. The intensity of the SAR imagery indicates the backscattering cross section of the ground surface. Areas that are darker in the July 10 image than in the July 4 image imply flood areas. In particular, there are relatively large flood areas in the southern portion of Miyama City.

It is found that the proposed retrieval method allows retrieval of the appropriate remote sensing satellite data with a keyword giving the research purpose, location, and time information. One application of the proposed method is the following example: (1) keyword: "Line Precipitation Zone in Kyushu, Japan in July 2020"; (2) retrieval result: the remote sensing satellite data of MTSAT, AMSR, Rain Radar, and Sentinel-1 SAR and the open data of meteorological maps and topographic maps. This is because the proposed method holds the relation between the research purpose (keyword) and the information on discipline, data type, and physical meaning of the data.
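Returning to the flood-area comparison of Fig. 7, the idea that flooded ground appears darker in the later SAR image can be sketched as a per-pixel difference. This is a minimal sketch under stated assumptions: backscatter is assumed to be given in decibels, the 3 dB drop threshold and the function name `flood_mask` are hypothetical choices made here, and the pixel values are toy numbers rather than Sentinel-1 data.

```python
def flood_mask(before_db, after_db, drop_db=3.0):
    """Flag pixels whose backscatter fell by at least drop_db between two dates.

    before_db, after_db: equally sized 2D lists of backscatter in dB.
    drop_db: assumed threshold; a real study would tune it for the scene.
    """
    return [
        [(b - a) >= drop_db for b, a in zip(b_row, a_row)]
        for b_row, a_row in zip(before_db, after_db)
    ]


# Toy 2x2 scene (hypothetical values in dB): earlier date, then later date.
before = [[-8.0, -9.0], [-7.5, -20.0]]
after = [[-8.5, -18.0], [-16.0, -20.5]]

mask = flood_mask(before, after)
```

In the toy scene the two pixels that darken strongly are flagged, while the pixel that is dark on both dates (for example permanent water) is not. That is the reason for comparing two dates instead of thresholding a single image.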

VI. CONCLUSION
Prediction of the linear precipitation zone with remote sensing satellite data is possible with the proposed method. The linear precipitation zone causes extremely severe storm and flood damage, landslides, and so on. A linear precipitation zone forms when warm, moist air flows in continuously; a force lifts this air, often a collision with mountain slopes or cold fronts; the atmospheric conditions are unstable; and the wind aloft blows in a certain direction. These conditions can be monitored with remote sensing satellite data.
The proposed method is intended for prediction of the linear precipitation zone for disaster mitigation. Water vapor data, cloud liquid data, cloud fraction data, and upper atmospheric wind data are derived from satellite-based mission instruments. Through an experiment on the linear precipitation zone that occurred in northern Kyushu, Japan at the beginning of July 2020, the possibility of detecting the linear precipitation zone was confirmed. Flood damage and other disasters that occurred in northern Kyushu in the same period, caused by the detected linear precipitation zone, were also detected with Sentinel-1 SAR data.
Currently, the relation information held by the proposed method covers only the research purposes and the related remote sensing satellite data treated here. This is a limitation of the proposed method.

VII. FUTURE RESEARCH WORKS
Further experimental studies are required to validate the proposed method. The applicability of the proposed method must also be confirmed through further experiments. Furthermore, the relation information must be expanded to other research fields, so that the proposed method can be applied to them.