Risk Diffusion Modeling and Vulnerability Quantification on Japanese Human Mobility Network from Complex Network Analysis Point of View

The human mobility networks are vital infrastructure in recent social systems. Many efforts have been made to keep the healthy human mobility flows to maintain sustainable development of recent well-connected society. However, the inter-connectivity sometimes raises unintended diffusion and amplification of the intrinsic risks, which is difficult to forecast because of the complexity of the underlying networks. Therefore, it is believed that modeling and simulation of the risk diffusion in the human mobility networks are suggestive and meaningful. Also, recent improvement of usability of individuallevel human mobility data and capabilities of high-performance computing technologies enable us to employ the data-driven approaches. In this paper, the risk diffusion dynamics is modeled based on the SIS epidemic model and the vulnerability index is defined to quantify the node-level easiness of suffering risks. We also conduct the link removal test to find the better risk mitigation methods. Keywords—Human mobility network; Risk analysis; Complex network analysis; SIS model


INTRODUCTION
Analyzing the human mobility networks is essential for making economically efficient and socially resilient social systems.It can be applied to various fields such as urban planning, public policy, and epidemic control.Maintaining the healthy flows on human mobility networks is also vital for sustaining modern society.At the same time, the complex connectivity and mobility in our society bring intrinsic risks.For example, the outbreak of Ebola virus disease from 2014 in West Africa has spread through the global human mobility networks to five other countries in the region and a few cases were found even outside the continent including several countries in Europe and the U.S. Because of the various factors, such as increment of the number of travelers, longer distances of their trips, diversification of the means of transportations, urbanization, and uneven distribution of population, the analysis of the actual dynamics on the human mobility networks becomes more difficult.Therefore, modeling and simulation of the risk diffusion on the human mobility networks are the practical options and meaningful approaches.Also, recent improvement of usability of individual-level human mobility data renders researchers' interests toward the data-driven approach, which is also promoted by the dramatical improvement of capabilities of high-performance computing technologies.Recent efforts to open the governmental statistical data to public enable us to utilize highly credible and authorized dataset.
So far, there exist a large number of previous works which analyze regional human mobility networks based on real data.For example, about ground transportation networks such as railway network and subway network, Latora and Marchiori (2002) analyze the subway network in Boston [1], Sen et. al. (2003) studied Indian railway network [2], Sienkiewicz et al. (200) analyze the public transportation networks in Poland [3], De Montis et al. (2007) examined the inter-urban commuting network among the 37 municipalities in the Sardinia region, Italy [4], Li et al. (2007) investigated Chinese railway network [5], and Soh et al. (2010) studied the bus and railway network in Singapore [6].About the airline networks, many researchers have analyzed the world airline network (WAN) from the complex network analysis point of view [7][8][9][10] as well as for regional or domestic airport networks [11][12][13].The analysis of the human mobility networks show several commonly shared characteristics.For example, as shown in most of the complex network analysis results, the weight distribution as well as degree distribution in large-scale networks tends to show scalefree property.Also, some studies reported that the networks show small world property, such as in the studies on airport network [7,9].
In this paper, the analytical results of the risk diffusion dynamics on Japanese domestic human mobility network (JDHMN), Japanese airline network (JDAN), and world airline network (WAN) using the real human mobility data are stated.Our analysis can be applied to various fields such as diffusion of disease, rumors, and political disturbance which are spread along with the human mobility.The human mobility networks are often applied to the researches of meta-population epidemic diffusion model in which the human mobility networks, especially WAN, are utilized as paths connecting with the areas where the epidemic happens [14][15][16][17][18][19][20][21][22][23].Moreover, K.J. Mizgier et al. (2013) recently applied the probabilistic approach to model the diffusion of the influence caused by the disruption in the supply chain and conducted the vulnerability assessment in the supply chain analysis [24].
In this paper, the link removal test in JDHMN, JDAN, and WAN for efficient control of risk diffusion are implemented, which imitates restricting the human flows on certain routes instead of restricting traffics at certain nodes.This approach www.ijacsa.thesai.orgcan assess the influences to the risk diffusion dynamics when shutting some routes in the human mobility networks.There exist some related works about the link removal tests.Marcelino et al. (2012) evaluate the effect of the link removal in the top 500 international airports network when they simulate the SEIR meta-population epidemic model using the actual data of the outbreaks of influenza A (H1N1) in 2009 [2].Hossain et al. (2013) investigate the resilience of the Australian Airports Network from a complex network analysis point of view [13].They examined the resilience measures including the size of the largest component and reachability when closing the links following the certain ranking criteria of the links.Also, Verma et al. (2014) investigated the connectivity in the WAN as removing links following the ranking based on their weight.They found that the WAN is resilient when the links having large weight are cutting down since these links tend to connect between high degree nodes and there are many alternative paths between them.Conversely, they found that the closing the links having low weight damages the connectivity of the network [26].
In the rest of this paper, the chapter II introduces the datasets for the JDHMN, JDAN, and WAN.Also, the networks created from the datasets are analyzed from the stand point of the complex network analysis.In the chapter III, assuming the risks are propagate probabilistically through the links of the networks, the risk analysis on these human mobility networks are discussed.Also, the link removals for the risk mitigation are investigated to propose the effective risk mitigation methods.Finally, the chapter IV concludes this paper.

A. Japan domestic human mobility network
In this work, the openly available public datasets which are published by the Ministry of Land, Infrastructure, Transport and tourism (MLIT) of Japan are utilized.The datasets include the amounts of passengers passing the borders between prefectures in Japan with several means of transportations, such as cars, trains, busses, airlines, and ships (http://www.mlit.go.jp/sogoseisaku/soukou/sogoseisaku_souko u_fr_000008.html).The datasets include the information of the passengers' purposes of travels, genders, and ages.The values of the human mobility data are estimated based on the results of the questionary based random sampling surveys to the passengers and the activity reports provided from the operators of the transportation services.In order to consider the overlapping when integrating the number of passengers of multiple means of transportation, the dataset is adjusted based on the credibility of the data.Also, the dataset does not include the human mobility data between the prefectures within the Kanto region (Tokyo, Kanagawa, Chiba, and Saitama), Kinki region (Osaka, Kyoto, Hyogo, and Nara,), and Chukyo region (Aichi, Gifu, and Mie).These three are largest regions in Japan which are the critical areas for our analysis.Therefore, the lacking data for these regions are complemented by another public dataset (http://www.mlit.go.jp/k-toukei/cgibin/search.cg).This dataset only includes the passengers' travel data counted by the transportation service operators.Fig. 1 shows the visualization of the JDHMN.The locations of the nodes indicate the locations of the local governmental offices of each prefecture.The connections between these nodes represent the human mobility flows between the connecting prefectures.The color, width, and transparency of the links represent the relative significance of the volume of human mobility.

C. Airline networks; Japanese domestic and world airline network
Firstly, the JDHMN are analyzed from the complex network analysis point of view.Fig. 3 shows the distributions of the link's weights (i.e.amounts of people) of JDHMN.Since, if we do not consider the weighted on the links, the network structures are almost complete network, the distributions of the links' weights in each network are investigated.In these figures, the red plots represent the frequency for every 10,000 travelers.The fitting lines in the figures were computed utilizing the data between the links' weights from 0 to 10 6 people which show the power law distributions with the power exponents within the range of 1.3 ± 0.1.
Next, the community detection which finds the hidden community structures (i.e. the clusters of densely connected components) in the networks are conducted.www.ijacsa.thesai.orgWe utilized the infoMAP algorithm [27] for the community detection.The infoMAP algorithm solves the problem of optimally compressing the information of a random walk occurs on the network.The optimally compressed information can be recovered to the original information as closely as possible when the compressed information is decoded, which can be considered as the problem of finding the optimal partition of the clustered structures.The infoMAP algorithm is applicable into detecting community structures in the directed and weighted networks.Also, it is reported that the best performance of community detection is attained in the comparison tests on the benchmark networks comparing in the several well-known community detection algorithms [28].
Table.1 shows the results of the community detection on JDHMN, JDAN and WAN.The first row of the table shows the numbers of the detected communities in each network when the weighted modularity Q w for each network is maximized.In the community detection algorithms, the partition of the network is optimized so that the value of Q w is maximized.Therefore, higher the value of the optimal modularity Q w , better the partition of the network is.The weighted modularity Q w can be computed by the following equation [27], where w indicates the total weights of the links in the network, w ii represents the total weights of the links in a detected module i, is the total weights of the inward links coming into the module i, and is the total weights of the outward links departing from the module i, and m denotes the number of the detected modules.It is generally known that, when the optimal modularity Q w > 0.3, the network can be considered having community structures.Therefore, as can be seen in Table .1,the JDHMN and WAN have the community structure, meanwhile, JDAN shows almost zero optimal modularity Q w , which means JDAN does not have community structure.Fig. 4 shows location of the communities which was detected from the networks of the full means of transportation in the JDHMN.The each colored region on map corresponding with 10 communities which are detected by the community detection shown in Table .1This separation is well fit with the traditionally used regions of Japan, such as Tohoku region, Kanto region and Kinki region etc., which is another evidence The network assortativity is one of the conventionally used network properties to quantify the tendency that each node in the network tend to connect to the nodes which have similar degree, which can be computed by a correlation function of degrees of nodes connected each other [29].When the network assortativity is positive, the nodes in the network tend to connect with the nodes with similar degree, meanwhile, when it is negative, the nodes in the network tend to connect with the nodes having different degree.
The following equation is the definition of the network assortativity r [29], where F(ϕ) represents the set of connecting pair of two nodes, ϕ denotes the identification number of links connecting the pairs, M denotes the number of links in the network, k i denotes the degree of the node i.Then, this concept was extended to the weighted assortativity [30] as defined below, where ϖ ϕ denotes the weight of ϕ th link, H represents the sum of the weights in the network.[31], and improved by themselves by removing a bias towards low-degree nodes as unbiased local assortativity [32].The concept of the unbiased local assortativity is to calculate the contribution from each node in network assortativity defined by Eq.( 2) so that the sum of the values of the local assortativity of each node is equal to the network assortativity.
Eq.( 4) shows the definition of the unbiased local assortativity of node i [32], where M represents the number of links in the network, the excess degree j means the number of links except the link which is used to reach the nodes, therefore it can be calculate d i -1 when d i denotes the degree of node i, ̅ denotes the means of the excess degree, σ q represents the standard deviation of the network's excess degree distribution q(k), and μ q denotes the expectation of the excess degree distribution q(k).
However, the unbiased local assortativity ̂ is a relative measure, and it can quantify the relative contributions of each node to the network assortativity.Therefore, it works only when comparing within the focal network.Only considering the local value itself on a node, we cannot judge the node is assortative or disassortative.Therefore, recently, Thedchanamorthy and Piraveenan et al. (2014) proposed alternative approach to compute the local assortativity [33].
Their new definition of the local assortativity δ i is as follows, where λ is a scaling factor computed as follows, where r denotes network assortativity derived from Eq.( 1), and N denotes the number of nodes.̅ in Eq.( 7) is the relative average neighbor difference which computed as follows, where d i denotes the degree of node i. d j represents the degree of neighbor node j of node i.The relative average neighbor difference means the relative values that quantify the relative dis-assortativity of each node so that the sum of ̅ = 1.This concept of the local assortativity δ i is extended to the weighted network and proposed the local weighted assortativity .The proposed definition of the local weighted assortativity is as follows, where the scaling factor λ w can be calculated by just using weighted network assortativity r w instead of the normal network assortativity r as follows, and the relative average neighbor difference in terms of weight, ̅ , can be computed as follows, where represents weight of the inward link from the neighbor nodes j of the node i to the target node i.The node k represents the neighbor nodes of the node j The locations of the nodes in Fig. 5 are corresponding with the locations of the local government offices in each prefecture in Japan.As can be seen in these figures, in the JDHMN of the full means, car, bus, and train and JDAN, remarkably large and thick blue circle locates at Tokyo, which indicates that Tokyo is the hub in which a large number of travelers' mobility concentrated on and the average difference of travelers' flows between the neighbor prefectures' is extremely large.
In the JDHMN of the full means, car, bus, and train, Osaka and Aichi are the second and the third largest hubs.In ship networks, the hub-like nodes are located at several prefectures in west part of Japan as well as around Tokyo.

A. Risk diffusion model
In this section, the risk diffusion in the human mobility networks is analyzed.To model the risk diffusion in the networks, the epidemic model (the susceptible-infectedsusceptible (SIS) model) [34][35][36][37][38] is applied.
It is assumed that risk propagates through the links, and the diffusion probability on each link is proportional with the relative weights of each links.
The risk diffusion probability matrix M is defined as follows, where the (i, j) element in the relative propagation probability matrix W can be computed by the link weight w i,j from node i to node j divided by the maximum link weight w max as follows, Then, the risk diffusion probability from node i to node j is given by the β×w i,j ⁄ w max .Here, β is the weighting parameter to control the significance of risk transition, and δ represents the risk removal probability which is set to a fixed value.

The risk propagation dynamics over time is them given as follows,
where p(t) is the risk probability vector in which the i th element represents the risk probability of node i at time t, and p(0) denotes the initial risk probability vector.

B. Vulnerability index
The vulnerability index (VI) which can quantify the nodelevel easiness of being suffered from the risk is proposed.The VI is defined by the accumulative probability of being suffered from the risks when each node is selected as the initially infected node in turn.The following equation shows the equation to compute the VI vector v VI , where e denotes the all-one vector which represents the initial attack assigned to all nodes in the network, namely p(0) = e.Then, to remove the influences from the initial attack to all nodes and consider the differences of the size of the network, the normalized VI vector v' VI is suggested as follows, However, Eq.( 14) can work only when the risk diffusion probability matrix M are sufficiently small and converge to zero matrix at the infinite time.Therefore, the relative vulnerability index ̂VI is proposed as follows, where T is a sufficient large positive integer and T = 100 is used for our computations.̂ is a matrix in which each element of M x is divided by the sum of the elements m i,j of M x as follows, We compare the relationships between the relative vulnerability index ̂VI and the weighted in-degree (i.e.node strength) vector w in , weighted out-degree vector w out , weighted eigenvector centrality vector , weighted closeness centrality vector , weighted betweenness centrality vector , and weighted PageRank vector .w in is a vector in which the i th element consists of the sum of the weights of the inward links to the i th node.w out is a vector in which the i th element consists of the sum of the weights of the outward links directing from to the i th node.
is the largest eigenvector of the weighted matrix.
considers the path with the minimal weights as shortest path, and computes the inverse of the sum of the weights of the weighted shortest path.c is a vector in which the i th element considers the path with the minimal weights as a shortest path and computes the relative value of the number of the weighted shortest paths which pass through the node i.
uses the weighted matrix to compute the transition probability matrix M, and the weight on a link between node i and node j, w ij , is used instead of a ij of the adjacency matrix.Table .3shows the correlation coefficients between the six weighted centrality measures and relative vulnerability index v VI for each human mobility networks, JDHMN, JDAN, and WAN.As can be seen in this table, shows very high correlation with ̂VI .Fig. 7 shows the correlation diagram between the relative vulnerability index ̂VI and the weighted eigenvector centrality in the seven human mobility network when β=0.5 and δ=1.The red circles represent the node having the positive local weighted assortativity and the blue triangles represent the nodes having the negative local weighted assortativity.The size of the plots indicates the absolute significance of the local weighted assortativity of each node.Since the weights of each nodes are basically different (except the special case like the homogeneously weighted network), the positive local weighted assortativity means that the averaged absolute difference between the weight of the target node and the weights of its neighbor nodes is relatively small, which is achieved when the target node is connected to both of higher and lower weight neighbor nodes evenly, or when the weights with connected neighbor nodes are similar, which is sometimes occur at the (11) (12) , ,   t www.ijacsa.thesai.orgperipheral nodes.As can been seen in Fig. 7, ̂VI shows positive correlation with the weighted eigenvector centrality .
As can be seen in these figures, the nodes having high relative vulnerability index and high have very large negative local weighted assortativity.

C. Mitigation of risk diffusion
How to control risk diffusion in real networks is important concern.In this section, the effects of the link removals as the methods to mitigate the risk diffusion are investigated.The effects when links in the human mobility networks are cut based on a specific ranking strategy.Some approaches have been proposed to control the risk diffusion dynamics in networks.The node removal approach is straightforward to apply the real world situation, for shutting down an airport in the airline network, but this approach needs to close all the routes connecting to the removed node, which might not be a cost-effective manner.Therefore, the link removal approach is employed.In this section, the four strategies to rank the candidates for link removal, namely i) linkbetweenness-based removal, ii) link-weights-based removal, iii) weighted link-salience-based removal and iv) randomchoice-based removal are employed.Then, the links are cut as following one of these four ranking strategies and observe the evolution of the average of the normalized VI in each human mobility network as increasing the fraction of the link removal.
The link salience proposed by Grady et al. ( 2012) is one of the measures to quantify the relative importance of each link [39].The link salience is computed as follows, at the first step, the effective distance of each link which is an inverse of the link's weight is computed.Then, we can search the shortest pass from an arbitral node to another node as minimizing the sum of the effective distance on the path.This enables us to compute the shortest path tree from an arbitral node to all the other nodes.In this step, even if any links were chosen several times in a shortest path trees, we do not consider the overlaps of the links (i.e.every shortest path tree can be represented by the N×N adjacency matrix).After we computed the shortest path trees from every node and summing up the shortest path tree matrix, the link salience s on each link is obtained by dividing by the number of nodes.Fig. 8 shows the distribution (histogram) of the link salience in the JDHMN, JDAN and WAN.As can be seen, about 50% of links in JDAN, WAN, and JDHMN of ship are found to be no salience with 0, and about 20% of links have salience of 1.This means that, in these networks, the 20% of links are always chosen in the shortest path trees, meanwhile 0% of links are not used at all.In the networks of JDHMN-Full, JDHMN-Car, JDHMN-Train, and JDHMN-Bus, about 90% of links has the link salience of 0 and only a few percent of links shows the link salience of 1, which means that only a few nodes with high salient links are very critical to attain the effective connection in the networks.Fig. 9 shows the distribution of link salience on a map.The color gradation of the links shows the significance.Then, the evolution of the averaged normalized vulnerability index v' ave with each link removal strategies is examined as changing the link removal ratio from 0.01 to 0.09 by 0.01.Fig. 10 shows the comparison results of the link removal tests for JDHMN and JDAN when β=0.5 and δ=1.

Fig. 1 .
Fig. 1.Visualization of the weighted network of Japan domestic human mobility (JDHM) derived from the openly available public data.The locations of the nodes in these networks are corresponding with the locations of the prefectural governmental offices.The link between the nodes represents the human mobility between the prefectures.The color, width, and transparency of the links represent the relative amounts of the human mobility on the link B. Airline networks; Japanese domestic and world airline network In this work, the public open datasets of the JDAN in 2014 are used.The datasets are published by the MLIT annually and they contain the information of monthly and yearly flight number, the number of passengers, and weights of freights.We also analyze the dataset of the WAN whose data is provided by openflights.org(http://openflights.org/data.htm)as open data.The datasets include the data of the world-wide airports network and the information of the airline company operating the routes.The largest strongly connected cluster from the WAN which consists of 3,34 airports are extracted.Fig.2 (a) shows the network structure of JDAN where the relative number of passengers is used to weight the links.Fig.2 (b) is the structure of WAN from openflight.org.

Fig. 2 .
Fig. 2. Visualization of the airline networks; (a) Japan domestic airline network (JDAN) in 2014 in which the color, width, and transparency represents the relative number of passengers and (b) world airline network (WAN) from the website of openflights.org(http://openflights.org/data.html)

Fig. 3 .
Fig. 3. Distribution of the links' weights of several JDHMN; (a) for full means of transportation, (b) cars, (c) trains, and (e) bus.The red plots represent the frequency for every 10,000 travelers.The linear regression curves show the links' weight distributions follow power law with the power exponent within the range of 1.3 0.1

FrequencyFig. 4 .
Fig. 4. The ten community structures detected from the network of the full means of transportation of JDHM by the infoMAP algorithm.Each colored region corresponds with the detected communities of prefectures.The black solid separation line on the map represents the borders between the prefectures

Fig. 5
Fig.5 shows the distribution of the local weighted assortativity for the JDHMN and JDAN, and Fig.6 shows distribution of for WAN.The blue circles indicated the nodes which have the negative local weight assortativity, meanwhile, the red circles show the nodes with the positive local weight assortativity.The size of these circles indicated the significance of the node strength (i.e.weighted indegree) and the transparency represents the significance of the absolute values of the weighted local assortativity .

Fig. 5 .Fig. 6 .
Fig. 5. Distribution of the local weight assortativity for JDHMN for several means of transportation and JDAN

Fig. 7 .
Fig. 7. Relationship between the elative vulnerability index ̂VI and w . in the JDHMN of (a) Full, (b) Train, (c) Car, (d) Bus, (e) Ship, (f) JDAN and (g) WAN when  = 0.5 and δ = 1.The red circles represent the node having the positive local weighted assortativity and the blue triangles represent the nodes having the negative local weighted assortativity.The size of the plots indicates the absolute significance of the local weighted assortativity of each node

TABLE I .
RESULTS OF THE COMMUNITY DETECTION OF JDHMN AND JDAN AND WAN BY INFOMAP ALGORITHM

Table . 2
shows the weighted assortativity r w of the JDHMN, JDAN, and WAN.

TABLE II .
WEIGHTED ASSORTATIVITY OF THE NETWORKS