Analysis of Airport Network in Pakistan Utilizing Complex Network Approach

The field of complex networks covers social, technological, biological, scientific collaboration and communication networks, among many others. Among these, the transportation network is an important indicator of a country's economic growth. In this study, the dynamics of the Airport Network in Pakistan (ANP) are analyzed using complex network methodology. The air transportation dataset was collected from the Civil Aviation Authority of Pakistan (CAA) and formatted to meet the requirements of complex network analysis. The network was constructed to observe its properties and compare them with their topological counterparts. Nodes represent the airports of Pakistan, and flights between them within a week are the edges. The degree distribution reflects preferential attachment: a few nodes control the overall network, which indicates that ANP follows a power law. The clustering coefficient shows that the network is clustered. The short average path length indicates that ANP is a small-world network. The average nearest neighbour degree shows that ANP exhibits disassortative mixing, that is, high degree nodes (airports) tend to connect to low degree nodes (airports). Interestingly, it has been observed that the most connected node is not necessarily the most central node in terms of the centrality measures.

Keywords: Transportation network; Airport network analysis; Complex network; Scale-free network


I. INTRODUCTION
Many real-world problems arising from complex systems can be viewed as networks, analyzed through an understanding of their network structure, and solved in a similar way. These real-world networks, including neural networks, food webs [1], the World Wide Web, railway networks, air route networks, scientific collaboration networks and disease networks [2], [3], are not simple random networks. Scale-free and small-world network models emerged in the last century, and many real-world problems [4], [5] appear to share the characteristics of these two models [6]. Transportation infrastructure plays a vital role in the development of any country's economy and therefore needs special attention in order to maintain and improve that economy. Transportation includes not only the travel of people but also the import and export of goods and information. Understanding this network is important for policy making, administration and efficiency, for providing convenient and safe flights, and for detecting the airports whose attack or removal by any incident would have a major effect on the network. Among real-world problems, the transportation infrastructure network is one of the important research directions. Several researchers have studied airport networks [7], [8], [9]. In our study, the domestic airports in Pakistan are considered and analyzed as a complex weighted network. This study investigates whether the Airport Network in Pakistan (ANP) obeys any of the complex network models (scale-free, small-world, random network) by obtaining its shortest paths and clustering coefficient. Moreover, the importance of nodes is assessed through more advanced properties such as betweenness and closeness. The behavior of nodes is observed through the degree correlations and the nearest neighbour degree of all nodes.

II. LITERATURE REVIEW
The study of transportation infrastructure using complex network techniques is not new. During the past couple of years, complex network analysis has been used to analyze transportation systems, whether railway, highway or airway [10]. Among these studies, the World-wide Airport Network (WAN) has been studied from a topological point of view as well as from dynamic perspectives. WAN has been analyzed through its degree distribution, and constraints (the cost of adding new links to nodes) were proposed to explain the truncation of high degree nodes in its scale-free degree distribution [8]. Different authors have studied different networks on the basis of two main features, the scale-free and small-world phenomena. This paper examines these two concepts in the Airport Network of Pakistan. The characteristics and properties most often studied are degree distribution, clustering coefficient, degree correlation, and the betweenness and closeness centralities [11], [12]. Complex systems can be studied from the viewpoint of different network models such as random networks, small-world networks and scale-free networks [13], [14].
The degree distribution is an important feature of the topology of a network. A network in which the links between nodes are assigned randomly has a Poisson degree distribution, with most nodes having a typical degree [15]. The degree distribution has been reported for the Worldwide Airport Network (WAN), the Airport Network of China (ANC) [16], the Airport Network of Italy (ANI) [17], the Indian Airport Network (IAN) [7] and the Australian Airport Network (AAN) [18].
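As a rough illustration of how a degree distribution is obtained, the sketch below counts node degrees in a small undirected graph and also returns the cumulative form, which is less noisy for small networks. The five-node graph and the function names are hypothetical and are not taken from the ANP data.

```python
from collections import Counter


def degree_distribution(adjacency):
    """Return {k: P(k)}, the fraction of nodes that have degree k."""
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    counts = Counter(degrees)
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


def cumulative_distribution(pk):
    """Return {k: P(K >= k)} from a degree distribution {k: P(k)}."""
    cumulative, running = {}, 0.0
    for k in sorted(pk, reverse=True):
        running += pk[k]
        cumulative[k] = running
    return cumulative


# Hypothetical toy network: one hub (A) linked to four low degree airports.
toy = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"A"},
}
pk = degree_distribution(toy)
ck = cumulative_distribution(pk)
print(pk)
print(ck)
```

On a log-log plot, a power law would appear as an approximately straight line, whereas a Poisson distribution would fall off sharply around the typical degree.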
The clustering coefficient of a network indicates whether colonies (clusters) are present in the network. Its value lies between 0 and 1, where 0 means no clusters at all and 1 means a densely formed cluster in which every node (airport) is connected with every other node (airport). If the clustering coefficient is 0.5, as in the case of AAN [18], then two randomly chosen neighbours of an airport are directly connected with a probability of 50%, whereas for a small-world network and a random network of the same size the probability is 63.50% and 9.1%, respectively. The high clustering coefficient of AAN confirms a large amount of concentration and also suggests a high probability of completing a transfer with few connecting flights. Hierarchical networks are expected to show a nontrivial power-law decay of the clustering coefficient with degree [19].
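The definition above can be sketched as follows, assuming the usual local clustering coefficient (links that exist among a node's neighbours divided by the links that could exist) averaged over all nodes. The toy graph and function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of pairs of neighbours of `node` that are directly linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pair of neighbours exists
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


# Hypothetical toy network: hub A, one triangle (A, B, C), two leaf airports.
toy = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"A"},
}
print(local_clustering(toy, "A"))
print(average_clustering(toy))
```

In this toy graph the hub has a low local value because most of its neighbours are not linked to each other, while B and C each score 1.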
Another important topological property of a network is the degree-degree correlation between connected nodes. If high degree nodes connect with other high degree nodes throughout the network, the network is said to be assortative. Similarly, if low degree nodes connect with high degree nodes, the network is said to be disassortative. The degree correlation can be measured using the average degree of the nearest neighbours of nodes with degree k, together with its weighted counterpart.

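$$k^{w}_{nn,i} = \frac{1}{s_i} \sum_{j=1}^{N} a_{ij} w_{ij} k_j \qquad (1)$$
where $a_{ij}$ is the adjacency matrix element, $w_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_j$ is the degree of node $j$, and $s_i = \sum_{j} a_{ij} w_{ij}$ is the strength of node $i$. The case $k^{w}_{nn,i} = k_{nn,i}$ signifies that the edge weights are uncorrelated with the degree of $i$'s neighbours. Heavily weighted edges connect to larger degree neighbours if the weighted neighbour degree is greater than the simple neighbour degree, i.e. ($k^{w}_{nn,i} > k_{nn,i}$), and the opposite occurs when the weighted neighbour degree is less than the simple neighbour degree ($k^{w}_{nn,i} < k_{nn,i}$).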
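A small sketch of this comparison, assuming the commonly used definition in which the weighted nearest neighbour degree of node i is the weight-averaged degree of its neighbours (the sum of w_ij * k_j over neighbours j, divided by the strength s_i). The weights below are hypothetical flights per week, not ANP data.

```python
def nearest_neighbour_degrees(weights, node):
    """Return (k_nn, k_nn_w) for one node of an undirected weighted graph.

    `weights` maps each node to {neighbour: edge weight}.
    """
    neighbours = weights[node]
    degree = len(neighbours)
    strength = sum(neighbours.values())
    # Plain average of the neighbours' degrees.
    k_nn = sum(len(weights[j]) for j in neighbours) / degree
    # Average of the neighbours' degrees, weighted by edge weight.
    k_nn_w = sum(w * len(weights[j]) for j, w in neighbours.items()) / strength
    return k_nn, k_nn_w


# Hypothetical symmetric weights (flights per week between airports).
toy_weights = {
    "A": {"B": 10, "C": 2, "D": 1},
    "B": {"A": 10, "C": 3},
    "C": {"A": 2, "B": 3},
    "D": {"A": 1},
}
k_nn_b, k_nn_w_b = nearest_neighbour_degrees(toy_weights, "B")
print(k_nn_b, k_nn_w_b)
```

For node B the weighted value exceeds the unweighted one, because its heaviest edge leads to the highest degree neighbour (A).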
The importance of a node does not depend only on the properties discussed above; centrality measures also help. For instance, betweenness centrality shows which airports lie on the routes between departure and destination airports, whereas closeness centrality shows which airport is close to all other airports in the network. Sameer Alam and Murad Hussain compiled a table of the top 20 cities (airports within cities) of Australia according to their degree, closeness and betweenness [18].
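As an illustration of closeness centrality, the sketch below uses breadth-first search to obtain hop distances and then takes the reciprocal of the mean distance to all reachable nodes. This is one common normalization and is an assumption here; the four-node path graph is hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adjacency, node):
    """(number of reachable nodes) / (sum of distances to them)."""
    dist = bfs_distances(adjacency, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical path graph P - Q - R - S.
path = {"P": {"Q"}, "Q": {"P", "R"}, "R": {"Q", "S"}, "S": {"R"}}
print(closeness(path, "Q"))
print(closeness(path, "P"))
```

The interior node Q is closer to the rest of the graph than the end node P, so it receives the higher closeness value.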
According to Fig. 1, Brisbane is in the top position with respect to closeness and betweenness, while Sydney is first in degree, followed by Brisbane. Melbourne is in 3rd position with respect to degree and closeness, but its rank drops to 7th in betweenness, which shows considerable deviation among the centralities of AAN. The national capital, Canberra, is 9th and 10th for degree and closeness respectively but is not in the top 20 with respect to betweenness. According to Sameer and Murad, there are 28 airports with the minimum degree (a degree of 2). All these airports have only a few air routes to their regional hub, which means that reaching other airports sometimes takes multiple connecting flights within the network. Moreover, their analysis shows that AAN has almost 70 airports with a betweenness centrality of zero, which means that no shortest path between other pairs of airports passes through them. Fig. 1 also illustrates that the top 10 cities are highly connected and play a vital role in transportation. Apart from Australia, centrality measures were also computed by others, such as Guida and Maria [17]. They computed betweenness for three different periods and randomized the Italian network (ANI) for comparison. According to them, the normalized betweenness simply follows a quadratic function of degree, but the Italian Airport Network separates into three different regions: the first region comprises small values of both betweenness and degree, the second comprises small values of betweenness but large values of degree, and the third comprises large values of both attributes. When they plotted this on a graph, they found that the airports belonging to the 2nd and 3rd regions form a tail, as shown in Fig. 2. The region with small degree and large betweenness turns out to be empty.

A. Data Gathering and Transformation
The dataset was collected from the Civil Aviation Authority (CAA) of Pakistan and contains the records of domestic flights in Pakistan. It includes flight information from March to July 2016. For instance, Flight 203 flies from Karachi to Islamabad on particular days of the week. The data was not in a readily usable format and needed to be organized appropriately before it could be analyzed as a network. The 857 × 7 dataset was transformed into an adjacency matrix from which the network was built. There are a total of 44 airports in Pakistan, but some of them are not operational or are used only for military purposes. The airports are identified by their IATA (International Air Transport Association) codes, so the data gathered was in the format shown in Fig. 3. Fig. 4 shows the IATA codes; for example, KHI is the IATA code of Jinnah International Airport. Fig. 5 shows the weighted adjacency matrix obtained from the data. Fig. 6 represents the links between the airports of Pakistan. Different tools may be used for network analysis, such as Gephi, R-project and NodeXL. In this study we used R-project for the analysis. It is a standard, open-source tool for manipulating and managing data.
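A minimal sketch of this transformation, written in Python for illustration only (the study itself used R). The record layout (flight number, origin code, destination code, operating days) and the sample flights are assumptions, since the exact columns of the CAA data are not listed here.

```python
def build_weighted_matrix(records):
    """Turn flight records into (sorted airport codes, weighted matrix).

    Entry [i][j] holds the number of flights per week from airport i to j.
    """
    codes = sorted({code for _, origin, dest, _ in records for code in (origin, dest)})
    index = {code: i for i, code in enumerate(codes)}
    n = len(codes)
    matrix = [[0] * n for _ in range(n)]
    for _, origin, dest, days in records:
        matrix[index[origin]][index[dest]] += len(days)  # one flight per listed day
    return codes, matrix


# Hypothetical records: (flight number, origin, destination, operating days).
records = [
    ("203", "KHI", "ISB", ("Mon", "Wed", "Fri")),
    ("204", "ISB", "KHI", ("Mon", "Wed", "Fri")),
    ("301", "KHI", "LHE", ("Tue", "Thu")),
    ("305", "KHI", "ISB", ("Sun",)),
]
codes, matrix = build_weighted_matrix(records)
print(codes)
for row in matrix:
    print(row)
```

Several flights on the same route accumulate into a single weighted entry, which is why the matrix is much smaller than the raw table.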

B. Network Construction
The Airport Network of Pakistan was formed from the domestic airports of Pakistan and the flights connecting them. Non-operational airports were not considered in the construction of the network. In this network, nodes (airports of Pakistan) are denoted by N, while flights within a week are represented as edges (E). Twenty-four operational airports are considered as nodes (N = 24), and there are 84 directed edges (E = 84). A weighted network was constructed to capture the dynamic properties of the network. The weight on the edge between two airports is the number of flights per week. This type of representation has also been used to describe the airport networks of other countries. For the analysis, we used different network metrics, such as degree, average shortest path, clustering coefficient, centrality measures and degree correlation [20] [21] [22] [23]. Fig. 7 is the representation of ANP. Knowledge of information flow is a significant factor for transportation networks. Here, the "weighted ANP" is used to quantify the traffic flows within the network. The weighted ANP is defined by the weight on each link, in terms of the number of flights travelling per week. The weighted matrix, W (N × N), stores the weight on every link.
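Given such a weighted matrix W, the basic quantities used here (number of directed edges, degree, and strength as total weekly flights) can be read off directly. The sketch below uses a hypothetical 3 × 3 matrix, and the function and key names are illustrative.

```python
def network_summary(matrix):
    """Edge count, degrees and strengths of a directed weighted matrix."""
    n = len(matrix)
    return {
        "nodes": n,
        # A directed edge exists wherever the weight is positive.
        "edges": sum(1 for i in range(n) for j in range(n) if matrix[i][j] > 0),
        "out_degree": [sum(1 for w in row if w > 0) for row in matrix],
        "in_degree": [sum(1 for i in range(n) if matrix[i][j] > 0) for j in range(n)],
        "out_strength": [sum(row) for row in matrix],
        "in_strength": [sum(matrix[i][j] for i in range(n)) for j in range(n)],
    }


# Hypothetical weights: toy_w[i][j] = flights per week from airport i to j.
toy_w = [
    [0, 3, 0],
    [4, 0, 2],
    [0, 0, 0],
]
summary = network_summary(toy_w)
print(summary)
```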

IV. RESULTS AND DISCUSSION
There are approximately 139 airfields in Pakistan, including 44 public airports and 17 military airports, but this study focuses only on public airports. Of the 44 airports, some were not operational during the summer of 2016 (they might become operational in the near future), so we neglected those and focused only on the 24 operational airports, which are joined by 84 weighted edges representing flights arriving at and departing from airports. Compared with the other airport networks studied (WAN, ANC, ANI, IAN, AAN), ANP has a small number of nodes (airports).

A. Weighted Clustering Coefficient Findings
The average clustering coefficient of ANP is considerably higher than that of an Erdős-Rényi random graph [24] ($C_{RE} = 0.05$). It is also higher than that of the Italian Airport Network. The weighted clustering coefficient is plotted in Fig. 8. When $C_w = C$, the weights are uncorrelated with the formation of clusters, whereas $C_w$ greater or less than $C$ indicates that the weights play a role in forming the clusters. In our case $C_w$ is approximately equal to $C$. A small average path length together with a high clustering coefficient points towards a small-world network, and this fits our network. This means that ANP is a small-world network; with a scaling exponent of 1.6 and the presence of hubs, we can also state that ANP is a scale-free network.
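The path-length side of the small-world test can be sketched with breadth-first search: the average shortest path length is the mean hop distance over all pairs of nodes, and the diameter is the largest such distance. The sketch assumes an unweighted, connected, undirected graph, and the toy network is hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def path_statistics(adjacency):
    """Return (average shortest path length, diameter) of a connected graph."""
    lengths = []
    for source in adjacency:
        dist = hop_distances(adjacency, source)
        lengths.extend(d for target, d in dist.items() if target != source)
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical toy network: hub A, triangle (A, B, C), two leaf airports.
toy = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"A"},
}
average_length, diameter = path_statistics(toy)
print(average_length, diameter)
```

In an airport network a hop corresponds to one flight, so these two numbers are the typical and the worst-case number of flights between two airports.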

B. Centrality Measures Findings
All three centrality measures (degree, betweenness and closeness) generally conform to a power-law decay function. The degree centrality curve confirms that a few airports have a large degree and control 80 percent of the overall network. It can be observed from Fig. 9 that closeness centrality has a flat curve, while the betweenness curve clearly shows the presence of a few hubs in the network. In general, high degree nodes have many topological connections and usually a high value of betweenness, but this is not always the case. We can see in Fig. 10 that some low degree nodes have a betweenness higher than their degree would suggest, while other low degree nodes have zero betweenness. Most of the hubs, including Karachi, Islamabad and Lahore, have high betweenness. The decline of the betweenness curve and the large number of low betweenness nodes suggest the existence of bottlenecks in ANP, which is consistent with the low value of the clustering coefficient (C = 0.237).
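To illustrate how a low degree node can still carry high betweenness, the sketch below implements Brandes' algorithm for unweighted, undirected graphs. This is a swapped-in illustration and not necessarily the routine used in the study, and the toy graph is hypothetical.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalized betweenness; each unordered pair of nodes counts once."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        order = []                              # nodes in order of discovery
        preds = {v: [] for v in adjacency}      # predecessors on shortest paths
        sigma = {v: 0 for v in adjacency}       # number of shortest paths from s
        dist = {v: -1 for v in adjacency}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):               # accumulate from the farthest nodes
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2.0 for v, b in score.items()}  # undirected: halve


# Hypothetical toy network: hub H, and a low degree bridge Z leading to y.
toy = {
    "H": {"a", "b", "c", "Z"},
    "a": {"H", "b"},
    "b": {"H", "a"},
    "c": {"H"},
    "Z": {"H", "y"},
    "y": {"Z"},
}
scores = betweenness(toy)
print(scores)
```

Nodes a, b and Z all have degree 2, yet only Z lies on shortest paths between other nodes, because it is the sole route to y.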

C. Relationship between Centralities
Like Sameer and Murad for AAN, we created a table in which airports are ranked with respect to their degree, betweenness and closeness in order to show the relationship between centralities. Table 1 shows the top 5 airports with respect to degree, betweenness and closeness.
According to Table 1, Karachi ranks 1st in degree and betweenness but 2nd in closeness. Likewise, Islamabad holds 2nd position in degree and betweenness but drops to 4th with respect to closeness. Zhob ranks 1st in closeness. This is an interesting feature of ANP: when observing the network, Zhob Airport does not catch the eye, but closer study shows that it is quite an important airport in ANP, ranking 1st in closeness and 3rd in betweenness.

D. Weighted Degree Correlation
Degree correlation describes the relationship between nodes of degree k and the degrees of their neighbours, i.e. $k_{nn}(k)$, which is also called the nearest neighbour degree [25]. The behaviour of $k_{nn}(k)$ determines whether a network shows assortative or disassortative mixing. To take the weights of the connections into account, the weighted nearest neighbour degree $k^{w}_{nn}(k)$ is used to measure the effective affinity for connecting to high or low degree neighbours according to the amount of traffic. Fig. 11 shows that both the topological $k_{nn}(k)$ and the weighted $k^{w}_{nn}(k)$ have clear disassortative behavior, which is confirmed by their linear decrease with k. This may be because of political pressure to connect hubs with low degree nodes to improve connectivity. Furthermore, the different network properties of ANP calculated in this study are shown in Fig. 12.
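The test for disassortative mixing can be sketched by averaging the nearest neighbour degree over all nodes of the same degree k and checking whether the result decreases with k. The least-squares slope below is a simple illustrative check, an assumption rather than the procedure used in the study, and the toy graph is hypothetical.

```python
from collections import defaultdict


def knn_by_degree(adjacency):
    """Return {k: average neighbour degree over all nodes of degree k}."""
    groups = defaultdict(list)
    for node, neighbours in adjacency.items():
        k = len(neighbours)
        if k > 0:
            groups[k].append(sum(len(adjacency[j]) for j in neighbours) / k)
    return {k: sum(values) / len(values) for k, values in sorted(groups.items())}


def slope(points):
    """Least-squares slope of y against x for a mapping {x: y}."""
    xs, ys = list(points.keys()), list(points.values())
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator if denominator else 0.0


# Hypothetical toy network: hub A linked mostly to low degree airports.
toy = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"A"},
}
knn = knn_by_degree(toy)
trend = slope(knn)
print(knn)
print("disassortative" if trend < 0 else "assortative or neutral")
```

A negative slope means that higher degree nodes have, on average, lower degree neighbours, which is the disassortative pattern described above.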

V. CONCLUSION
Means of transportation have always been important in the development of every country. Three means of transportation (sea, air and land) are mostly used for domestic and international travel. The transportation network is a tool for measuring the economic growth of a country. Researchers have been using complex network methodologies to analyze the effectiveness of transportation networks. In this study, we formalized the Airport Network of Pakistan and analyzed it from the viewpoint of complex networks. Airports were considered as nodes and connecting flights as edges. Connections between airports are based on flights (departing and arriving) within a single week. The number of flights within a week is taken as the edge weight. The network among Pakistani airports is formed with 24 nodes and 84 edges. Although it is a small network compared to the World Wide Airport Network, the Chinese Airport Network, the Indian Airport Network and the Italian Airport Network, it still exhibits complex dynamics similar to those of other air transportation networks. The diameter of this network is 3 and the average path length is 2, which shows that a traveler needs about 2 flights on average, and at most 3 flights, to move from one airport to any other airport within Pakistan. The network is moderately clustered and its degree distribution follows a power-law function, which indicates the existence of high degree nodes (hubs). Quetta, Karachi and Islamabad in particular resemble hubs. The high clustering coefficient and short path length show that this is a small-world network. Furthermore, the presence of a power law and preferential attachment is evidence that the Pakistani airport network is a scale-free network as well. The hubs are surrounded by low degree airports. Moreover, the results of the betweenness and closeness centrality analysis confirm that the highest degree node is not always the most central node. The network also shows disassortative behavior, which means that hubs tend to connect to low degree nodes. For example, Zhob Airport does not seem very important from the topology alone, but the centrality analysis shows that the betweenness of Zhob airport is the 2nd highest in the network. Karachi airport has many connections, but Zhob is still the most central airport in terms of closeness.
In future, it will be interesting to study this network further with regard to cargo or the number of passengers per flight, which will provide an understanding of more complex dynamic features of ANP. Different research approaches can be applied to the airlines operating in Pakistan by correlating the results with their counterparts. This type of research will identify limitations and reveal room for improvement in the current aviation industry of Pakistan. With the addition of several low-cost airline services such as Serene Air, ANP is expected to grow at a fast pace in terms of coverage and frequency of flights.

Fig 3. Raw Data Gathered by Civil Aviation of Pakistan (CAA).

Fig 4. Airports of Pakistan Represented by their IATA Code.

Fig 5. Weighted Adjacency Matrix of Airport Network of Pakistan.

Fig 6. Links between Airports of Pakistan.

Fig 8. Correlation between Degree and Weighted Clustering Coefficient.