Skyline Path Queries for Location-based Services

A skyline query finds objects that are not dominated by another object from a given set of objects. Skyline queries help us to filter unnecessary information efficiently and provide us clues for various decision making tasks. In this paper, we consider skyline queries for location-based services and proposed a framework that can efficiently compute all nondominated paths in road networks. A path p is said to dominate another path q if p is not worse than q in any of the k dimensions and p is better than q in at least one of the k dimensions. Our proposed skyline framework considers several features related to road networks and return all non-dominated paths from the road networks. In our work, we compute skylines considering two different perspectives: business perspective and individual user’s perspective. We have conducted several experiments to show the effectiveness of our method. From the experimental results, we can say that our system can perform efficient computation of skyline paths from road networks. Keywords—Skyline queries; trip planning; location-based services


I. INTRODUCTION
Given a k-dimensional database DB, a skyline query retrieves a set of skyline objects, each of which is not dominated by another object. An object p is said to dominate another object p` if p is not worse than p` in any of the k dimensions and p is better than p` in at least one of the k dimensions. Fig. 1 shows a typical example of skyline. The table in Fig. 1 is a list of five routes, each of which contains two numerical attributes-"Cost" and "Distance". In the list, R2 and R5 are dominated by R3, while others are not dominated by any other routes. Therefore, the skyline of the list is {R1, R3, and R4}. Such skyline results are important for users to take effective decisions over complex data having many conflicting criteria. A number of efficient algorithms for computing skylines from the database have been demonstrated in the literature [1,2,3,4,5,6].
Location-based services (LBSs) use positioning technology and traditional map information to furnish mobile users with new sorts of on-line services.
Location-based services in road network are becoming more popular. With rapid growth of technology, skyline queries on road networks [14,15,16,17] have attracted much attention now a days.
Traffic jam refers to a long line of vehicles stuck in a jam. It is a common problem in the big cities and towns like Dhaka city of Bangladesh. Many factors such as less number of roads, lack of modern proper traffic management systems, narrowness of the roads, and increase of vehicles are the main causes of traffic jams in cities like Dhaka. These traffic jams are creating many problems such as not reaching in time at offices, ambulance carrying patients cannot reach at the hospitals in time etc.
In such a scenario, a well-developed location-based service that focuses on the road conditions such as traffic jam, number of passengers and cost to the destination can give some comforts to the people by choosing skyline routes from which people can select their desired paths based on their preferences.
Each road in a road network has multiple-path criteria such as the distance of the road, the travel time through that road, the number of travellers and the number of traffic. The last two factors vary according to time. Before starting a journey, a traveller may want to know about the conditions of the road taken on their destination at a specific point of time. He/she may also want to know trip cost and other conditions of the roads.
In this paper, we apply skyline queries to support locationbased services for road networks. In our approach, at first, a user needs to choose his pick-up point and the destination point. Based on the choice of the source and the destination by a user, our system then finds all alternate routes from source to destination. Next, each route is represented with several features such as traffic conditions, travelling time, travelling costs, number of passengers available through that each routes etc. After representing each route with a number of features, we apply skyline queries to filter dominated routes and to return only useful routes for the users. From the return results, a user can select his desired path such as less cost path or less traffic path.
The remainder of this paper is organized as follows: Section II provides a brief review of related work. We provide motivating examples at Section III. Section IV describes different concept related to the paper. We provide detail description of our proposed approach at Section V. In Section VI we present the experimental results. Finally, we conclude our paper at Section VII. The Block-Nested Loops Algorithm (BNL) [4], which is the easiest skyline query method. Its objective is to build a candidate skyline set. This calculation investigates every data point with each other data point in the dataset. The BNL calculation requires each data point in the database be checked and tried for predominance; consequently, the time required for calculation increments with the volume of information.
The DAC calculation [4] separates information into groups and at that point leads skyline query in each group. The results are consolidated to acquire a definitive result.
The SaLS aalgorithm [2] utilizes an element acquired from the raw information as a threshold value with which to filter and dispose data points.
The BBS algorithm [18] is at present the most well-known skyline query algorithm. The BNL and DAC algorithms require that a large portion of the data points be processed all together to complete the skyline query comparison. Conversely, BBS utilizes an index structure for the identification of skyline points, which diminishes the number of points that must be tested all together to process a query.
Previous studies in which skyline queries were utilized to check road networks can be classified into those attempting to recognize skyline landmarks and those trying to distinguish skyline paths.
Deng et al. [8] presented the idea of searching for skyline landmarks in street network. The skyline landmark query recognizes landmarks that coordinate user criteria when user is going on a road network. For instance, when a user travels on a road network, skyline landmark query encourages him/her those points that are adjacent. The algorithm in this work characterizes landmark attributes as static or dynamic. Static properties have fixed values. Dynamic properties have variable attributes. The algorithm initially distinguishes static skyline landmarks on their static attribute values. At that point, when users perform to check, the algorithm distinguishes all unique skyline landmarks dependent on their dynamic attribute values, and consolidates query points with the static skyline landmarks. At last, the algorithm can recover skyline landmarks that fit all characteristics.
Huang and Jensen [13] proposed an alternate skyline landmark search concept from that of [8]. They contended that users' movement in road networks ought to be founded on a recently settled way. The algorithm in this work was like that proposed by Deng et al. [8], which utilized the ideas of static and dynamic attributes to identify skyline landmarks. The main contrast between the algorithms is the attribute calculation method. Deng et al. [8] considered the separation between the landmark and the inquiry area of the user, while the researchers of this work consider the separation between the landmark and the path preset by the user.
Tian et al. [21] presented the idea of skyline paths. Their proposed algorithm would utilize the edge attributes of a road network to discover all skyline paths between the userspecified starting vertex and goal. The algorithm would first decide a single skyline path between a starting vertex and goal whose summation of all attributes values is the most reduced among all ways. At that point, the algorithm would recognize other skyline paths by (1) a greedy algorithm to locate a relay vertex between starting vertex s and goal t. If skyline path domination was available after adding the two values, at that point a can"t be a piece of a skyline path. In this instance, the algorithm again employs the greedy algorithm to identify other possible relay vertices or identify the next relay vertex following a.
Kriegel et al. [15] utilized the greedy algorithm to distinguish a possible relay vertex among s and t. Kriegel et al. [15] utilized a reference vertex to help estimations. By utilizing such a technique, they proclaimed the strategy proposed in this work was quicker than that proposed in crafted by [21].
Many researchers have looked to broaden the works in [15] and [21]. Aljubayrin et al. [1] examined the issue of skyline trips on different POI classes. Hsu et al. [10] connected the possibility of a skyline path to the arranging of treks to beat the conventional problem of acquiring multicriteria answers. Yang et al. [23] consolidated GPS history information in their inquiries to enable the user to design their skyline route under time-varying vulnerability. Unfortunately, these works don't consider aggregate attributes in road networks, which make them inapplicable to the issues tended to in this examination.
A new concept M-tree structure is described in [7]. A. Guttman al. [9] describes dynamic index structure called an Rtree and W. Son al. [20] describes spatial skyline queries for dynamic environment.
In [11], they focus on processing the continuous skyline query in road networks. They design a grid index to effectively manage the information of data objects. They proposed several algorithms combined with the grid index to answer the skyline queries.
In [12], they overcome the specific assumptions that each object is static in road networks. They focus on processing the CKNSQ over moving objects with uncertain dimensional values in Euclidean space and the velocity of each object (including the query object) varies within a known range.
Sheng et al. [19] present external memory algorithms for solving the skyline problem its variants in a worst-case efficient manner. They proved that the running time can be improved if some dimensions have small domains.
In [22], they bring out novel information by analyzing bulky databases to consolidate users experience to find place of interest. They use Apriori algorithm for identifying hidden association among item sets from large databases of user checking in data and to construct the route analogous to the key terms provided by user.
In this paper, we have considered two different scenarios. One is for business purpose and another is for individual user"s perspective. These scenarios are explained below.

A. Business Perspective
In applications such as Pathao and Uber, one trip can only be allotted to one passenger request at a specific time. In contrast, microbuses and cars have the capacity to carry five to eight passengers, respectively. Let us assume that a service provider has a microbus, which is a 10-seater vehicle. Suppose this person plans a trip from location L1 to L2. Before starting the journey, by utilizing our method this person can find shorter route as well as a faster route. Our method also suggests a route having a large number of passengers, whereas for a fast route this method provides a route with a shorter distance and lesser traffic. Here, the passengers are those whose destination location is the same as that of the service provider and the start location belongs to the list of suggested routes. Thereby, the service provider can choose its preferable route and accept the passenger request for the same. Table I shows the distance information on all routes and Fig. 3 presents the traffic and passenger conditions of two alternate routes. In Fig. 3, the graph lines are represented by three colours: green represents light traffic, orange represents medium traffic, and red indicates high traffic.
Hence, compared to multiple routes, it is necessary to find a desired route that is not dominated by any other route. In detail, a route is preferable to visitors if it is not dominated by any other route. The information on routes is given below. This information is collected from Google Map API.  The traffic and passenger conditions of two routes are graphically represented in Fig. 3. The route line colour changes with time. From this figure, we can say that there is a light traffic for Road 1 from 8.00 a.m. to 12 p.m., and Road 2 will be free after 2 p.m. It is also observed that there is heavy traffic for Road 1 from 3 p.m. and for Road 2 from 11 a.m. to 2 p.m. This graph is also helpful in tracing a medium traffic condition. Road 1 has medium traffic from 12 p.m. to 3 p.m. and Road 2 from 8 a.m. to 11 a.m. We can also calculate our travel cost from route distance. Table II represents the traffic and passenger conditions of every location for Route 1 and 2. Each row in the table represents traffic and passenger conditions. For measuring passenger we use normalize value. These are helpful in identifying the most interesting and preferable route.
Another graphical representation is given below. Fig. 4, represents the passenger condition with traffic. In this figure, blue colour Route 1 and green Road 2.
From Table II we find skyline points H3, H4.

B. Individual Perspective
Suppose a visitor wishes to travel from Location L1 to L2. Before starting their journey, he/she wishes to know about the route and traffic conditions as well as the cost of travel. Then, this method will provide him/her with traffic information using Google Map Traffic API and calculate the cost by calculating the fuel cost per litre, the mileage of their vehicle and the distance between their start and end locations. From the resulted dataset a user can easily filter routes according to choice. Table III represents road condition for specific interval. Each row in the table represents traffic and cost, which are helpful to identify the most interesting and preferable route.
From Table III, we find that cost is changing with user. Moreover, cost depends on user vehicle"s fuel cost and mileage.
In this paper, we compute a method that can help service providers to choose their desired route from our resulted skyline routes. Our location-based computation method can significantly find the appropriate route, based on the dataset. By this way, our method is useful for individual trip planning and transport service business planning.

A. Skyline Queries
Skyline query is a decision-supporting mechanism that highlights the best options among vast data.
An example is given below: In Fig. 5, we have some points in a two-dimensional space, as shown above, then we define a point p that will dominate point q provided its coordinates are larger than that of q. In this example, there is a point p that dominates several other points. So what is the skyline point? Skyline points are points that are not dominated by any other points present in the dataset. They are also called maximal points. If you connect these with horizontal and vertical lines, then you will get skyline points.
Let L denote a set of all locations. Each location has an ID and a spatial coordinate l = (xy). Let us suppose A is a category attribute. In our research work, passenger and traffic are category attributes. So, we denote the coordinates of location L by l. L; l.a represents the value of attribute A.
Definition 1 (Dominance Relationship): Given two objects a and a' exist, then object a is said to dominate a' if a < a' for all the attributes. Definition 2 (Skyline Query): Skyline query is the set of objects that cannot be dominated by any other object. Given point p, r ∈ D. If p < r, then p belongs to the skyline set.

B. Multi-Attribute Network Graph (MAG)
Graph G (V, E, W) is a multi-attribute network graph, where V denotes a set of vertices, E a set of edges, and W weight vector. In Fig. 6, nodes define profiles of activity, roles and actors etc. Edges define the relationship among those nodes or entities and weight defines the behaviour of the edges.  Our method comprises two modules: the first module delves into the business perspective and the second into the individual perspective. Fig. 7 describes the proposed framework. In each module, the user can provide the source and destination addresses while prioritising a specific destination based on his/her choice.
The business perspective and individual perspective operates in three processes or modules: processing module, query execution module and output module. Our functional algorithm, which parses the dataset and the filters, is known as the processing module. In terms of both perspectives, it works in five steps: first, it measures the geolocations of the start and end locations. Upon completion, an iteration process continues to measure all alternate routes from the source location to the destination location. Thereafter, it calculates traffic, trip cost or passengers based on the dataset. Thereafter, the process migrates into the query execution module, where a resulted dataset is generated imposing skyline queries. Through these processes, we get the dominant paths that are filtered later on the system output, which shows the result of these potential paths.
The most naïve approach to locating skyline paths in a road network is to identify all of the paths between the origin and destination in the network, calculate attributes of the paths, and perform a dominance check of all the attributes. The process of estimating traffic, cost and passenger are given in below. Fig. 8 describes the traffic condition at a specific time. Suppose we wish to assess the traffic conditions in all alternate routes from L1 to L2 at 2 pm. In this framework, car has been used as a transport mode. We get four routes from the given graph: Route 1 comprises L1, l5, l6, l2, l8, l9, l4, L2 and Route 2 comprises L1, l1, l3, l4, L2. Similarly Route 3 contains L1, l5, l6, l2, l3, l4, L2 and Route 4 contains L1, l1, l2, l8, l9,l4,L2. For assessing the traffic conditions at a specific time, Google Map Traffic API is used. For example, if someone wants to assess the traffic condition from L1 to L2 at 2 pm, then the system counts all alternate routes that he/she can take to reach the destination. Thereafter, it uses the latitude and longitude of a distance at every 0.5 km interval and checks the location at each iteration. Whenever a new location returns, we measure the traffic condition at those points by employing Google Map Traffic API. It provides the standard time and the time required to reach one"s destination, and then it stores all the data on the latter for every 0.5 km interval. In this way, we can obtain all the data on the time taken for all alternate routes. In this figure, the blue-coloured text represents the standard time (in minutes) taken to travel from one location to another. Another colour represents the time required to travel from one location to another. In this figure, three different colours are used: orange is used to represent medium traffic, green to indicate low traffic and red for heavy traffic. When the standard time is equal to the required time, the given time interval contains medium traffic. If the required time is low, it indicates the presence low traffic. Otherwise, the presence of heavy traffic is indicated.

A. Traffic Estimation
By this way, we gather data for our dataset. Tables IV, V, VI  and VII represent the required time for Route 1 Table III represents user wise cost for each route. Here, we represent how cost is changing with distance, mileage and fuel consumption. In our method, cost measures by using the following formula. Cost = (mileage/per ltr fuel cost) * Distance (1) Fig. 7. System Architecture.  Fig. 9 describes the condition of passenger at specific time; suppose, we need to assess the condition of passenger of all alternate routes from location L1 to L2. The passenger condition for specific time for each location can be assessed through passenger request. Fig. 8 shows the passenger condition. From this Fig. 8, we have found four alternate routes: Route 1 comprises L1, l5, l6, l2, l8, l9, l4, L2 and Route 2 comprises L1, l1, l3, l4, L2. Similarly Route 3 contains L1, l5, l6, l2, l3, l4, L2 and Route 4 contains L1, l1, l2, l8, l9,l4,L2. For example, we want to measure number of passengers for Route 1.At first, we count passengers of all location of Route 1, whose destination location is L2. Now get the maximum value from these locations. Suppose the value is P. We use the following formula to normalize location wise passengers.

C. Passenger Estimation
Tables VIII, IX, X and XI represent the condition of passengers for Route 1, Route 2, Route 3 and Route 4, respectively. Now, calculate the condition of passenger for each route. Route 1, Route 2, Route 3 and Route 4 has 50, 41, 40 and 51 passengers, respectively.

D. Computing Skyline
Here, we generate a dataset that measure attributes such as traffic, passenger, cost and distance.
Let"s consider a scenario. Suppose source is S and destination is D. There are ten alternate routes from S to D. We denote traffic condition as low, medium and high and define them as 1, 2, and 3 respectively. Table XII represents the route condition for a specific time.
From this dataset we need desire routes. By using BBS [18] algorithm we get our skyline routes. Fig 10 describes the BBS algorithm.
Using BBS algorithm, we get our skyline routes as R1, R2, R3 and R4. From this method, a user can easily find his/her desire route in proficient and appropriate way. If one wants a large passenger, low traffic and low cost route, then he/she can get the desired routes from the resulted routes.

VI. EXPERIMENTS
We have implemented our proposed system in .Net Framework. We have performed the experiment in a simulation environment of a PC running on windows OS having an Intel(R) Core i7, 1.73 GHz CPU and 4 GB main memory. Due to the lack of real data, we evaluate our proposed algorithm using synthetic datasets only. Fig. 11 shows the results when we consider Route 1, Route 2, Route 3 and Route 4. We observe that with the increases of distance, number of passengers varies. Fig. 12 shows that with the increase of distance, number of routes also increases.
In Fig. 13, when we consider two (2D), three (3D), four (4D), and five (5D) features. We observe that with the increases of routes, there is very slight increase in computation time. This is because during the computation process, time increases with the increase of number of routes. We can also observe that computation time gradually increases if the number of features increases.    Simultaneously, it is also observed that skyline points increase with the number of routes and number of features.  With the rapid growth of civilization, traffic is seen to increase day by day. Therefore, collecting traffic information, passenger condition and cost calculation has become a popular method. Our experimental results demonstrate that the proposed algorithm is scalable enough to compute the skyline path for a specific time. The proposed approach can easily expand for recommendation. In this work, we performed different analyses on synthetic data. In future, we aim to expand large passenger route methodology in more efficient way and find desire route based on user preference. So that we can get skyline points in more proper ways. We also want to trace the vehicle movement and position in a more efficient and effective way.