Performance Improvement of Network Coding for Heterogeneous Data Items with Scheduling Algorithms in Wireless Broadcast

This is the age of information: people increasingly communicate through digital systems while on the move. On-demand broadcasting is an efficient way to disseminate information according to user requests; in an on-demand broadcasting network, a single broadcast can satisfy multiple clients at once, helping to meet the enormous demand for information. Network coding optimizes the flow of digital data in a network by transmitting combined evidence about messages, where each coded packet is composed of two or more messages. Network coding combined with data scheduling algorithms can further improve the performance of on-demand broadcasting networks, since one coded broadcast can deliver multiple data items and thereby satisfy more clients. In this work, we show that network coding cannot always maintain its superiority over non-coding approaches when the system handles data items of different sizes. We analyze the causes of this performance reduction and propose a THETA-based dynamic threshold integration strategy through which network coding can overcome its limitation in handling heterogeneous data items. In the proposed strategy, the THETA-based dynamic threshold controls which data items are selected from the Client Relationship graph (CR-graph) so that large data items are not encoded with small data items. Simulation results show interesting performance comparisons.

Keywords—Network coding; scheduling algorithms; CR-graph; wireless broadcast; simulation; LTSF; STOBS; performance metric


I. INTRODUCTION
Nowadays almost everyone carries a portable computing device, from a laptop to a smartphone, and all of these devices share information with the network on the go. This requires an infrastructure that allows mobility and does not force a user to maintain a fixed connection to the network. Wireless networks must support mobility, distributed sensing, and city-wide internet connectivity. For broadcasting data to clients, network coding uses the limited wireless bandwidth efficiently [17] [23]. Network coding is a young field of study, first introduced in [27] [30]. Studies of its performance show that it can utilize the limited available bandwidth of a network to achieve improved throughput in multicast communication [16] [31] [32]. Network coding has been applied to on-demand broadcasting networks [14] [23] [28]. Here, the server has information about every client it broadcasts to and uses it to keep track of the data each client has received. The server then encodes data items and broadcasts them on the network. Each client receives the encoded data and uses its own cached data to decode it. Using network coding, a server can serve multiple requests at the same time [17] [22].
Network coding can improve a broadcasting network in many respects: it increases throughput, robustness, and security while decreasing deadline miss ratio, stretch, and response time [17]. However, when working on heterogeneous data items, network coding has drawbacks [9]: it does not perform as well as it does on uniformly sized data items [29]. This is caused by the encoding technique used in network coding [15]. In XOR encoding, we encode the data items found in the maximum clique of the CR-graph [4] [7] [24] [25]. The CR-graph is constructed from information about clients' requested and cached data items [4] [7]. The proposed THETA-based dynamic threshold integration strategy minimizes the drawbacks of the traditional network coding approach in the heterogeneous data item scenario: large data items and small data items are filtered and encoded separately to improve the performance of network coding.
The rest of this paper is organized as follows. Section II contains related work. Section III illustrates the system model for implementing our proposed strategy. Section IV describes the performance evaluation. Our final thoughts are included in Section V.
II. RELATED WORK

G. G. Md. Nawaz Ali, Yuxuan Meng et al. [1] performed a simulation-based analysis on top of a generalized encoding model in both homogeneous and heterogeneous environments to measure the effectiveness and adaptability of network-coding-assisted scheduling algorithms. They analyzed the performance of diverse scheduling algorithms, both without coding and with their proposed coding method, using different performance metrics. Yuxuan Meng, Edward Chan et al. [2] analyzed the effect of network coding with different scheduling algorithms. They conducted various experiments to measure broadcast performance in terms of average access time, deadline miss ratio, and average stretch. Cheng Zhan, Victor C. S. Lee et al. [3] proposed a generalized framework through which data scheduling algorithms can be incorporated with network coding for broadcasting on-demand requests. They showed that, with coding, performance can be improved under different scheduling algorithms. Jun Chen, Victor C. S. Lee et al. [4] proposed a new coding strategy named AC for implementing an efficient coding mechanism, along with two coding-assisted algorithms named ADC-1 and ADC-2 that consider both data scheduling and network coding. Their simulation results showed that response time was substantially reduced using both ADC-1 and ADC-2, and that both performed better than conventional and other coding-assisted algorithms.
Mohamed A. Sharaf and Panos K. Chrysanthis [8] proposed a new scheduling algorithm named STOBS-α for grouping requests and delivering broadcast results to clients only once. Their proposed heuristic on-demand algorithm was evaluated using access time and fairness for mobile clients.

III. NEED OF THE IMPROVEMENTS
From our studies it is noted that when data items do not differ in size, encoding poses no problem. For instance, if three data items d1, d2, and d3 of unit size 1 are encoded, the size of the encoded item d1⊕d2⊕d3 is also 1. But when data items of different sizes must be encoded, a problem arises: the encoded item's size is the size of the largest item selected for encoding. Let the size of d1 be 1 unit, d2 be 3 units, and d3 be 7 units; then the size of the encoded item d1⊕d2⊕d3 is 7 units. In traditional network coding, large data items are selected together with small data items for encoding, which in turn causes performance reduction, increased stretch, and increased response time, hampering the performance of the network [12]. Traditional scheduling algorithms [5] [12] [18] can outperform network coding in such conditions. For this reason, a new modified network coding strategy has been established to handle heterogeneous data items while maintaining improved throughput, stretch, and response time. The contributions of this paper are as follows:
1) To design a system model where the server maintains the specification of network coding.
2) To implement the proposed modified strategy which will eliminate the drawback of network coding for heterogeneous data items.
3) To simulate, integrate and analyze our proposed approach with other existing basic scheduling algorithms and compare their performances.
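The size penalty described above can be illustrated with a minimal sketch of XOR encoding over byte strings. The item names and sizes follow the d1/d2/d3 example in the text; the zero-padding scheme is our assumption, since XOR requires equal-length operands.

```python
def xor_encode(items):
    """XOR several byte strings together; shorter items are zero-padded,
    so the encoded packet is as large as the largest input item."""
    size = max(len(it) for it in items)
    out = bytearray(size)
    for it in items:
        for i, b in enumerate(it):
            out[i] ^= b
    return bytes(out)

# d1 = 1 unit, d2 = 3 units, d3 = 7 units (one "unit" = one byte here).
d1, d2, d3 = b"\x01", b"\x02" * 3, b"\x03" * 7
encoded = xor_encode([d1, d2, d3])
# The coded item inherits the size of the largest input, so the 1-unit
# item d1 effectively costs 7 units of broadcast bandwidth.
assert len(encoded) == 7
# A client caching d2 and d3 can still recover d1 by XORing them out.
assert xor_encode([encoded, d2, d3]).rstrip(b"\x00") == d1
```

The last assertion shows why coding works at all: decoding is the same XOR operation, applied by each client with its own cached items.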

A. System Architecture
The main goals of real-time scheduling and coding are to satisfy more requests before their deadlines and to ensure effective utilization of the constrained bandwidth. Our system architecture is based on the conventional on-demand broadcast setup [4] [7] [10] [14] [18] and is shown in Fig. 1. The system consists of one server and a number of end devices. Every end device has a local cache that stores some of the data items broadcast by the server [1] [13]. Because of the limited capacity of the end devices' caches, a cache replacement policy is applied. If a requested data item is not found in its cache, the end device sends a request, together with its current cache state, to the server through an uplink channel [1]. A request may ask for more than one data item from the server. After sending requests to the server, end devices listen to the broadcast channel to retrieve their requested data [1] [13]. It is assumed that an end device does not cache incoming encoded information from which it cannot decode any requested data item. If an end device receives and decodes every requested data item before the deadline, the request is satisfied; otherwise, the request misses its deadline and is of no value to the end device [4].
On receiving a request, the server inserts it into a service queue. A request waits to be scheduled in the service queue until all of its requested data items are broadcast, or until it becomes infeasible to schedule [1] [20]. When the remaining time before a request's deadline is smaller than the time required to broadcast all of its remaining unserved data items, the request is considered infeasible to schedule [1]. A request that misses its deadline is removed from the service queue and becomes infeasible [1]. The server first retrieves the requested data items stored in the local database according to a given scheduling algorithm, then encodes the data items based on information about end devices' cached and requested data items. Finally, the server broadcasts the encoded information via the downlink channel. In our model, the server and end devices use only basic XOR operations to encode and decode information [3] [7] [30]; therefore, the encoding and decoding overhead and delay can be neglected.

B. Graph Model
Our graph model is based on the graph model proposed and discussed by Zhan et al. [3]. In this approach, the CR-graph is constructed on the basis of the THETA-based threshold mechanism. The system has a data server S and n end devices E = {e1, e2, ..., en}. Set X(ei) denotes the set of data items requested by end device ei, and set Y(ei) denotes the set of data items cached by end device ei. The server has a database D containing all m data items.
Definition 1: Given E = {e1, e2, ..., en} and D = {d1, d2, ..., dm}, a graph G(V, E) can be built as follows [6]: each pending request by an end device for a data item corresponds to a vertex, and two vertices are joined by an edge when the two requests can be served by one encoded broadcast.
Consider the on-demand broadcast scenario in Fig. 2(a), which consists of a server S and five end devices e1, e2, e3, e4, and e5. Assume the server has four data items d1, d2, d3, and d4, and that, from preceding broadcasts, end device e1 has already cached d2, d3, d4; end device e2 has d1, d2, d4; end device e3 has d2, d3, d4; end device e4 has d1, d2, d4; and end device e5 has d1, d2, d3. Now assume that end device e1 requests data item d1, end devices e2 and e3 request d2, end device e4 requests d3, and end device e5 requests d4. The sizes of d1, d2, d3, and d4 are 1 unit each, and the broadcast bandwidth is B = 1, which means the server can broadcast one unit of data per time unit.
Following Definition 1, the graph corresponding to Fig. 2 is constructed as Fig. 3. In this figure, vertex V11 represents the request from end device e1 for data item d1. Since end device e1 caches d2 (requested by e2) and end device e2 caches d1 (requested by e1), there is an edge (V11, V22). Likewise, end device e3 caches d3 (requested by e4) and end device e4 caches d2 (requested by e3), so there is an edge (V32, V43). Other edges are constructed by following the same rule.
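The construction above can be sketched in a few lines. The caches and requests mirror the Fig. 2 example; connecting two requests for the same item is an assumption following common coding-graph conventions, as the text only spells out the cross-cache rule.

```python
from itertools import combinations

# Cached items per end device, as in the Fig. 2(a) example.
cache = {
    "e1": {"d2", "d3", "d4"}, "e2": {"d1", "d2", "d4"},
    "e3": {"d2", "d3", "d4"}, "e4": {"d1", "d2", "d4"},
    "e5": {"d1", "d2", "d3"},
}
# Requested item per end device.
request = {"e1": "d1", "e2": "d2", "e3": "d2", "e4": "d3", "e5": "d4"}

# One vertex per pending request: (end device, data item).
vertices = list(request.items())
edges = set()
for (c1, i1), (c2, i2) in combinations(vertices, 2):
    # Edge rule: each end device caches the item the other one requests
    # (or both request the same item, so one broadcast serves both).
    if i1 == i2 or (i2 in cache[c1] and i1 in cache[c2]):
        edges.add(frozenset([(c1, i1), (c2, i2)]))

# Reproduces the edges named in the text: (V11, V22) and (V32, V43).
assert frozenset([("e1", "d1"), ("e2", "d2")]) in edges
assert frozenset([("e3", "d2"), ("e4", "d3")]) in edges
```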

C. Proposed THETA based Dynamic Threshold Calculation Strategy
The proposed THETA based dynamic threshold integration is based on the fact that large data sized items will not be encoded with small sized data items.
We construct the undirected graph G(V, E) as described in Section III-B. Each candidate vertex Vmi (denoting data item di requested by end device em) is initialized in V(G).
We need to decide which data items should be encoded together. For this purpose we use dynamic thresholds, which give better results than a fixed threshold. The threshold calculation process is given below; here we give an example. Suppose we need to broadcast 5-unit and 50-unit data items. If a 5-unit data item is the candidate, the threshold is around 3, so 50-unit data items will not enter the same set as the 5-unit item; only data items with sizes between 2 and 8 are considered. Conversely, when a 50-unit data item is the candidate, the threshold is around 15, so 5-unit data items are excluded and the encoding range is 35 to 65. Although the two cases produce different thresholds, both avoid large stretch.
A simple alternative would be to set THETA for each candidate as the candidate request's requested data item size ÷ 3, but we instead choose THETA through a general equation. The proposed algorithmic strategy is given below.
 At first, we make pairs which contain Client_Id and Requested_Data_Id.
 Then, we choose each request as a candidate request, one by one. To calculate the dynamic THETA, we use the candidate request's requested data item size: THETA is equal to either the square of ln(candidate's item size) or the candidate's item size ÷ ln(candidate's item size).
 Using THETA, we build a set: if the difference between the candidate request's item size and another request's item size is less than THETA, that request is inserted into the set. This process is repeated for all candidate requests.
 From these sets, we find the maximum clique and then perform network coding and broadcast the encoded data.
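The filtering step above can be sketched as follows. Both candidate formulas from the text are shown; which one is finally adopted is left open, matching the "either/or" wording.

```python
import math

def theta_ln_squared(size):
    """THETA = (ln s)^2, one of the two proposed variants."""
    return math.log(size) ** 2

def theta_over_ln(size):
    """THETA = s / ln s, the other proposed variant."""
    return size / math.log(size)

def coding_candidates(candidate_size, pending_sizes, theta_fn):
    """Keep only requests whose item size differs from the candidate's
    by less than THETA, so large and small items are never coded together."""
    t = theta_fn(candidate_size)
    return [s for s in pending_sizes if abs(s - candidate_size) < t]

# For a 5-unit candidate, s / ln s is about 3.1: only sizes in (2, 8) survive.
assert coding_candidates(5, [1, 5, 7, 50], theta_over_ln) == [5, 7]
# For a 50-unit candidate, (ln s)^2 is about 15.3: only sizes in (35, 65) survive.
assert coding_candidates(50, [5, 40, 50, 60], theta_ln_squared) == [40, 50, 60]
```

Note that the numeric thresholds reproduce the worked example in the text: roughly 3 for a 5-unit candidate and roughly 15 for a 50-unit candidate.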
A clique is a subset of vertices of an undirected graph in which every two vertices are linked by an edge [4] [26]. The proposed methodology has several advantages. The THETA-based dynamic threshold integration separates small data items from large data items; if small data items are encoded with large ones, they suffer large stretch, and response time also increases.

IV. PERFORMANCE EVALUATION

A. Overview of Comparable Scheduling Algorithms
With the rapid growth of on-demand broadcasting networks, servers must serve ever larger numbers of clients every day. To balance the increasing load on servers, new and improved scheduling algorithms are in high demand. In network coding, scheduling algorithms must be incorporated according to our framework so that they keep the characteristics they have when scheduling data items in ordinary networks. Two algorithms have been implemented using the system model (III-A). For heterogeneous data items, the STOBS and LTSF scheduling algorithms perform more efficiently than other scheduling algorithms (FCFS, MRF, LMF, and others), as those algorithms are generally designed for uniformly sized data items [11].

1) STOBS (Summary Tables On-Demand Broadcast Scheduler):
In STOBS, the server maintains a queue to store client requests at the time of their arrival. The scheduler chooses for broadcast the data item with the highest (R*W)/S value [3], where R is the number of pending requests for the data item, W is the waiting time of the oldest pending request, and S is the size of the data item.

2) LTSF (Longest Total Stretch First):
LTSF chooses a data item for broadcast according to the largest total current stretch [3] [12]: the data item having the greatest total current stretch is broadcast first [18]. The current stretch of a pending request is calculated as the ratio of its waiting time to its service time [3] [9] [12] [18] [21].
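As a sketch, the two priorities can be written as follows. The parameter names R, W, and S follow the STOBS description above; the current-stretch formula follows the wait-to-service ratio given in the text, though some definitions also add the service time to the numerator.

```python
def stobs_priority(num_requests, oldest_wait, item_size):
    """STOBS broadcasts the item with the highest R*W/S value."""
    return num_requests * oldest_wait / item_size

def ltsf_priority(waits, item_size):
    """LTSF broadcasts the item with the largest total current stretch;
    here each pending request's current stretch is wait / service time,
    with service time taken as the item size at unit bandwidth."""
    return sum(w / item_size for w in waits)

# A popular item with long-waiting requests scores highly under STOBS...
assert stobs_priority(3, 10, 5) == 6.0
# ...while LTSF sums the current stretch over all pending requests.
assert ltsf_priority([5, 10], 5) == 3.0
```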

B. Performance Metrics
1) Average response time: Average response time is the ratio of the sum of all requests' response times to the total number of requests [8] [18] [21]. A low average response time means that requests are served quickly.
2) Average stretch: Minimizing the average stretch for heterogeneous data items is the main issue scheduling algorithms must consider. We compute the average stretch in our simulation model using the following equation.
Average Stretch = Total response time for all end devices / Total service time for all end devices [8] [18] [21].
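The two metrics can be sketched directly from their definitions; the per-request lists below are hypothetical inputs that a simulator would collect.

```python
def average_response_time(response_times):
    """Sum of all requests' response times over the number of requests."""
    return sum(response_times) / len(response_times)

def average_stretch(response_times, service_times):
    """Total response time over total service time for all end devices."""
    return sum(response_times) / sum(service_times)

assert average_response_time([10, 20]) == 15.0
assert average_stretch([10, 20], [5, 5]) == 3.0
```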

C. Simulation Model
We performed a detailed analysis using CSIM19 [11]. The simulation parameters used for our system architecture (III-A) are shown in Table I [7]. In our simulation model, the server maintains a cache state for every end device. First, end devices are generated with data items placed in their caches automatically. They then make their requests to the server with a given inter-arrival time (IATM). We set IATM according to the average data item size: IATM = 100 / average data item size. If IATM is low, end devices make their requests more frequently, which can overload the server.

D. Performance Analysis
The proposed THETA-based dynamic threshold integration has been implemented using the system model described in III-A. Overall performance is analyzed and compared using two metrics: average response time and average stretch. We measured the performance by varying item size and cache size with our dynamically changing THETA. Simulation results show a significant performance increase in the network coding environment with our proposed THETA-based dynamic threshold calculation strategy. The reason behind the improvement is that, with the proposed strategy, large data items are no longer selected together with small data items. Average response time for varying item size with different cache sizes is reported in Tables II and III.

V. CONCLUSION
Network coding can be widely used in the on-demand wireless broadcast environment. However, it faces encoding problems while handling heterogeneous data items: it fails to provide the best possible solutions when data items of different sizes are encoded jointly, suffering high stretch, and response time also increases when the size differences between data items are very large. Therefore, in this paper we attempted to minimize the performance reduction of network coding on heterogeneous data items. Based on the generalized model proposed and discussed above, a new approach called the THETA-based dynamic threshold strategy has been introduced for encoding. The proposed approach ensures that large data items are not encoded with small data items. The simulation results reveal interesting performance improvements for network coding. The STOBS and LTSF scheduling algorithms have been used in this paper, and the proposed THETA-based dynamic threshold approach has been integrated with both. With the proposed strategy, average stretch and average response time are substantially reduced in a network coding environment. In the future, other scheduling algorithms (FCFS, MRF, and LMF) can be integrated with the proposed strategy.