Modified Random Forest Approach for Resource Allocation in 5G Network

According to the annual Visual Networking Index (VNI) report, 4G will reach maturity by the year 2020 and an incremental approach will no longer meet demand. The only way forward is to switch to a newer generation of mobile technology, called 5G. Resource allocation is a critical problem that strongly affects 5G network operation. Timely and accurate assessment of the bandwidth left underutilized by primary users is necessary in order to use it efficiently and increase network efficiency. This paper presents a decision-making system at the fusion center using a modified Random Forest. The modified Random Forest is first trained on a database accumulated by measuring different network parameters, and can then take decisions on the allocation of resources. Because the network is dynamic, the Random Forest is retrained after a fixed time interval. We also compare its performance with the existing AND/OR decision logic at the fusion center.

Keywords: 5G; Cognitive Radio; Clustering; Fusion Centre; Random Forest


I. INTRODUCTION
The history of modern communication starts with the inception of the electrical telegraph system, which used Morse code for communication between two distant locations.
The technology had limitations: data could be sent only in the form of Morse code, a person had to be present at the other end to decode it, and sending long messages was impractical. In an independent attempt to transmit a voice signal from one location to another, Alexander Graham Bell invented, in 1876, a device which he named the "telephone". This technology revolutionized communication and remained the most popular means for the coming century. In 1948, Shannon presented the paper "A Mathematical Theory of Communication" [1]. The idea presented in the paper was well ahead of its time: he suggested the use of 0's and 1's for communication. In the early 1990s the internet became widely available as another mode of sharing and communicating information. At first it was limited to wired communication, which restricted users to a static location and therefore lacked mobility. The desire for mobility inspired researchers, and the result was the mobile revolution. Like any other technology it has evolved continuously: 1G, 2G, 3G, 4G (currently deployed) and 5G (under research) [2]-[4].
The question arises: why do we need another generation of mobile technology?
The annual Visual Networking Index (VNI) report released by Cisco [5] states that the data explosion in wireless communication will continue in the coming years. The VNI report also states that the 4G network will be unable to handle the network load with an incremental approach, and that it will reach maturity by the year 2020.
For 5G network technology to become a reality, the following requirements must be fulfilled [5]-[7]:
• Data rate
Data rate is the amount of data transferred per second per unit area. For 5G it should be 1000 times that of the current 4G network.
• Latency
4G has a latency of 15 ms. Because of future demand for services such as online gaming and virtual reality, latency in the 5G network should not exceed 1 ms.

• Energy and cost
Researchers state that cost and energy consumption will fall in the 5G network. Energy and cost are measured in joules/bit and cost/bit, and both are expected to fall by up to 100-fold in the 5G network.

• Battery
Conserving battery life is a main concern for the 5G network; battery consumption should be reduced by up to 10 times compared with the existing 4G network.
If the above requirements are fulfilled, 5G will become a reality that offers fast connectivity without constraint on all available devices. The following techniques can turn 5G into reality:

• Densification of network
Densification of the network is done through deployment of a large number of small cells; decreasing the cell size is a big challenge for researchers. The most recent development is in Japan, where cell spacing has been reduced to 1/10th of a square kilometer.
• Massive MIMO
MIMO was introduced in 2006. It exploits the spatial dimension of communication when multiple antennas are available at the base station. These techniques harness multiplexing to obtain better results. Multi-user MIMO is included in the 3GPP LTE-Advanced standard, but higher capacity is yet to be achieved. In VLM-MIMO, the number of antennas per cell is larger than the number of users, which results in many desirable features.

• Millimeter-wave (mm-wave) signals
In the search for free frequency bands, researchers found that the 30-300 GHz range is freely available. The total available spectrum in this range is 200 times greater than the currently used frequency, i.e. 3 GHz.

• Direct device-to-device (D2D)
D2D is the exchange of data between mobile devices without a base station in between. As a result it reduces the load on network devices, which otherwise have to handle hundreds or thousands of requests simultaneously. D2D has to deal with the heterogeneity of the network, and multiple protocols have been defined to support it.

• Full duplex wireless
Full duplex enables both sides to transmit data simultaneously in the same frequency band. It has numerous benefits: it increases physical layer capacity by up to a factor of two and also improves latency and security at the physical layer.

www.ijacsa.thesai.org

A. Network Architecture
In a heterogeneous network environment flooded with numerous devices, architectures and protocols, standardization of the network is very important. 3GPP is currently finalizing LTE Rel-12 (the third release of the LTE-Advanced family) [5], [7], [8]. Standardization of 5G is not expected to be finalized until Rel-14 and Rel-15. Numerous architectures and protocols have been proposed for the 5G network. Many organizations are working to advance 5G network technology in collaboration with universities and with funding from the telecom industry, such as the METIS project in Europe, IMT-2020 in China, the 5G Forum in Korea and ADWICS in Japan. One of the most acclaimed architectures for 5G divides the network on the basis of rights over bandwidth. These rights are allocated to two classes of users, named primary users and secondary users [9]-[14]. Primary users are the class of users who hold the basic rights over the bandwidth; secondary users get control of the bandwidth whenever primary users are idle.
This sets the platform for cognitive radio, in which a secondary user has to intelligently detect which bandwidth is in use and which is not. This optimizes the use of available bandwidth while minimizing interference to other secondary users and to primary users.
The main challenge is how to detect the vacant bandwidth. The solution comes from hybridizing the basic network architecture for 5G with one of the architectures proposed for wireless sensor networks.
Densification of the network is done through massive deployment of cells in the heterogeneous network environment. With advances in cell technology the cell size is shrinking, and it is currently 1/10th of a square kilometer in Japan [5]. These small cells are sometimes referred to as base stations [10], [13]-[15]; the work of a base station is to collect signals from the mobile devices active in its coverage area. Researchers have suggested many optimization techniques for collecting data from different mobile devices.
When the concept of the wireless sensor network is embedded in the 5G network environment, base stations communicate with each other and form a multi-hop network. The cost of transmission is higher than that of computation [5], [16], [17]; for this reason base stations are organized into clusters.
In the cluster environment, data is collected by a central processing center, a specialized device that serves the requests made by base stations for resource allocation.
Each cluster selects its cluster head using one of the many methods proposed by researchers [18], [19].
There are two basic schemes for clustering base stations. In the first [16], [17], all base stations are grouped into clusters, each with its own cluster head. Each cluster head transmits its signal directly to the data processing center, also known as the fusion center in the 5G network.

• Hierarchical clustering
As suggested in [16], [17], the authors assume j levels of clustering, where level 1 is the lowest level and level j is the highest. In the hierarchical environment each lower-level cluster head transmits data to the cluster head in the layer immediately above it, and so on.
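As an illustration, the hierarchical forwarding described above can be sketched in Python. This is a minimal sketch under stated assumptions: the cluster-head names, the `parent` mapping and the function `route_to_fusion_center` are hypothetical and do not come from [16], [17].

```python
from typing import Dict, List, Optional


def route_to_fusion_center(parent: Dict[str, Optional[str]], start: str) -> List[str]:
    """Return the chain of cluster heads a report passes through.

    `parent` maps each cluster head to the cluster head one level above it.
    A value of None means the head reports directly to the fusion center (FC).
    """
    path = [start]
    seen = {start}
    current = parent[start]
    while current is not None:
        if current in seen:
            # A well-formed hierarchy has no loops between levels.
            raise ValueError("cycle in cluster hierarchy")
        path.append(current)
        seen.add(current)
        current = parent[current]
    path.append("FC")
    return path


# Hypothetical hierarchy with j = 3 levels (assumption, for illustration only).
hierarchy = {"CH1a": "CH2a", "CH1b": "CH2a", "CH2a": "CH3", "CH3": None}
print(route_to_fusion_center(hierarchy, "CH1a"))
```

In the single-level scheme every entry of the mapping would be None, so each path contains only the cluster head and the fusion center.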

A. 5G and Cognitive Radio
Before wireless communication networks, wired communication remained popular up to the late 1990s. The dawn of mobile communication technology sparked a rapid evolution of mobile generations: 1G, 2G, 3G and 4G [2], [3], [26], [27]. Researchers have contributed a great deal to different aspects of the network, namely performance, architecture and cost. As a futuristic technology, 5G faces many challenges and requirements, such as spectrum allocation, speed, cost and data traffic [7], [8], [28]. These challenges can be overcome if the network has a strong architecture, and a lot of research has been done in this regard. Most researchers have suggested massive MIMO, millimeter wave, device-to-device communication, embedding cognitive radio in 5G, full duplex communication, ultra-dense networks using small cells, and interference management [5], [6]. Research continues on each of the improvements mentioned above [20]-[22].
Spectrum is the most vital resource in the network, but spectrum sensing and allocation are among the most challenging tasks for the research community, and reliability and accuracy matter a great deal. Among the many models suggested by researchers, cooperative spectrum sensing [29]-[31] is very popular. It involves two types of users, primary and secondary, which collaborate to share spectrum with the help of active and inactive phases. For sensing spectrum, an intelligent software radio, also known as cognitive radio, was suggested [9]-[13], [32]-[51]. In 2003, Friedrich K. Jondral et al. published a paper [52] on smart radio, later known as cognitive radio. Much research is available that explains, modifies and implements cognitive radio technology [9]-[13], [32]-[51].
Cognitive radio is an "intelligent communication system that is aware of its surrounding environment, and uses the methodology of understanding by building to learn from environment and adapt its internal states to statistical variation in the incoming radio frequency stimuli by making certain changes in operating parameter in real time, with two primary objectives in mind: highly reliable communication and whenever needed efficient utilization of radio spectrum" (Haykin). A tool is needed to implement cognitive radio, and machine learning [31] is a leading candidate because it learns from past data, derives a knowledge base from it and is able to take decisions without any manual help. In 2013, the paper "Machine Learning Techniques for Cooperative Spectrum Sensing in Cognitive Radio Networks" [31] discussed the major machine learning algorithms and how they can be used for spectrum sensing in cognitive radio networks. After its publication the research community showed interest in using machine learning algorithms for spectrum sensing. Machine learning can also be applied in many other areas of the network, where much more can be achieved with it.

B. Random Forest
In 2001, the statistician Leo Breiman identified problems in the existing machine learning techniques [53]. In earlier tree-based approaches, the data set is not evenly distributed, which leads to imbalanced data. Classification performs poorly on imbalanced data sets, which leads to misclassification and errors in the training phase. Breiman suggested a new machine learning algorithm to improve the classification of diverse data. It used his own "bagging" idea [23] together with the random selection of Ho and of Amit and Geman [54] to construct a number of decision trees with controlled variance. He suggested that the data set be collected and then divided into two or more subsets, where one or more subsets are used for learning and the remainder is used for testing. Work [25] done under SPRINT uses 128 processes and speeds up performance by up to 50 times over the serial code. The basic problem with this approach is that it considers only time complexity and ignores space complexity, and for this reason a pruning approach is required. Two approaches are followed, static and dynamic [25]. The static approach follows the overproduce-and-choose strategy [25], [55]: decision trees are first overproduced in the Random Forest up to a predecided number, and then in the choose phase the best decision trees are selected. Dynamic pruning has no overproduce phase [25] and therefore saves time compared with the static approach, but it is hard to implement, and for this reason researchers have shown little interest in it. Research on static pruning falls into three major categories. In one pioneering work [25] on static pruning, a genetic algorithm is used to select the most optimal candidates from the pool of decision trees. Other work eliminates similar decision trees [25], [55]: if the output class and accuracy of several trees are the same, a single copy is kept and the others are eliminated. Dynamic pruning requires the help of statistics and probability, along with nature-inspired algorithms, to get better results. In one approach [25] the authors tried to model dynamic pruning with the help of an eighth-degree mathematical equation.
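The elimination strategy of static pruning described above, in which trees with the same output classes and accuracy are reduced to a single copy, can be sketched as follows. This is only an illustrative sketch: the `TreeSummary` record and the function `eliminate_similar` are hypothetical names, and each tree is assumed to be summarized by its accuracy and its predictions on a common validation set.

```python
from typing import List, NamedTuple, Tuple


class TreeSummary(NamedTuple):
    name: str
    accuracy: float
    outputs: Tuple[int, ...]  # predicted class for each validation row


def eliminate_similar(trees: List[TreeSummary]) -> List[TreeSummary]:
    """Keep one copy of every group of trees with equal accuracy and outputs."""
    kept: List[TreeSummary] = []
    seen = set()
    for tree in trees:
        key = (tree.accuracy, tree.outputs)
        if key not in seen:
            # First tree with this behaviour: keep it, later copies are dropped.
            seen.add(key)
            kept.append(tree)
    return kept


# Overproduced pool with hypothetical values.
pool = [
    TreeSummary("T1", 0.9, (1, 2, 1)),
    TreeSummary("T2", 0.9, (1, 2, 1)),  # same behaviour as T1
    TreeSummary("T3", 0.9, (1, 1, 1)),
    TreeSummary("T4", 0.8, (1, 2, 1)),
]
print([t.name for t in eliminate_similar(pool)])
```

The pool is first overproduced and then reduced, which matches the overproduce-and-choose pattern; the choice to keep the first copy encountered is an assumption of this sketch.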

C. 5G in Machine Learning
Machine learning algorithms perform either classification or regression. They accumulate knowledge from past data and use it to take decisions. There has long been a demand to apply machine learning to networking and to use its strengths to increase the reliability and accuracy of 5G. There are various places in the network where machine learning can be used. For clustering mobile nodes, various clustering algorithms can be used, and researchers have pointed out several that suit the network scenario. Hierarchical and weighted clustering algorithms are generally suggested [18], [19], [56], [57], and some researchers suggest modified versions of both to achieve better output. Another place where machine learning can be used is spectrum sensing. In [31] the authors suggest many such algorithms for sensing vacant spectrum. A further extension of spectrum sensing is spectrum allocation; machine learning algorithms are capable of handling both tasks without manual help, i.e. as a totally autonomous system. Some papers also fuse probability and machine learning for spectrum sensing. Nature-inspired algorithms can also be used in 5G. One author has already used a genetic algorithm [58], [59]: he matched genetic parameters with network parameters and used them to predict variation in the network parameters.

A. Assumptions
The 5G network is very dynamic in nature, so collecting data from mobile nodes and implementing a decision-making mechanism at the fusion center (FC) raise certain issues. The proposed method makes the following assumptions:
• There is no link failure between the cluster head (CH) and the FC.
• Data is transmitted fairly between the FC and the CH without any delay.
• No assumption is made about the working environment of the FC.

B. Research Gap
5G is a futuristic technology, and numerous issues have to be overcome before it becomes a reality. One of the most challenging issues, still untouched by researchers, is taking the most optimal decision at the fusion center (FC). Spectrum allocation is a crucial issue in the network because bandwidth is limited whereas the number of users is always larger and is increasing exponentially. The FC is solely responsible for bandwidth allocation, and a single mistake in spectrum allocation is costly for the network. The following consequences may arise from wrong allocation of spectrum by the fusion center:

1) Wastage of bandwidth.
2) It may lead to monopolization of spectrum.
3) It may cause deadlock in the network.

A. Proposed Method
Mobile nodes in the 5G network are of two types, primary users and secondary users. Every mobile node sends its information to the cluster head (CH). This information consists of attributes and their values for the particular mobile node. The attribute values describe the status of the mobile node in the network: the battery life of the mobile node, its distance from the CH, whether it is a primary or a secondary user, the signal-to-noise ratio, whether a primary user is currently using its bandwidth, and many other attributes [16]. Each cluster head sends this information to the fusion center, a central hub equipped with all essential hardware and software components. The architecture of the fusion center must consist of the following components:
• Display unit for monitoring the output of the processing unit
• Input unit for feeding algorithms and different variables to the processing system

The formula in step 9 uses the following definitions. Let $A_i$ be the accuracy of the ith cluster's centroid, $M$ the total number of clusters, $W_i$ the weight of the ith cluster, $N$ the total number of decision trees in the Random Forest, and $N_i$ the number of decision trees selected from the ith cluster.
a. Compute the sum of all cluster centroids: $S = \sum_{i=1}^{M} A_i$.
b. Compute the weight $W_i$ of the ith cluster. Repeat this step M times.
c. Compute the number $N_i$ of decision trees selected from the ith cluster. Repeat this step M times. If the value is fractional, take its floor; if the cluster contains fewer decision trees than this value, select all decision trees from the cluster.
10. Use the voting method for class selection, i.e. selection of the cluster for spectrum allocation (this is the mobile node cluster, not to be confused with the decision tree cluster).
11. Allocate spectrum to the most voted cluster.
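The selection and voting steps can be illustrated with a short Python sketch. The exact formulas for the cluster weight and for the number of selected trees are not recoverable from the text, so this sketch assumes that the weight of a cluster is its centroid divided by the sum S, and that the number of trees taken from a cluster is the floor of weight times N. Both rules, and the function names `trees_per_cluster` and `majority_vote`, are assumptions made for illustration only.

```python
import math
from collections import Counter
from typing import List, Sequence


def trees_per_cluster(centroids: Sequence[float],
                      cluster_sizes: Sequence[int],
                      n_trees: int) -> List[int]:
    """Number of decision trees to select from each cluster of trees."""
    s = sum(centroids)  # step a: sum of all cluster centroids
    counts = []
    for centroid, size in zip(centroids, cluster_sizes):
        weight = centroid / s                  # step b (assumed rule)
        wanted = math.floor(weight * n_trees)  # step c (assumed rule), floor of fractional value
        counts.append(min(wanted, size))       # small cluster: take all of its trees
    return counts


def majority_vote(votes: Sequence[int]) -> int:
    """Steps 10 and 11: the mobile-node cluster with the most votes gets the spectrum."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical values: three clusters of trees, forest of N = 20 trees.
print(trees_per_cluster([0.9, 0.8, 0.7], [10, 4, 6], 20))
print(majority_vote([2, 1, 2, 3, 2]))
```

Under these assumptions the pruned forest is smaller than the original one whenever a cluster holds fewer trees than its share or the floor discards a fraction, which is where the saving in classification time would come from.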
As far as our literature survey shows, this approach has not been implemented by any researcher for designing a decision-making system at the fusion center. We found no existing research on smart decision making at the fusion center; currently AND/OR logic is used to take decisions there. In the next section we show experimental results comparing our approach with the traditional AND/OR logic approach. After reviewing many research papers, Random Forest appears to be the most suitable choice for deployment in the 5G network [61]-[64], because of its accuracy and its potential as a classifier for future generations. Since 5G is also a futuristic technology, it requires a strong and more accurate classifier. The main concern of Random Forest researchers is to enhance its performance so that it can be used under network conditions. This paper contributes by using a clustering approach to decrease execution time and reduce the time needed for classification. In the next section we also compare the proposed algorithm with the traditional Breiman Random Forest model [53] and check which algorithm requires less time. To give a clear understanding of the concepts in the proposed work, a diagram explaining each step is given below.
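For reference, the AND/OR decision logic used as the baseline can be sketched as below. The sketch assumes the common hard-decision form in which each cluster head sends a one-bit report (True when it senses the band as occupied); the function names are hypothetical and the paper does not specify the exact baseline implementation.

```python
from typing import Sequence


def or_rule(reports: Sequence[bool]) -> bool:
    """Band is declared occupied if at least one cluster head reports it occupied."""
    return any(reports)


def and_rule(reports: Sequence[bool]) -> bool:
    """Band is declared occupied only if every cluster head reports it occupied."""
    return all(reports)


# Hypothetical one-bit reports from three cluster heads.
reports = [True, False, False]
print(or_rule(reports), and_rule(reports))
```

The OR rule protects primary users at the cost of leaving bandwidth unused, while the AND rule frees more bandwidth at the risk of interference; neither uses any of the node attributes that the Random Forest learns from.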
V. RESULT ANALYSIS
5G is not implemented yet, so for the result analysis we had to collect data from different sources. Many research organizations are currently working on this futuristic technology; we thank Nokia Telecommunication, Samsung research laboratories and the METIS project for their support in our experimental work [16], [17], [28]. Thanks to them we were able to gather all key features and their values. To assemble these data we took the help of researchers working in this field, whose work [14], [30], [59], [65]-[68] guided us in assembling these values.
For our experiment, we created data sets representing a small fusion center database, in which each row holds the information of a single mobile node and each column holds an attribute value. The experiment is conducted on a static-database data set with 153 rows and 7 columns. The attributes are:
• battery life (b_life): the remaining battery life of the mobile node;
• distance from cluster head (dist_CH): the distance of the mobile node from its cluster head;
• type of mobile node (mob_node): whether the node is a primary or a secondary user, the two types of user distinguished by spectrum rights;
• vacant bandwidth (vac_band): whether the primary or secondary user is utilizing the bandwidth allocated to it, an important feature for the primary user;
• signal strength (sig_strenth): the signal strength, measured in dBm;
• cluster identity (c_id): the cluster to which the mobile node belongs; many mobile nodes can have the same cluster id;
• output (o_put): the cluster to which spectrum is allocated under the given conditions.
The static database is used for training multiple decision trees. A portion of the data set is shown below.
From the above table it is easy to deduce that if a primary user (represented by 1) or a secondary user (represented by 0) is using its allocated bandwidth, the bandwidth cannot be allocated to another user or cluster; this is represented by a 1 in the vac_band column. In this case the c_id and o_put values are the same for that row. Such rows can be dropped because they cannot be used for training the classifier (Random Forest), which reduces the table size. Result analysis is not possible without an implementation of the proposed algorithm, and we also need a tool for comparing the existing approach with the proposed approach. For this purpose, R seems to be the best choice because of its plug-and-play libraries and ease of use. All hardware and software parameters were kept the same when executing the algorithms. Both algorithms were run on a Xeon processor with a clock speed of 3.45 GHz. All results shown in the two sub-sections below were obtained with R 3.2.3, the latest version of R at the time of the experiments.
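The row-dropping rule described above can be sketched in a few lines. The paper's experiments were run in R; Python is used here only for illustration. The column names follow the attribute list, while the sample values and the function name `drop_occupied_rows` are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, float]


def drop_occupied_rows(rows: List[Row]) -> List[Row]:
    """Remove rows whose bandwidth is in use (vac_band == 1).

    For such rows o_put equals c_id, so they carry no information
    for training the classifier.
    """
    return [row for row in rows if row["vac_band"] != 1]


# Hypothetical rows, not taken from the paper's data set.
sample = [
    {"b_life": 80, "dist_CH": 120, "mob_node": 1, "vac_band": 1,
     "sig_strenth": -70, "c_id": 1, "o_put": 1},
    {"b_life": 55, "dist_CH": 300, "mob_node": 1, "vac_band": 0,
     "sig_strenth": -85, "c_id": 2, "o_put": 3},
    {"b_life": 30, "dist_CH": 90, "mob_node": 0, "vac_band": 1,
     "sig_strenth": -60, "c_id": 3, "o_put": 3},
]
training_rows = drop_occupied_rows(sample)
print(len(training_rows))
```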

A. Modified Random Forest Algorithm
The purpose of the proposed algorithm is to enhance performance (reduce the execution time of the Random Forest algorithm) without losing accuracy. The experiment is conducted on three data sets: the modified Random Forest and Breiman's Random Forest are both evaluated on them for accuracy and performance. All three data sets are shown in the appendix. Each data set is partitioned into three parts: a training set, a validation set and a testing set. The same partitioning is applied for both algorithms. The other parameter for comparison is time complexity; the execution time of both algorithms is measured in milliseconds using the R editor. The results show that the proposed modified Random Forest needs less time on every data set used in the experiment. Hence, in terms of performance, the proposed algorithm is more effective than Breiman's existing Random Forest. The results can be analyzed with the help of the two graphs given below:

B. Modified Random Forest Algorithm at the Fusion Centre
The new algorithm is used to increase the accuracy of spectrum allocation so that spectrum is utilized wisely. A machine learning technique is used for this purpose because it takes decisions based on accumulated prior knowledge. Random Forest is regarded here as the most suitable machine learning technique when accuracy under network load is the criterion. Up to now this statement was only a hypothesis; the results are very promising. Comparing the modified Random Forest algorithm with the existing AND/OR logic approach to decision making at the fusion center, we see a significant improvement in the accuracy of spectrum allocation; the time taken to allocate spectrum to the cluster head also decreases, so the algorithm performs faster than AND/OR logic. The experimental results can be analyzed from the graph shown below; all three data sets are used to generate the experimental output.

• A receiver antenna for collecting all information from the different cluster heads
• Processing unit, either a single unit or multiple units for parallel processing
• Database storage
• Transmitter unit for broadcasting the output of the processing unit to the cluster heads

Fig. 3. Working of Random Forest Algorithm

B. Working of Random Forest Algorithm
All manipulation of the database is done at the processing unit. The processing unit fetches data sets from the database, of which there are two types, static and dynamic. The static database is the source for training the Random Forest. It contains tables of rows and columns, where each row holds past recorded information about a mobile node and each column holds an attribute value, along with the decision attribute value. The decision value indicates which cluster received the spectrum allocation under the given attribute values; this helps the Random Forest algorithm in the learning process. The dynamic database is the information continuously arriving from the cluster heads, which is treated as requests for spectrum allocation. After a decision is taken for a request, the corresponding table is stored in the static database for further learning. Because clusters and mobile nodes continuously attach to and detach from the network, the Random Forest must be retrained after a fixed time interval so that it can take the most optimal and accurate decisions. In future work we will also consider the problem of retraining the Random Forest in real time with the dynamic database. The current work is based on decision making at the fusion center using the static database. It reduces the time complexity of the Random Forest through a static pruning methodology that applies clustering (K-means) [60] to the decision trees, as illustrated in the following steps:
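The interplay of the static database, the incoming requests and the fixed retraining interval can be sketched as follows. This is a simplified sketch under stated assumptions: the class `FusionCenter`, the time handling and the toy trainer `train_majority` (which stands in for Random Forest training) are all hypothetical and are not the paper's implementation.

```python
from collections import Counter
from typing import Callable, Dict, List

Row = Dict[str, int]
Model = Callable[[Row], int]


def train_majority(rows: List[Row]) -> Model:
    """Toy stand-in for training: always predicts the most common o_put."""
    most_common = Counter(r["o_put"] for r in rows).most_common(1)[0][0]
    return lambda row: most_common


class FusionCenter:
    def __init__(self, static_rows: List[Row],
                 train: Callable[[List[Row]], Model], interval_s: float):
        self.static_rows = list(static_rows)  # static database
        self.train = train
        self.interval_s = interval_s          # fixed retraining interval
        self.model = train(self.static_rows)
        self.last_trained = 0.0
        self.trainings = 1

    def handle_request(self, row: Row, now: float) -> int:
        """Decide on one request from the dynamic database."""
        if now - self.last_trained >= self.interval_s:
            # Interval elapsed: retrain on everything stored so far.
            self.model = self.train(self.static_rows)
            self.last_trained = now
            self.trainings += 1
        decision = self.model(row)
        # Store the decided request in the static database for later learning.
        self.static_rows.append({**row, "o_put": decision})
        return decision


fc = FusionCenter([{"o_put": 1}, {"o_put": 2}, {"o_put": 2}],
                  train_majority, interval_s=60.0)
first = fc.handle_request({"c_id": 3}, now=10.0)
second = fc.handle_request({"c_id": 1}, now=70.0)
print(first, second, fc.trainings)
```

Passing the trainer as a parameter keeps the scheduling logic independent of the classifier, so the same loop would apply to either Breiman's forest or the pruned one.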

3. Build a decision tree on each of the sub-samples, applying random sub-sampling of attributes.
4. Measure the accuracy of each decision tree.
5. Name each decision tree.
6. Store the accuracy of each decision tree along with its name.
7. Apply the K-means clustering algorithm to cluster the decision trees on the basis of accuracy.
8. Calculate the centroid of each cluster.
9. Apply the following formula: let 'S' be the sum of all cluster centroids.
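Steps 7 and 8 cluster the decision trees by a single number, their accuracy, so a one-dimensional K-means is sufficient. The sketch below is a minimal pure-Python version with a deterministic initialization; the function name `kmeans_1d`, the initialization rule and the sample accuracies are assumptions made for illustration, not details from [60].

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int,
              iterations: int = 100) -> Tuple[List[float], List[int]]:
    """Cluster scalar values; return (centroids, cluster label per value)."""
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: k values spread evenly over the sorted accuracies.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def nearest(v: float) -> int:
        return min(range(k), key=lambda j: abs(v - centroids[j]))

    for _ in range(iterations):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        # Centroid = mean accuracy of the trees in the cluster (step 8).
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(v) for v in values]


# Hypothetical accuracies of seven decision trees.
accuracies = [0.60, 0.62, 0.61, 0.80, 0.82, 0.95, 0.93]
centroids, labels = kmeans_1d(accuracies, k=3)
print(labels)
```

The centroids returned here are the cluster accuracies whose sum S is formed in step 9.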

Fig. 4. Bar graph of the data set split for Random Forest

Fig. 5. Accuracy comparison of the actual method and the suggested method

Fig. 6. Performance comparison of Breiman's original Random Forest and the modified Random Forest

Many researchers were attracted to the Random Forest approach to handling data sets and started working on different aspects of it, such as features, concepts, analysis and modification of the proposed Random Forest model. Ongoing research in the field of Random Forest can be broadly classified into three categories:

TABLE I. DATASET COLLECTED FROM MOBILE NODE AND CLUSTER

TABLE II
The bar graph illustrates that in data set 1, 63% of the rows are allocated for training the Random Forest, 20% for validation and the remaining 17% for testing. Similarly, in data set 2, 65%, 22% and 13% are allocated for training, validation and testing respectively. Data set 3 is split in the same way, with 60%, 21% and 19% for training, validation and testing.
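The three-way split can be reproduced with a short sketch. The percentages are those reported for data set 1; the row count of 153 is the size given for the static database and is assumed here to apply to data set 1. The function names, the seeded shuffle and the flooring of fractional sizes are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_sizes(n_rows: int, train_pct: int, valid_pct: int) -> Tuple[int, int, int]:
    """Row counts for training, validation and testing; testing takes the remainder."""
    n_train = n_rows * train_pct // 100
    n_valid = n_rows * valid_pct // 100
    return n_train, n_valid, n_rows - n_train - n_valid


def partition(rows: Sequence[T], train_pct: int, valid_pct: int,
              seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle reproducibly, then cut into the three parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train, n_valid, _ = split_sizes(len(shuffled), train_pct, valid_pct)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


# Data set 1: 63% training, 20% validation, remainder for testing.
print(split_sizes(153, 63, 20))
```

Using the same seed for both algorithms gives them identical partitions, in line with the requirement that the same partitioning is applied to both.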
Fig. 7. Comparison of accuracy in decision making at the fusion centre

VI. CONCLUSION
This paper presents a new algorithm, the Modified Random Forest Algorithm, for 5G. With this algorithm we observe an improvement in resource allocation performance in the 5G network. In future, the work can be extended to the full 5G network rather than only resource allocation.