Empirical Assessment of Ensemble based Approaches to Classify Imbalanced Data in Binary Classification

Classifying imbalanced data with traditional classifiers remains a major challenge. Imbalanced data is a situation in which the classes do not contain the same proportion of data. Many real-life problems have this property, e.g. web spam detection, credit card fraud, and fraudulent telephone calls. The problem arises whenever the objective is to identify exceptional cases. Researchers handle it either by modifying existing classification methods or by developing new ones. This paper reviews ensemble-based approaches (Boosting and Bagging based) designed to address class imbalance, focusing on binary classification. We compare 6 Boosting-based, 7 Bagging-based and 2 hybrid ensembles for their performance in the imbalanced domain. We use the KEEL tool to evaluate these methods on seven imbalanced data-sets with class imbalance ratios from 1.82 to as high as 129.44. The area under the curve (AUC) is recorded as the performance metric. We also analyze the methods statistically using the Friedman rank test and the Wilcoxon matched-pair signed-rank test to strengthen the visual interpretations. The analysis indicates that the RusBoost ensemble outperformed every other ensemble on the imbalanced data-sets.
Keywords: Ensemble approaches; boosting; bagging; hybrid ensembles; imbalanced data-sets; classification


I. INTRODUCTION
Classification is central to solving many real-world problems, and many types of classifiers have been proposed for it. These classifiers give satisfactory results only when the problem is represented by a balanced data-set (the classes are of equal size). In some circumstances, however, classification is needed when the data-set is not balanced (the classes are not of equal size), e.g. web spam detection, credit card fraud, and fraudulent telephone calls. If classification methods designed for balanced data-sets are applied in such cases, the results will not be accurate. The major problem with an imbalanced data-set is that the data points belonging to the majority (bigger) class shape the classifier's decision boundaries at the cost of the minority (smaller) class, which is represented by very few points compared to the majority class. The research community calls this the class imbalance problem. The extent of imbalance is measured with the class imbalance ratio (CIR), the ratio of the size of the majority class to the size of the minority class. For example, a data-set with 900 majority points and 100 minority points has a CIR of 9. For a given majority class size, CIR is inversely related to the size of the minority class, and data with a high CIR is considered highly imbalanced.
The research community has developed various solutions to this problem. They fall into three major categories: data-level approaches, algorithm-level approaches, and approaches that combine the two (hybrid). In data-level approaches, the data is pre-processed to balance the data-set before classification. The biggest benefit of this category is that existing classification methods developed for balanced data-sets can be used unchanged. Researchers have applied different strategies for balancing the data. Some methods balance the data by adding data points to the minority class, either by randomly copying existing data or by applying an intelligent process to generate synthetic data [1][2][3][4][5][6][7][8][9][10][11]. These are called oversampling methods. Random oversampling by copying existing data may lead to overfitting, and on noisy data-sets it may increase the noise within the data-set [12,13]. Other methods balance the data by removing data points from the majority class before classification, either randomly or by using an intelligent criterion [13][14][15][16][17][18][19]. These are called undersampling methods. The biggest limitation of random undersampling is the loss of potentially important information. A further sub-type of data-level methods combines oversampling and undersampling to balance the data-set before classification; these are known as hybrid data-level methods [20,21].
In the algorithm-level category, researchers either modify existing classification methods to counter the classifier's bias towards the bigger class or develop new methods to handle imbalanced data [22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38]. The third category, hybrid methods, combines data-level and algorithm-level methods to boost classifier performance on imbalanced data [39][40][41][42][43][44][45][46][47][48][49][50][51][52][53]. Many researchers combine data-level and algorithm-level methods through the ensemble concept to improve on earlier classification methods that relied on a single classifier. An ensemble uses multiple classifiers to obtain better predictive results than a single classifier. In this paper, we review ensemble-based classification techniques that use the Bagging and Boosting concepts to handle imbalanced data-sets. We empirically assess the methods using the KEEL tool [54,55] and statistically analyze the results using the Friedman [56] and Wilcoxon matched-pair signed-rank [57] tests.
Section II explains the idea of ensembles and reviews the Boosting-based and Bagging-based ensembles designed to resolve the class imbalance issue. It also describes the performance criteria used in this paper to assess the methods. The empirical evaluation of the ensemble approaches and their statistical analysis are discussed in Section III, followed by conclusions in Section IV.
www.ijacsa.thesai.org

A. Fundamentals of Ensemble Approach
Ensemble approaches train more than one classifier to solve the same problem. The approach is also called committee-based learning or learning through multiple classifier systems. Fig. 1 describes the model of the ensemble approach. The field of ensemble approaches grew out of three threads: combining classifiers, ensembles of weak classifiers, and combination of experts [58]. Combining classifiers was mostly studied in the pattern recognition area, where researchers work with strong classifiers and try to design powerful combining rules to obtain even stronger combined classifiers. Ensembles of weak classifiers were mostly studied by the machine learning community, where researchers work with weak classifiers and design powerful procedures that boost their performance from weak to strong. This thread produced very well-known ensemble methods such as AdaBoost [59] and Bagging. Combination of experts was studied by the neural network community, where researchers usually consider a divide-and-conquer scheme to learn a combination of parametric prototypes jointly.
Ensemble methods have been a popular learning paradigm [58] since the 1990s, owing to two pioneering works. The first, an empirical study [60], found that the predictions made by a set of learners are often more accurate than those of the single best classifier, as displayed in Fig. 2. The second, a theoretical result proven by Schapire [61], showed that weak base learners can be boosted to strong learners. Because many real-world problems require strong classifiers and cannot be solved with weak ones, this result motivated researchers to generate strong classifiers using ensemble methods. Ensemble methods use multiple classification procedures to attain better predictive results. Under this approach, several classifiers are trained either in parallel or sequentially to solve the same problem. An ensemble is created in two steps: selecting the base classifiers and then combining them. The performance of an ensemble method depends on two factors: the accuracy of the individual learners and the diversity among the classifiers. The ensemble's accuracy is directly related to the choice of base classifier. It is widely accepted [62] that the ensemble can improve overall predictive accuracy only if there is diversity among its components, i.e. if the individual classifiers do not always agree. Diversity is the degree to which classifiers make different decisions on the same problem. Diversity can be introduced in various ways: by manipulating the training patterns (cross-validation, bagging, boosting), by manipulating the input features (considering a subset of features for classifier learning), and by incorporating random noise. Most research in the literature concerns homogeneous ensembles rather than heterogeneous ensembles, in which different types of classifiers are combined. However, heterogeneous ensembles can produce more diversified results than homogeneous ensembles [66]. Building a single high-performing classifier can be computationally more expensive than building an ensemble, because a single classifier requires designing several versions and tuning parameters for model selection, whereas combining different classifiers adds very little computational cost. The ensemble approaches reviewed in this paper are shown in Fig. 3.
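The benefit of combining classifiers that do not always agree can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the base classifiers err independently and share the same accuracy p; real ensembles rarely satisfy this, and the function name and the numbers are hypothetical rather than taken from the study.

```python
from math import comb

def majority_vote_accuracy(p, n_classifiers):
    """Probability that a majority of n independent classifiers, each correct
    with probability p, votes for the right class (n must be odd)."""
    if n_classifiers % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    needed = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_classifiers, c) * p ** c * (1 - p) ** (n_classifiers - c)
        for c in range(needed, n_classifiers + 1)
    )

# Illustrative accuracy of 0.7 for every base classifier (an assumption).
for n in (1, 3, 11):
    print(n, round(majority_vote_accuracy(0.7, n), 3))
```

Under these assumptions, three classifiers with 70% accuracy reach 78.4% by majority vote. If the classifiers always agreed, the vote would stay at 70%, which is why diversity matters.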

B. Ensembles based upon Boosting Concept
Ensembles are categorized into two models, Boosting and Bagging, according to how the base classifiers are combined. Boosting converts weak classifiers into a strong classifier by generating the base classifiers sequentially, so it belongs to the sequential ensemble paradigm [58]. In a boosting process, a first model is built from the initial training data, and then another model is created whose purpose is to correct the errors of the previous model. This process is repeated until the training data is predicted perfectly or a maximum number of models has been generated. The boosting-based ensembles reviewed in the current study are described below.
1) Adaptive boosting method (AdaBoost): AdaBoost [59], proposed by Freund and Schapire in 1996, was the first successful boosting algorithm for binary classification. AdaBoost uses the complete data-set to train every classifier serially. After every iteration, the method assigns more weight to the misclassified data points so that the next iteration concentrates on classifying them correctly. Its main objective is therefore to emphasize the data points that are hard to classify. The weight given to a misclassified data point after an iteration is directly related to how hard that point is to classify. Initially, all data points receive equal weight. After every iteration, the weights of misclassified data points are increased and the weights of correctly classified data points are decreased. Finally, when an unknown data point is submitted, every classifier votes on it and the data point is assigned to the class that wins the weighted majority vote. The method is called adaptive because each new classifier adapts to the errors of the previous ones over multiple repetitions, building a strong classifier. The drawback of AdaBoost is that it treats both classes equally and is designed for classes of equal size (balanced data-sets). In imbalanced scenarios, its results tend to favor the majority class. To counter this bias, many researchers have modified the equal-weight scheme of AdaBoost so that it detects the minority class accurately. Fig. 4 shows the procedure of AdaBoost.
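The re-weighting step described above can be sketched as follows. This is a minimal illustration of one round of the standard discrete AdaBoost update for labels in {-1, +1}, not the KEEL implementation used in the study; the function name and the toy labels are hypothetical.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One round of discrete AdaBoost re-weighting for labels in {-1, +1}.

    Returns the normalized weights for the next round and the vote weight
    (alpha) of the classifier that produced y_pred."""
    # Weighted error of the current weak classifier.
    err = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # y * p is +1 for a correct prediction and -1 for a mistake, so mistakes
    # are multiplied by exp(+alpha) and correct points by exp(-alpha).
    updated = [w * math.exp(-alpha * y * p)
               for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return [w / total for w in updated], alpha

# Hypothetical toy round: five points with equal weight, the third is misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, 1, -1, 1]
new_w, alpha = adaboost_reweight([0.2] * 5, y_true, y_pred)
print([round(w, 3) for w in new_w], round(alpha, 3))
```

In this toy round the single misclassified point ends up holding half of the total weight, so the next weak classifier is pushed to get it right.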
2) SmoteBoost: N. V. Chawla in 2002 proposed SmoteBoost [3] by modifying AdaBoost to address the class imbalance problem. SmoteBoost combines the oversampling method SMOTE with the standard boosting process. It generates synthetic data inside the minority class using SMOTE during every iteration of AdaBoost. The weights assigned to the synthetically generated data remain constant and depend on the total number of data points in the new data-set, whereas the weights assigned to the original data points are normalized so that they form a distribution together with the newly generated data points. After the classifier is trained, the weights assigned to the original data points are updated. Synthetic data is then generated again in the next phase, and the weights are modified to match the weight distribution. This process repeats until the required predictions are obtained or the maximum number of classifiers has been built. One limitation of the method is that it balances the data by generating synthetic data points, so it is computationally more expensive than methods based on undersampling. Another limitation is that on noisy data-sets SmoteBoost may increase the number of noisy data points, because noise points can be randomly selected as candidates for producing synthetic data points [12,13].
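The synthetic-data step can be sketched as below. This is a simplified illustration of the usual SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours), not the SmoteBoost procedure itself; the function names, the neighbour count k and the toy points are assumptions made for the example.

```python
import random

def smote_point(x, neighbor, rng):
    """Return a synthetic point on the segment between x and a minority neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]

def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points from a list of minority feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbor = rng.choice(others[:k])
        synthetic.append(smote_point(x, neighbor, rng))
    return synthetic

# Hypothetical minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote(minority, n_synthetic=10, k=2)
print(len(synthetic))  # 10
```

Because every synthetic point lies between two existing minority points, a noisy minority point can seed further noisy points, which is the limitation noted above.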
3) DataBoost-IM: Guo and Viktor proposed another boosting-based method in 2004, DataBoost-IM [49], which combines boosting with data generation to improve the predictive capability of classifiers on binary imbalanced data-sets. Unlike SmoteBoost, it first identifies and separates the hard-to-predict data points from both the minority and the majority class and uses them to produce synthetic data points. It also uses bias information about the hard-to-predict data points to produce the synthetic data on which the classifier in the next iteration needs to focus. The weights assigned to the two classes in the new training set are then re-balanced so that the boosting procedure can focus on both classes. The method thus aims to improve prediction for both the majority and the minority class. Its principal drawback [63] is that it cannot handle very highly imbalanced situations, because it creates a large number of data points that are difficult for the base classifier to manage.
4) MSmoteBoost: MSmoteBoost [41] incorporates the MSmote data-level method in every iteration of AdaBoost. Unlike Smote, MSmoteBoost removes noisy data points and considers the distribution of the minority class. Minority class data is divided into three groups: border, security and latent noise points. The data points are categorized by their distance from other data points before synthetic points are generated. Security data points are those that can strengthen the classifier's performance, whereas noise points can reduce it. Hard-to-predict data points form the border category. The method treats each category differently when producing synthetic data points. The weights assigned to the new data points are based on the total number of points in the new data-set, so they remain constant, whereas the weights of the original data points are normalized so that they form a distribution with the newly generated data points. The weights of the original data points are updated after the classifier is trained.
The process repeats until the strong classifier is built. Because this method also uses oversampling, its computational cost is high compared to ensembles based on undersampling.

5) Random undersampling boosting (RusBoost):
RusBoost [39], another boosting-based ensemble, was proposed in 2010. It differs from SmoteBoost in that it incorporates the random undersampling (Rus) data-level method in every iteration of AdaBoost, with the aim of providing a simple classifier that runs faster than one using an oversampling approach. RusBoost removes data points randomly from the majority class in every iteration of AdaBoost. It does not assign new weights to the data points; it is sufficient to normalize the weights of the remaining data points in the new data-set by their total sum. After the classifier is trained, the process updates the weights of the original data-set. The process is repeated until a strong classifier is obtained. The motivation for combining random undersampling with boosting is simplicity, performance and speed: because the data-set is balanced by removing data, the time needed to build a model is low compared to oversampling models. The major limitation is the loss of potentially useful information, because no intelligent method is used to select the data removed from the majority class. Another disadvantage is that in noisy environments it may remove good data from the classes, which increases the impact of noise on the classifier's performance [12,13].
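The undersampling step performed inside each boosting iteration can be sketched as follows. This is a minimal illustration, assuming a 1:1 target ratio between the classes (the description above does not fix the ratio); the function name and the toy data are hypothetical.

```python
import random

def rus_step(labels, weights, minority_label=1, seed=0):
    """Randomly drop majority points until both classes have the same size,
    then renormalize the weights of the points that remain."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    majority_idx = [i for i, y in enumerate(labels) if y != minority_label]
    n_keep = min(len(majority_idx), len(minority_idx))
    kept = sorted(minority_idx + rng.sample(majority_idx, n_keep))
    # No new weights are assigned: the surviving weights are only rescaled
    # so that they sum to one.
    total = sum(weights[i] for i in kept)
    return kept, [weights[i] / total for i in kept]

# Hypothetical toy data: 3 minority points (label 1) and 12 majority points (label 0).
labels = [1] * 3 + [0] * 12
weights = [1 / len(labels)] * len(labels)
kept, new_weights = rus_step(labels, weights)
print(len(kept), round(sum(new_weights), 6))  # 6 1.0
```

The weak classifier of the current iteration would be trained on the points in `kept`, while the boosting weights of the full original data-set are updated afterwards, as described above.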
6) Evolutionary undersampling boosting (EusBoost): In 2013, an intelligent undersampling-based ensemble, EusBoost, was proposed, which incorporates the EUS [40] preprocessing method in every iteration of AdaBoost. EusBoost [40] builds on RusBoost, which is simpler than the oversampling-based approaches. EusBoost enhances classifier performance by using the evolutionary undersampling approach. The key principle of EusBoost is its diversity mechanism, which considers a different subset of the data in every iteration.

C. Ensembles based upon Bagging (Bootstrap Aggregation) Concept
Bagging, like Boosting, builds a strong classifier by combining multiple weak classifiers to achieve better performance than a single classifier. Bagging [64] gives the best results when the problem with a single classifier is overfitting. Unlike in Boosting, every data point in Bagging has the same probability of appearing in a new data-set. The process starts by creating subsets of the data-set. A classifier is then trained independently on each subset, which results in an ensemble of different models. The outputs of these models are then aggregated to build a strong classifier. Bagging brings diversity by using a different data-set for every classifier. Hence, it belongs to the category of parallel ensemble methods. The motivation behind these ensemble methods is to exploit the independence between the weak classifiers [58]. Fig. 5 shows the bagging algorithm.
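The bagging procedure can be sketched as below. This is a minimal illustration, assuming bootstrap samples of the same size as the original data-set and a majority vote as the aggregation rule; the nearest-mean weak learner, the function names and the toy data are hypothetical stand-ins, not the C4.5 setup used in the study.

```python
import random
from collections import Counter

def train_nearest_mean(xs, ys):
    """Hypothetical weak learner on 1-D data: predict the class whose
    training mean is closest to the query point."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))

def bagging_fit(xs, ys, n_models=11, seed=0):
    """Train one weak learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in range(len(xs))]
        models.append(train_nearest_mean([xs[i] for i in idx],
                                         [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D toy data with two well-separated classes.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 1.3, 1.4]
ys = [0] * 5 + [1] * 5
models = bagging_fit(xs, ys)
print(bagging_predict(models, 0.05), bagging_predict(models, 1.35))
```

Each model sees a different bootstrap sample, and no model depends on another, so the training loop could run in parallel.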

1) Smotebagging:
SmoteBagging [65] combines oversampling of the minority class using Smote with bagging. In this method, both classes participate in creating each bag. The resampling rate a% is set in every iteration and increases in multiples of 10 from one iteration to the next. This rate determines the number of positive data points that are randomly resampled from the original data-set in each iteration. The remaining positive data points are generated by the Smote algorithm until the data is balanced. The negative data points are bootstrapped to make the ensemble more diverse.
2) Underbagging: This approach [65] performs undersampling after creating a subset of data from the original data-set. Instead of removing data from the whole data-set once, it does so before training each classifier. Undersampling is done using the nearest neighbor principle to balance the data before the classifier is trained.
3) MSmotebagging: MSmoteBagging [65,67] is a variation of SmoteBagging in which the minority class is oversampled using the MSmote data-level method.
Fig. 5. Bagging Algorithm.
4) Overbagging: In this method [65], the data-set is balanced when the bags are randomly drawn from the original data-set. Instead of randomly removing data from the whole data-set, data is randomly generated within the minority class of each subset before every classifier is trained. This method includes all the majority class data points in the new bootstrap.
5) Underoverbagging: The UnderBagging to OverBagging (UnderOverBagging) [65] method combines oversampling and undersampling. It uses a resampling rate "a%" in every iteration, which ranges from 10% to 100% in multiples of 10. The number of data points used to train each classifier therefore differs from one iteration to the next, which introduces diversity into the process.

6) Imbalanced ivotes (IIVotes):
IIVotes is the combination of SPIDER [68] and IVotes [69]. SPIDER is a preprocessing method. IVotes is a variation of Bagging in which sampling is done according to the importance of each data point. Although SPIDER improves the sensitivity for the minority class, it decreases the specificity at the same time. IIVotes combines SPIDER with IVotes to improve the trade-off between specificity and sensitivity. The main purpose of this method is to achieve a better balance between specificity and sensitivity for the minority class than a single classifier combined with SPIDER.

D. Ensembles based upon Hybrid Ensemble Concept (Bagging and Boosting)
Hybrid ensemble methods combine bagging, boosting and pre-processing methods. Liu, Wu, and Zhou proposed EasyEnsemble [53] and BalanceCascade [53] in 2009 and called them exploratory undersampling methods. The two methods differ in how they treat the negative data points after every iteration. Both use bagging as the key concept for building the ensemble and use AdaBoost in place of the weak classifier. In BalanceCascade, the classifiers are trained sequentially because the method works in a supervised manner: in each bagging iteration, after the AdaBoost classifier is trained, the correctly classified majority data points are removed from the data-set and are not processed in later iterations. EasyEnsemble, in contrast, does not perform any operation on the original data-set after each AdaBoost iteration, so its classifiers can be trained in parallel.
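The difference in how the two methods treat the majority class can be sketched as follows. This illustration follows only the description above: the function names, the toy data and the stand-in classifier are hypothetical, and details of the published algorithms (such as how the AdaBoost classifiers are trained) are omitted.

```python
import random

def easy_ensemble_subsets(majority, minority, n_subsets, seed=0):
    """EasyEnsemble-style sampling: independent balanced subsets, so the
    classifiers trained on them do not depend on each other."""
    rng = random.Random(seed)
    return [rng.sample(majority, len(minority)) + list(minority)
            for _ in range(n_subsets)]

def balance_cascade_prune(majority, predict, majority_label=0):
    """BalanceCascade-style pruning: keep only the majority points that the
    classifier trained in the current iteration still misclassifies."""
    return [x for x in majority if predict(x) != majority_label]

# Hypothetical 1-D data: majority points 0..19, minority points 100..104.
majority = list(range(20))
minority = list(range(100, 105))

subsets = easy_ensemble_subsets(majority, minority, n_subsets=3)
print([len(s) for s in subsets])  # [10, 10, 10]

# Stand-in for a trained AdaBoost classifier: values below 15 are labelled majority (0).
stub_classifier = lambda x: 0 if x < 15 else 1
remaining = balance_cascade_prune(majority, stub_classifier)
print(remaining)  # [15, 16, 17, 18, 19]
```

Because the pruning step needs the classifier from the current iteration, BalanceCascade has to run sequentially, whereas the EasyEnsemble subsets can all be drawn and used up front.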

E. Performance Criteria
For imbalanced data-sets, the main objective is to identify the minority class, so we treat the minority class as the positive class. Table I shows the confusion matrix for imbalanced data-sets. We use the area under the ROC curve (AUC) [70,71] as the performance metric to assess the methods. AUC is one of the most popular performance metrics for evaluating classifiers designed for imbalanced data-sets. The ROC curve plots the false positive rate on the x-axis and the true positive rate on the y-axis. The closer a classifier's ROC curve lies to the upper-left corner, the better its performance. AUC summarizes the ROC curve quantitatively, which makes it well suited to comparing different classifiers. It is calculated as the arithmetic mean of the true positive rate and the true negative rate:
$$\mathrm{AUC} = \frac{TP_{rate} + TN_{rate}}{2} \qquad (1)$$
where $TP_{rate}$ is the proportion of positive data points that are correctly classified as positive and $TN_{rate}$ is the proportion of negative data points that are correctly classified as negative. AUC reflects the global performance of a classifier over all possible values of the false positive rate.
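The calculation described above, the arithmetic mean of the true positive rate and the true negative rate, can be sketched from the four cells of a confusion matrix. The function name and the counts are hypothetical and chosen only to show why this metric is preferred over plain accuracy on imbalanced data.

```python
def auc_from_confusion(tp, fn, tn, fp):
    """Mean of the true positive rate and the true negative rate."""
    tp_rate = tp / (tp + fn)  # share of positives (minority) found
    tn_rate = tn / (tn + fp)  # share of negatives (majority) found
    return (tp_rate + tn_rate) / 2

# Hypothetical data-set with 10 positive and 100 negative points.
print(round(auc_from_confusion(tp=8, fn=2, tn=90, fp=10), 4))  # 0.85

# A classifier that labels everything as negative is right on 100 of 110
# points, yet its AUC shows that it is no better than guessing.
print(auc_from_confusion(tp=0, fn=10, tn=100, fp=0))  # 0.5
```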

III. EMPIRICAL ASSESSMENT OF ENSEMBLES
We compared 15 ensemble approaches on 7 imbalanced data-sets with class imbalance ratios from 1.82 to 129.44. The characteristics of these data-sets are recorded in Appendix A. We used the KEEL tool [54,55] to compare the performance of the ensemble approaches, with the decision tree method C4.5 as the weak classifier. C4.5 is widely used for comparing algorithms in imbalanced domains [72,73]. The AUC of the methods was recorded with the initial settings of the KEEL tool given in Table II. Tables III and IV list the AUC values along with the variance. The results are displayed visually in Fig. 6, Fig. 7 and Fig. 8. The average performance of all the ensembles is shown in Fig. 9.

A. Visual Interpretations and Discussions
The figures show that among the Boosting-based approaches (Fig. 6), RusBoost stands out and outperformed every other method on the extremely imbalanced data-set (Abalone19, with an imbalance ratio of 129.44). In the other cases, SmoteBoost and RusBoost performed almost equally. Among the Bagging-based approaches (Fig. 7), UnderBagging outperformed the other methods on the highly imbalanced data-set, whereas the performance of SmoteBagging and UnderBagging is almost equal on the other data-sets. The hybrid ensembles performed equally well on all the data-sets, with minor differences on some of them: on Ecoli4, BalanceCascade outperformed EasyEnsemble, whereas on Abalone19, EasyEnsemble outperformed BalanceCascade. Considering the overall average performance of the ensembles, RusBoost outperformed the other ensemble methods. SmoteBagging, UnderBagging, BalanceCascade and SmoteBoost performed equally well with minor variations.
Visual interpretation of the performance of these ensembles is not sufficient on its own, so we carried out statistical validation to test these interpretations.

B. Statistical Validations
It is very difficult to judge the performance of algorithms when they are tested on multiple data-sets and the best-performing method is not the same in every case. Statistical validation is an efficient tool for comparing methods whose performance differs only slightly.
We use non-parametric tests, as recommended in [72][73][74]. We conduct two types of non-parametric tests. We use the Friedman rank test [75] to compare multiple methods and to determine whether there are any significant differences between them. If the null hypothesis is rejected, we use the Holm post-hoc test [75] to check whether the control method (having rank 1) is significantly better than the other methods (1 x N comparisons). The Friedman test ranks the algorithms and computes its statistic as per the following equation:
$$F_{AR} = \frac{(k-1)\left[\sum_{j=1}^{k} \hat{R}_j^{2} - \frac{k n^{2}}{4}(kn+1)^{2}\right]}{\frac{kn(kn+1)(2kn+1)}{6} - \frac{1}{k}\sum_{i=1}^{n} \hat{R}_i^{2}}$$
where $k$ is the total number of algorithms, $n$ is the number of data-sets, $\hat{R}_i$ is the rank total of the $i$-th data-set and $\hat{R}_j$ is the rank total of the $j$-th algorithm. The best-performing algorithm has the lowest rank. To compare two methods, we use the Wilcoxon matched-pair signed-rank test [57] to find significant differences between them.
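The ranking step can be sketched as below. The sketch assumes that the statistic is the standard Friedman aligned-ranks statistic, which the F_AR notation in the analysis suggests: each score is aligned by subtracting the mean score of its data-set, all aligned scores are ranked together (rank 1 for the largest), and the rank totals per data-set and per algorithm enter the statistic. The function names and the toy AUC values are hypothetical, and the Holm post-hoc step is not covered.

```python
def average_ranks_descending(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        pos = end + 1
    return ranks

def friedman_aligned_ranks(scores):
    """scores[i][j] is the score of algorithm j on data-set i (higher is better).

    Returns the aligned-ranks statistic and the average rank of each algorithm."""
    n, k = len(scores), len(scores[0])
    # Align every score by removing the mean score of its data-set.
    aligned = [v - sum(row) / k for row in scores for v in row]
    ranks = average_ranks_descending(aligned)
    dataset_totals = [sum(ranks[i * k:(i + 1) * k]) for i in range(n)]
    algo_totals = [sum(ranks[i * k + j] for i in range(n)) for j in range(k)]
    numerator = (k - 1) * (sum(r * r for r in algo_totals)
                           - (k * n * n / 4) * (k * n + 1) ** 2)
    denominator = (k * n * (k * n + 1) * (2 * k * n + 1) / 6
                   - sum(r * r for r in dataset_totals) / k)
    return numerator / denominator, [t / n for t in algo_totals]

# Hypothetical AUC values: two data-sets (rows) and two algorithms (columns).
stat, avg_ranks = friedman_aligned_ranks([[0.9, 0.7], [0.8, 0.5]])
print(round(stat, 4), avg_ranks)  # 1.6 [1.5, 3.5]
```

The statistic is then compared with the chi-square critical value for k-1 degrees of freedom, as done in the analysis that follows.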

1) Statistical framework:
We applied the statistical tests to the AUC performance metric in two steps. In the first step, the best-performing method is selected from every group of ensembles (Boosting, Bagging and Hybrid) using the Friedman test and Holm post-hoc analysis. After this step, only the three best methods, one from each group, are left. In the second step, these 3 methods are assessed using the Friedman test to find the final method that outperformed every other ensemble in classifying imbalanced data.
2) Analysis and discussions: First, we apply the Friedman test to the Boosting-based ensembles. Fig. 10 shows the ranks assigned by the Friedman test. According to the ranking, RusBoost is the best performer in the family of Boosting ensembles, whereas DataBoost-IM is the worst. In the table, "N" is the number of data-sets and "k-1" is the degrees of freedom (the number of algorithms minus 1). The table value of the chi-square (χ²) test for 5 degrees of freedom is 11.0705, which is less than the calculated F_AR value of 16.55102, and the p-value is less than 0.05. Hence the null hypothesis (there is no significant difference between these algorithms) is rejected. To locate the differences, we performed Holm post-hoc analysis with RusBoost as the control method (having rank 1). The Holm statistics are given in Table VI. According to these statistics, the hypothesis of no significant difference from the control method RusBoost is rejected for DataBoost-IM, AdaBoost, MSmoteBoost and EusBoost, because the p-value in each case is less than 0.05. As the p-value of SmoteBoost is equal to 0.05, there are no significant differences between RusBoost and SmoteBoost. We therefore analyze these two algorithms further using the Wilcoxon matched-pair signed-rank test. The test statistics are given in Table VII. R+ is the sum of ranks for the data-sets on which the first algorithm (RusBoost) outperformed the second (SmoteBoost), and R- is the sum of ranks for the data-sets on which the second algorithm (SmoteBoost) outperformed the first (RusBoost). The table shows that RusBoost performed better than SmoteBoost, so RusBoost is selected as the best performer in the Boosting-based ensemble group.
The Friedman test ranking for the Bagging-based ensembles is shown in Fig. 11 and the test statistics in Table VIII. SmoteBagging outperformed the other ensembles with the first rank, and IIVotes is the worst performer. As the chi-square (χ²) table value for 6 degrees of freedom is 12.5916, which is lower than the calculated chi-square (F_AR) value, and the p-value is less than 0.05, the null hypothesis is rejected. To locate the differences between these ensembles, the Holm post-hoc test is conducted with SmoteBagging as the control method. Table IX shows the Holm test statistics. The null hypothesis is rejected for all methods except UnderBagging, which means that SmoteBagging and UnderBagging must be analyzed further for significant differences. To compare these two methods closely, we performed the Wilcoxon matched-pair test. The test statistics (Table X) show that the p-value is more than 0.05, so the null hypothesis of no significant differences is not rejected. However, the higher rank sum in favor of SmoteBagging suggests that it performs better than UnderBagging. Hence, SmoteBagging is selected as the best performer in the category of Bagging-based ensembles.
As the hybrid ensemble category contains only two methods, we use the Wilcoxon matched-pair test to compare them. The test statistics (Table XI) show that the null hypothesis is not rejected, as the p-value is more than 0.05, but the higher rank sum of BalanceCascade suggests that it performs better than EasyEnsemble. So BalanceCascade is selected as the best performer in the hybrid ensemble category.
The next step is to compare these three best performers. We again performed the Friedman test on these three methods. The ranks assigned by the test (Fig. 12) show that RusBoost is the best performer and BalanceCascade the worst. The Friedman test statistics (Table XII) reveal that the chi-square (χ²) table value for 2 degrees of freedom (5.9915) is less than the calculated value (6.0), hence the null hypothesis of no significant differences between the methods is rejected. We further analyze the methods with Holm post-hoc analysis. The test statistics (Table XIII) show that RusBoost and SmoteBagging are similar, as the null hypothesis of no significant differences is not rejected for this pair. As a last step in finding the best performer among all the ensemble methods, we closely analyzed RusBoost and SmoteBagging with the Wilcoxon matched-pair test. Although the p-value of the test statistic (Table XIV) is more than 0.05, which means that there are no significant differences between this pair of methods, the higher rank value of RusBoost indicates that its performance is better than that of SmoteBagging. Another advantage of RusBoost is that, because it uses undersampling within the boosting process, it is computationally less expensive than SmoteBagging, which uses oversampling within the bagging process. From the visual interpretations and the statistical analysis, we conclude that RusBoost outperformed the other ensemble-based methods in the imbalanced domain.
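The R+ and R- rank sums used in the pairwise comparisons can be sketched as follows. The sketch assumes that data-sets with identical scores are dropped before ranking, which is one common convention and may differ from the tool used in the study; the function name and the AUC values are hypothetical and are not results from the tables.

```python
def wilcoxon_rank_sums(first, second):
    """Return (R+, R-) for paired scores of two methods.

    R+ sums the ranks of the data-sets where `first` scored higher,
    R- those where `second` scored higher. Ranks are based on the
    absolute score differences (smallest difference gets rank 1)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2 + 1  # tied differences share the mean rank
        pos = end + 1
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_plus, r_minus

# Hypothetical AUC values of two methods on four data-sets.
method_a = [0.90, 0.85, 0.80, 0.75]
method_b = [0.85, 0.86, 0.70, 0.72]
r_plus, r_minus = wilcoxon_rank_sums(method_a, method_b)
print(r_plus, r_minus)  # 9.0 1.0
```

A larger R+ than R- means the first method won on the data-sets with the larger differences, but whether that difference is significant still depends on the p-value of the test.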

IV. CONCLUSION
In the current study, we reviewed various boosting and bagging based ensemble approaches for their performance in imbalanced domains, focusing on binary classification. We empirically assessed 15 approaches on 7 imbalanced data-sets (KEEL repository) with class imbalance ratios from 1.82 to as high as 129.44. After analyzing the results with statistical methods (the Friedman test and the Wilcoxon matched-pair signed-rank test), we find that RusBoost outperformed the other 14 methods across the imbalance ratios considered. In future, we plan to propose an ensemble approach that works efficiently in the presence of other data impurities, such as noise, in addition to class imbalance.

TABLE III. AUC VALUES OF ENSEMBLE APPROACHES
Table V lists the Friedman test statistic for Boosting ensembles.

TABLE VI. STATISTICS USING HOLM TEST FOR COMPARING BOOSTING BASED ENSEMBLES
Control method: RusBoost (1.7143)

TABLE XIII. STATISTICS USING HOLM TEST FOR COMPARING THE CANDIDATE METHODS FOR BEST PERFORMER ENSEMBLES