A Machine Learning based Fine-Tuned and Stacked Model: Predictive Analysis on Cancer Dataset

The earlier forecasting and location of diseased cells can be useful in curing illness in medical applications. Knowledge discovery plays many significant roles in the health sector, bioinformatics etc. Plenty of hidden information is available in the datasets present in various domains, like medical information, textual analysis, image-attribute exploration etc. Predictive analytics and modeling encompass a variety of statistical methodologies from machine learning that can analyze present along with historical facts to make predictions about future events. Breast cancer research has already made a good amount of progress in the recent decade, but due to advancements in technology there are still possibilities for improvement. In this paper, a fine-tuned and stacked model procedure is presented and experimented on a standard breast cancer dataset. The obtained results show improvement over state-of-the-art algorithms, with improved performance parameters, e.g. disease prediction accuracy, sensitivity and a better F1 score.

Keywords—Machine learning; Cancer prediction; Data mining and Knowledge discovery; Supervised learning; Neural Networks


I. INTRODUCTION
Over the past decades, the lifestyle of individuals has changed under various factors, i.e. changes in nutrition, utilization of artificial agents and varied technological enhancements. This has a negative as well as a positive side. On one side, life is easier and there are new opportunities to carry out totally diverse kinds of human activities. On the other side, various types of diseases have increased, and these require new varieties of treatment to deliver a high quality of health care and all-in-one solutions for the associated well-being. Breast cancer [6] is one of the most common cancers, alongside lung and bronchial cancer, glandular cancer and carcinoma, among others. The employment of data science and machine learning approaches [11] [13] in medical fields proves to be prolific, as such approaches are also considered a great help in the decision-making of medical practitioners. Conventional strategies hardly cope when facing multivariate-natured data. The earlier forecasting and location of diseased cells with better accuracy can be useful in curing illness in various medical / healthcare applications.

A. Related Work
M. Bruijne presented a machine learning approach and framework [1] for disease detection and diagnosis on medical and healthcare data. Sherafatian [2], in his work, presented tree-based procedures indicating a minimal-length subset of miRNA for disease diagnosis. H. Wang et al. [3] proposed an ensemble procedure based on support vector regression for cancer diagnosis. C. M. Lynch et al., in their work [4], proposed a procedure for the predictive analysis of lung cancer patients and their survival chances via supervised ML methods. Sommen et al. [5] presented a method of predictive features for early cancer detection. Wassan et al. [6] surveyed various machine learning methods in the bioinformatics domain. N. Khuriwal et al. [7] proposed a framework for breast cancer diagnosis using an adaptive voting ensemble machine learning algorithm. Ali et al. [8] compared two techniques, i.e. SVM and neural networks, for cancer diagnosis. They experimented with a variety of kernel functionalities for the support vector hyperplanes, e.g. MLP (with 86% accuracy), radial basis function (89%), quadratic (88%) and polynomial (approx. 88%). Shajahaan et al. [9] utilized the "Wisconsin Breast Cancer Database" for analysis with a variety of learning rules like Random Forests (RF), C4.5 etc. Chaurasia et al. [10] employed the WEKA environment with a 10-fold CV procedure for algorithms like k-NN, Best First trees, SMO etc. The authors of [11], in their work, performed a comparative analysis among Radial Basis Function networks, Multilayer Perceptron networks and the canonical logistic regression algorithmic procedures. Abed et al. [12] contemplated a categorization method of hybrid nature for the interpretation and analysis of cancer-infected cells. Ivankov et al. [13] presented an extensive comparison of some significant and practically used machine learning routines for binary classification.

B. Research Contribution
This paper contributes as follows: • First, the state-of-the-art scenarios, developments and results in the problem domain are reviewed. The limitations in the existing methods drove the development of the proposed prediction model for the cancer dataset.
• Then, experiments are performed on the Breast Cancer Wisconsin (Diagnostic) Data Set [14], and the novelty of the proposed methodology is demonstrated through comparison with the results of existing approaches on the same dataset.
The remaining paper is structured as follows. Section 2 elaborates some significant definitions and preliminaries. The proposed idea and detailed procedure are given in section 3. Experimental results along with a comparative performance analysis are given in section 4. Finally, a conclusive summary is given in section 5.

II. PRELIMINARIES AND DEFINITIONS
This section presents some definitions and preliminaries.

A. Neural Networks Architecture
The basic design and working functionality of this classifier is somewhat analogous to the human brain (as concluded in many neurological studies). From the structural point of view, it is flexible according to the task the user needs to perform, i.e. dimension reduction, localization, regression, categorization etc. First, based on the input training data and a learning algorithm, the classifier model is trained. The structure consists of an i/p layer, hidden (middle) layers and an o/p layer [15]. Here, the training process is more time-consuming, and the model can perform multiclass classification. The accuracy of this classifier sometimes degrades due to less effective preprocessing, the presence of missing values etc.
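As an illustration of this layer structure, a minimal forward pass with one hidden layer can be sketched as follows; the layer sizes and random weights are placeholders for illustration, not the model trained later:

```python
import numpy as np

# Data flow through i/p layer (30 features), one hidden layer (16 units)
# and o/p layer (1 unit). Weights are random placeholders.
rng = np.random.default_rng(0)
x = rng.random((1, 30))                        # one input sample
W1, b1 = rng.random((30, 16)), np.zeros(16)    # input -> hidden
W2, b2 = rng.random((16, 1)), np.zeros(1)      # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = sigmoid(x @ W1 + b1)                  # hidden-layer activations
output = sigmoid(hidden @ W2 + b2)             # class score in (0, 1)
print(hidden.shape, output.shape)              # (1, 16) (1, 1)
```

The shapes make the layered structure explicit: each layer is just a matrix product followed by a non-linearity.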

B. Support Vector Regression
This falls under the category of supervised learning mechanisms, where both training and testing phases are present. The classifier is based on the phenomenon of n-dimensional planes and hyperplanes. The value of n depends on the number of features, and the samples lie in this space as separate data points. The task is to find the support vectors / hyperplanes which separate the data points into the various available categories [16]. The separating hyperplanes can be linear or nonlinear in nature depending on the input problem. Here, the functionality of the chosen kernels and the variance play a significant role in the output performance parameters.
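A minimal sketch of the separating-hyperplane idea is a linear SVM trained by sub-gradient descent on the hinge loss; kernels are omitted, and the hyper-parameters and toy data below are illustrative assumptions:

```python
import numpy as np

# Minimal linear SVM via sub-gradient descent on the hinge loss
# (sketch of the separating-hyperplane idea only).
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """X: (n, d) features; y: labels in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in range(len(y)):
            if y[i] * (X[i] @ w + b) < 1:          # point violates the margin
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:
                w -= lr * lam * w                  # only the regularization pull
    return w, b

# Two linearly separable clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
print((pred == y).mean())                          # training accuracy
```

For nonlinear boundaries, the dot products above would be replaced by kernel evaluations, which is where the choice of kernel influences performance.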

C. Fuzzy SVM
If the attribute values and classes in the data are not discrete but continuous in nature, then fuzzy logic is utilised in the support vector procedure [17, 18]. Data which includes noise elements should also be processed through fuzzy logic.
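One common way to realize this, sketched below under assumed heuristics, is to give every sample a membership s_i in (0, 1] and scale its update by that membership, so noisy points pull the hyperplane less. The distance-to-class-mean membership rule here is a hypothetical choice, not a prescribed one:

```python
import numpy as np

def fuzzy_memberships(X, y):
    """Hypothetical heuristic: membership decays with distance to the class mean."""
    s = np.empty(len(y))
    for c in np.unique(y):
        idx = y == c
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        s[idx] = 1.0 - 0.9 * d / (d.max() + 1e-12)  # far (noisy) points -> low weight
    return s

def train_fuzzy_linear_svm(X, y, s, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM whose hinge-loss updates are scaled by membership s_i."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in range(len(y)):
            if y[i] * (X[i] @ w + b) < 1:
                w += lr * (s[i] * y[i] * X[i] - lam * w)  # membership-scaled step
                b += lr * s[i] * y[i]
            else:
                w -= lr * lam * w
    return w, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
s = fuzzy_memberships(X, y)
w, b = train_fuzzy_linear_svm(X, y, s)
print((np.sign(X @ w + b) == y).mean())
```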

D. Bayesian Classifiers
Here, some statistical and probability computations are first associated as metadata [19]. To find certain correlations among features, Bayesian and naive (independence-assuming) procedures are utilised. This classifier family dates back to the 1950s and since then has been used with many variations in terms of correlation scanning, imputed variable predictors, dynamic feature induction, non-linear learning etc.
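A minimal Gaussian naive Bayes classifier, written from the textbook formulation rather than from any particular library, illustrates the idea; the class name and toy data are assumptions for illustration:

```python
import numpy as np

# Per-class feature means/variances plus class priors give log-posteriors
# under the naive (feature-independence) assumption.
class GaussianNB:
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.logprior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # log p(x|c) summed over independent features, plus the log prior
        ll = -0.5 * (np.log(2 * np.pi * self.var[:, None, :])
                     + (X[None, :, :] - self.mu[:, None, :]) ** 2
                     / self.var[:, None, :])
        return self.classes[np.argmax(ll.sum(axis=2)
                                      + self.logprior[:, None], axis=0)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(5, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
model = GaussianNB().fit(X, y)
print((model.predict(X) == y).mean())
```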

III. PROPOSED METHODOLOGY
This section presents the core idea and detailed algorithmic procedure.

A. Core Idea
Here, the curse of dimensionality and the multivariate nature of the data are dealt with. This research contribution aims at a computationally efficient dimension-reduction and classification-based disease-prediction framework with comparatively improved accuracy. The detailed procedures are given as algorithmic frameworks 3.2.1 and 3.2.2 in the sub-section below. In algorithm 3.2.1, dimension reduction is performed as a pre-processing phase to deal with the curse of dimensionality. The dataset is converted into a lower-dimensional space (a projection of the feature space into a smaller subspace), which aims to get rid of overfitting and to reduce computational cost. Further, in algorithm 3.2.2, a fine-tuned neural network with stacking is applied for the disease-prediction analysis.

B. Procedure Steps
Algorithmic procedure 3.2.1:
1: Compute → the mean vectors m_i of each class; where C_n: total no. of different classes for the dataset.
2: Compute → scatter matrices in two aspects.
Within-class matrix: use SMAT_W = Σ_{i=1}^{c} S_i, where S_i is the scatter matrix of the i-th class.
Between-class matrix: use SMAT_B = Σ_{i=1}^{c} N_i (m_i − m)(m_i − m)^T, where m, m_i and N_i are the overall mean, the sample mean and the size of the respective category.
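The scatter-matrix computation, together with the eigen-projection it feeds, can be sketched in numpy as follows; the function name `lda_reduce` and the toy data are assumptions for illustration:

```python
import numpy as np

def lda_reduce(X, y, k):
    """Project (n, d) samples onto the k leading discriminant directions."""
    classes = np.unique(y)
    d = X.shape[1]
    m = X.mean(axis=0)                          # overall mean
    SMAT_W = np.zeros((d, d))                   # within-class scatter
    SMAT_B = np.zeros((d, d))                   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mi = Xc.mean(axis=0)                    # class sample mean m_i
        SMAT_W += (Xc - mi).T @ (Xc - mi)       # accumulate S_i
        diff = (mi - m)[:, None]
        SMAT_B += len(Xc) * (diff @ diff.T)     # N_i (m_i - m)(m_i - m)^T
    # eigenpairs of SW^{-1} SB, sorted by decreasing eigenvalue
    vals, vecs = np.linalg.eig(np.linalg.pinv(SMAT_W) @ SMAT_B)
    order = np.argsort(vals.real)[::-1]
    M = vecs.real[:, order[:k]]                 # d x k matrix M
    return X @ M                                # Q = P x M, shape (n, k)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
Q = lda_reduce(X, y, k=1)
print(Q.shape)  # (40, 1)
```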

IV. EXPERIMENTAL ANALYSIS
This section presents an extensive experimental analysis. First, the experimental environment setup, the input dataset and its properties are discussed. Next, the experimental output results are given. Later, in the comparison table, the results of the proposed procedure are compared with those of significant existing approaches.

A. Experimental setup
System specifications (software and hardware) used are as follows. OS: Ubuntu 16.04 LTS, 64-bit; hardware

C. Output Results Discussion
The experiments are performed on the input dataset utilizing the proposed procedure. First, algorithmic procedure 3.2.1 is executed and the reduced-dimension variable subspace is obtained. A graphical representation is shown in Fig. 3. A test accuracy of 98.8% is obtained, along with the other statistical performance parameters for the disease-prediction model, i.e. a P-value of 2e-16 (the P-value represents the asymptotic significance of the model), a Cohen's Kappa coefficient of 0.97, and Sensitivity and Specificity of 0.98 and 0.99 respectively.
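For clarity, the relation between a 2x2 confusion matrix and the reported statistics can be sketched as follows; the counts used here are illustrative only, not the paper's experimental results:

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    acc = (tp + tn) / n
    sensitivity = tp / (tp + fn)              # true-positive rate
    specificity = tn / (tn + fp)              # true-negative rate
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (acc - p_chance) / (1 - p_chance)
    return acc, sensitivity, specificity, kappa

# Illustrative counts only (not the paper's confusion matrix)
acc, sens, spec, kappa = binary_metrics(tp=50, fn=1, fp=1, tn=48)
print(acc, round(sens, 2), round(spec, 2), round(kappa, 2))
```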

D. Comparative Performance Analysis
This section presents the comparative analysis, where the obtained experimental results are compared with some significant existing state-of-the-art methods on the same dataset. The comparison is given in Table II. The comparative analysis shows that the proposed procedure outperforms the other existing approaches.

V. CONCLUSIVE SUMMARY
Efficient preprocessing mechanisms that make intelligent learning and prediction systems capable of dealing with multivariate data, and effective learning technologies that find the rules describing the data, are still urgently needed. Some limitations in the existing methods motivated us to propose a machine learning based fine-tuned and stacked model, with which the experimental analysis is performed on a cancer dataset in a computationally efficient manner with improved disease-prediction accuracy.

Algorithmic procedure 3.2.1 (continued):
3: Compute → the Eigenvectors (ev_1, ev_2, ..., ev_d) along with the corresponding Eigenvalues (λ_1, λ_2, ..., λ_d) for the scatter matrices computed in step 2.
4: Make the list of (Eigenvalue, Eigenvector) tuples and sort the Eigenvectors by decreasing Eigenvalue.
5: Choose the k Eigenvectors with the largest Eigenvalues and construct the d × k matrix M (here, each column denotes an Eigenvector).
6: Transform the samples into the new subspace using the d × k matrix, i.e. Q_(n×k) = P_(n×d) × M_(d×k), where P denotes the n × d matrix possessing the n samples and Q denotes the transformed n × k samples in the reduced feature subspace.

Further, a neural network with an MLP (multi-layer perceptron) architecture is used, with fine-tuning, for the classification-based disease prediction. The neurons are separated into an input layer, a hidden layer and an output layer. The learning rate is a configuration parameter used to control the amount by which the weights are updated in the model procedure. Steps 1-4 constitute the forward neural propagation and steps 5-11 the backward neural propagation. The steps involved are given as algorithmic procedure 3.2.2.

Algorithmic procedure 3.2.2:
1: Consider X as the i/p matrix and Y as the o/p matrix. Initialize → assign random values to the weights and biases. Consider w_hl: weight matrix for the hidden layer, b_hl: bias matrix for the hidden layer, w_ol: weight matrix for the output layer, b_ol: bias matrix for the output layer.
2: Perform the linear transformation: hidden_layer_input = matrix_dot_product(X, w_hl) + b_hl.
3: Using the activation function (sigmoid), perform the non-linear transformation: hidden_layer_activations = sigmoid(hidden_layer_input), where the sigmoid function returns 1 / (1 + e^(−x)).
4: Perform: o/p_layer_input = matrix_dot_product(hidden_layer_activations, w_ol) + b_ol; o/p = sigmoid(o/p_layer_input).
5: Compare the prediction with the actual output and calculate the gradient of the error: Error (E) = Y − output.
6: Compute: slope_o/p_layer = derivatives_sigmoid(output); slope_hidden_layer = derivatives_sigmoid(hidden_layer_activations).
7: Compute the delta (Δ) at the output layer: Δ_output = E × slope_o/p_layer.
8: Error back-propagation is done as: E_hidden_layer = matrix_dot_product(Δ_output, w_ol^T).
9: Compute: Δ_hidden_layer = E_hidden_layer × slope_hidden_layer.
10: Update the weights in the network: w_ol = w_ol + matrix_dot_product(hidden_layer_activations^T, Δ_output) × value_learning_rate; w_hl = w_hl + matrix_dot_product(X^T, Δ_hidden_layer) × value_learning_rate.
11: Update the biases: b_hl = b_hl + sum(Δ_hidden_layer, axis = 0) × value_learning_rate; b_ol = b_ol + sum(Δ_output, axis = 0) × value_learning_rate.
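Steps 1-11 of the backpropagation procedure can be transcribed almost line for line into numpy; the toy data, layer sizes and learning rate below are illustrative assumptions, not the experimental configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def derivative_sigmoid(a):
    # derivative of the sigmoid expressed through the activation value a
    return a * (1.0 - a)

# Step 1: i/p matrix X, o/p matrix Y, random weights and biases
X = rng.random((8, 4))
Y = (X.sum(axis=1) > 2.0).astype(float)[:, None]
w_hl, b_hl = rng.random((4, 5)), rng.random((1, 5))
w_ol, b_ol = rng.random((5, 1)), rng.random((1, 1))
value_learning_rate = 0.5

for _ in range(2000):
    hidden_layer_activations = sigmoid(X @ w_hl + b_hl)        # steps 2-3
    output = sigmoid(hidden_layer_activations @ w_ol + b_ol)   # step 4
    E = Y - output                                             # step 5
    delta_output = E * derivative_sigmoid(output)              # steps 6-7
    E_hidden = delta_output @ w_ol.T                           # step 8
    delta_hidden = E_hidden * derivative_sigmoid(hidden_layer_activations)   # step 9
    w_ol += hidden_layer_activations.T @ delta_output * value_learning_rate  # step 10
    w_hl += X.T @ delta_hidden * value_learning_rate
    b_hl += delta_hidden.sum(axis=0) * value_learning_rate     # step 11
    b_ol += delta_output.sum(axis=0) * value_learning_rate

print(((output > 0.5).astype(float) == Y).mean())              # training accuracy
```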

Fig. 1. Attributes correlation matrix

The attributes / variables correlation matrix is shown as Fig. 1. The density distribution plot for the two disease classes of the dataset, i.e. Benign (B) and Malignant (M), is given as Fig. 2. The resultant confusion matrix after execution of the disease-prediction model is as follows -

TABLE I. EXPERIMENT RESULTS

TABLE II. COMPARATIVE ANALYSIS