Self Organising Fuzzy Logic Classifier for Predicting Type-2 Diabetes Mellitus using ACO-ANN

Datasets with a large number of attributes suffer from the curse of dimensionality: computation time can grow exponentially with the number of dimensions. To reduce the cost in computation time and space, an appropriate feature selection method can be developed using metaheuristic approaches. The aim of this work is to investigate the use of ant colony optimization, with the help of a neural network, to select a near-optimal feature subset and to integrate it with the self-organizing fuzzy logic classifier to improve the recognition rate. The proposed fuzzy classifier derives prototypes from the collected data through an offline training process and uses them to build a fuzzy inference system for classification. Once trained, it can continue to learn from streaming data and adapt to changing patterns by updating the system structure recursively. The developed model does not rely on predefined parameters or on assumptions about the data generation model; it is derived from the empirically observed data.

Keywords: Ant colony optimization; feature selection; fuzzy logic classifier; self organizing; type-2 diabetes mellitus


I. INTRODUCTION
Machine learning algorithms are widely used in the medical field, where several classification and clustering techniques are useful for disease prediction. In the era of big data, classification algorithms have gained importance in research. Conventional classification algorithms are trained on static datasets; once a classifier is trained, its configuration cannot be modified.
Most classifiers were developed when data was not available on a large scale. To overcome this issue, an online approach can be used, in which the classifier constantly learns from new instances [1]. It stores only key information and discards instances that have already been processed. Variation of the pattern in nonstationary situations is handled by evolving the system structure and recursively updating the meta-data. The pattern may change as more data becomes available, and conventional offline approaches do not take this into account. However, learning online from the very beginning is not advisable either, because first setting up the system offline with the available static data can assure improved performance. The self-organizing fuzzy logic (SOF) classifier is therefore trained in two stages. In the first stage, the classifier is trained offline on the available static data and a fuzzy rule-based system is obtained. In the second, online stage, the fuzzy rule-based system identified during offline training is updated as new instances are processed, to track possible changes in the data pattern [1] [2].
Our study concerns the development of a predictive model for Type-2 Diabetes Mellitus (T2DM). Diabetes is a chronic illness responsible for an increasing number of deaths every year around the world. It typically develops after years of insulin resistance, or prediabetes. Prediabetes can progress to diabetes when liver and muscle cells become increasingly insulin resistant (they do not respond properly to the body's insulin signal), so that sugar accumulates in the bloodstream. Physicians suggest that if prediabetes is recognized early, its progression can be halted. According to physicians' advice, vegetables and fruits should form the bulk of the diet, and losing weight and exercising are the best ways to stop prediabetes from becoming Type 2 diabetes. If the prediabetic stage is not treated, it can lead to Type 2 diabetes and complications such as heart disease, renal disease and stroke. Machine learning algorithms can be used to develop a predictive model for T2DM [3]. If the prediabetic stage is detected in time, a change in lifestyle can prevent diabetes.
We selected this topic because current approaches often classify patients as either diabetic or healthy, ignoring the prediabetic state. This leads to an increased incidence of diabetes if proper preventive measures are not taken in time, although progression can be curbed at the prediabetic stage by introducing lifestyle changes.
Most of the work available in the literature uses the PIMA dataset from the UCI repository. This dataset has 8 attributes and only two labels (Healthy / Diabetic); the prediabetic stage is not represented, which further motivates this study.
The literature review shows that most traditional classifiers are offline [3] [4] [5]. Once they are trained on a static dataset, no further modification of the structure of the classifier model is possible. These traditional methodologies also require users to predefine various kinds of parameters to obtain promising results, and in real cases this prior knowledge may not be available. Recent studies show that most work has been done on offline data processing, so retraining is required when the data pattern changes. In our study, we overcome this limitation by taking the changing data pattern into consideration while developing the predictive model.
348 | P a g e www.ijacsa.thesai.org
This paper is organized in five sections. Section one introduces the reasons behind the selection of the study. Section two briefly discusses the empirical data analytics operators. Section three discusses the feature selection method and the pseudocode of the stage 1 (static offline training) and stage 2 (self-evolving training) classifiers. Section four comprises the analysis and experimental results. Section five contains the conclusion of the experimental study.

II. NOTIONAL BASIS
The following statistical quantities are required for the proposed method [2]: 1) cumulative proximity, 2) unimodal density, 3) multimodal density and 4) the recursive forms of these proximity and density calculations, which are needed for streaming data processing.
a) Cumulative proximity - The cumulative proximity of a data point to the other data points is computed by equation 1, where $d(x_i, x_j)$ denotes the distance between two points, which can be measured as Euclidean, Minkowski or cosine.
b) Unimodal density - This specifies the main data pattern in the proposed method and is computed by equation 2.
c) Multimodal density - This is calculated by equation 3, where $f_i$ specifies the frequency of occurrence of the corresponding unique data sample.
d) Recursive computation of densities - Recursive computation plays a substantial role in the second stage of developing the SOF classifier. Equation 4 gives a recursive form in which the quantities can be updated efficiently by storing only the key meta-parameters. The recursive definitions of the global mean $\mu$, the covariance matrix $\Sigma$ and the remaining meta-parameter are given by equations 5, 6 and 7, respectively.
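The quantities above can be illustrated with a short sketch. Since equations 1 to 7 are not reproduced in this text, the code assumes the definitions commonly used in empirical data analytics with squared Euclidean distance: cumulative proximity as a sum of squared distances, unimodal density as the total proximity divided by 2K times the proximity of the point, and multimodal density as the unimodal density weighted by frequency. The class `RecursiveDensity` and all function names are hypothetical, and the recursive form tracks only the mean and the mean squared norm (sufficient for the Euclidean case, not a covariance matrix).

```python
from collections import Counter


def sq_euclidean(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def cumulative_proximity(x, data):
    """Sum of squared distances from x to every sample in data (assumed form)."""
    return sum(sq_euclidean(x, y) for y in data)


def unimodal_density(x, data):
    """Total cumulative proximity divided by 2K times the proximity of x."""
    k = len(data)
    total = sum(cumulative_proximity(y, data) for y in data)
    # Undefined (division by zero) if all samples are identical.
    return total / (2 * k * cumulative_proximity(x, data))


def multimodal_densities(data):
    """Unimodal density of each unique sample, weighted by its frequency f_i."""
    freq = Counter(tuple(x) for x in data)
    return {u: f * unimodal_density(u, data) for u, f in freq.items()}


class RecursiveDensity:
    """Streaming form for the Euclidean case: stores only K, the global mean
    and the mean squared norm, and never revisits earlier samples."""

    def __init__(self, dim):
        self.k = 0
        self.mean = [0.0] * dim
        self.mean_sq_norm = 0.0

    def update(self, x):
        self.k += 1
        w = 1.0 / self.k
        self.mean = [(1 - w) * m + w * v for m, v in zip(self.mean, x)]
        self.mean_sq_norm = (1 - w) * self.mean_sq_norm + w * sum(v * v for v in x)

    def density(self, x):
        # Variance-like term: mean squared norm minus squared norm of the mean.
        var = self.mean_sq_norm - sum(m * m for m in self.mean)
        return 1.0 / (1.0 + sq_euclidean(x, self.mean) / var)
```

Under these assumptions the streaming and batch computations agree, which is what makes the second (online) stage possible without storing past samples.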

III. PROPOSED METHODOLOGY
In this experimental study we investigated the use of ant colony optimization with the help of an artificial neural network (ACO-ANN) to select a near-optimal feature subset, and integrated it with the self-organizing fuzzy logic classifier to improve the recognition rate.
Ant colony optimization (ACO) is used for feature selection [4], and an ANN is used to implement the fitness function. The resulting feature subset is then used to develop the SOF classifier. The architecture of the proposed system is depicted in Fig. 1. The literature shows that various machine learning algorithms [6], metaheuristic algorithms [7] [8] and fuzzy logic methods [9] can be used for feature selection. According to the literature survey on feature selection with nature-inspired algorithms, ant colony optimization gives higher accuracy [5].
ACO is inspired by the behaviour of ant colonies. When ants search for food, they deposit pheromone on the path. This odorous substance serves as a communication medium. The quantity of pheromone deposited depends on the distance to the food source. Other ants moving at random detect the laid pheromone and are likely to follow the same path, depositing pheromone of their own. Consequently, the path used by more ants is more likely to be followed: the probability that an ant chooses a path increases with the number of ants that previously followed it.
An artificial ant builds solutions to an optimization problem and can be used to select a subset of features. A hybrid algorithm was developed to select prominent features using ACO. Ants traverse nodes that represent features, and each ant attempts to construct a feature subset. The conventional probabilistic transition rule is used for selecting features: the next node is chosen according to the probability computed by equation 8.
In the pheromone update (equation 9), $\rho$ denotes the rate of evaporation, $m$ denotes the number of artificial ants and $\Delta\tau_{ij}^{k}$ is the pheromone laid on path $(i, j)$ by the $k$-th artificial ant. Equation 10 computes $\Delta\tau_{ij}^{k}$, where $Q$ is a constant and $f$ refers to the length of the tour of the $k$-th ant.
ACO is integrated with the artificial neural network (ANN) so that the important features are extracted, and SOF is then employed for classification. The ANN is used to implement the fitness function of ACO [8].
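The transition rule and pheromone update described above can be sketched as follows. Equations 8 to 10 are not reproduced in this text, so the code assumes the conventional ACO forms: a transition probability proportional to pheromone raised to `alpha` times a heuristic desirability raised to `beta`, evaporation at rate `rho`, and a deposit of `q / f` per ant. The heuristic values, `alpha`, `beta` and all function names are hypothetical; in the paper's setting the tour cost `f` would come from the ANN-based fitness evaluation, which is replaced here by plain numbers.

```python
import random


def transition_probabilities(current, unvisited, pheromone, heuristic,
                             alpha=1.0, beta=1.0):
    """Probability of moving from feature `current` to each unvisited feature."""
    weights = {
        j: (pheromone[current][j] ** alpha) * (heuristic[j] ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next(probabilities, rng):
    """Sample the next feature according to the transition probabilities."""
    nodes = list(probabilities)
    return rng.choices(nodes, weights=[probabilities[j] for j in nodes])[0]


def update_pheromone(pheromone, tours, costs, rho=0.1, q=1.0):
    """Evaporate all trails, then let each ant deposit q / f on its own tour."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - rho)
    for tour, f in zip(tours, costs):
        delta = q / f  # a cheaper (better) tour deposits more pheromone
        for i, j in zip(tour, tour[1:]):
            pheromone[i][j] += delta
            pheromone[j][i] += delta  # trails treated as undirected here
    return pheromone
```

For example, `choose_next(transition_probabilities(0, [1, 2], pheromone, heuristic), random.Random(0))` picks the next feature for an ant currently at feature 0.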
SOF is implemented in two stages. In the first stage, SOF learns from the static dataset available at the beginning and identifies prototypes from each class independently, forming a 0-order fuzzy rule per class from the identified prototypes [1]. The training process of one class has no impact on that of another. The training process is carried out on the data samples of the $c$-th class ($c = 1, 2, 3, \ldots, N$). Prototypes are identified based on the distribution and the density of the data samples.
The data samples are ordered by their mutual distances and multimodal density values, and stored in the list {r}. Let $r_1$ be the sample with the highest multimodal density. $r_2$ is the remaining sample with the minimum distance from $r_1$, and $r_3$ is the remaining sample with the minimum distance from $r_2$. Continuing in this way builds the list {r}, and the multimodal densities are ordered accordingly. The prototypes, denoted by $\{P\}_0$, are then identified as the local maxima of the ordered multimodal densities. The centers of the resulting data clouds are computed, and for the $i$-th center the collection of the centers of its adjacent data clouds is formed. After all the representative prototypes of the $c$-th class have been identified, the fuzzy rules are constructed.
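The ranking and prototype identification steps above can be sketched for a single class as follows. This is a minimal illustration, not the authors' implementation: the multimodal densities are passed in as a precomputed dictionary, the endpoints of the ranked list count as local maxima when they exceed their single neighbour, and data clouds are formed by attaching every sample to its nearest prototype. These three choices and all function names are assumptions.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def rank_samples(density):
    """Build the list {r}: start at the densest sample, then repeatedly
    move to the nearest sample that has not been ranked yet."""
    remaining = sorted(density)  # sorted for a deterministic tie-break
    current = max(remaining, key=lambda u: density[u])
    ranked = [current]
    remaining.remove(current)
    while remaining:
        previous = ranked[-1]
        current = min(remaining, key=lambda u: sq_euclidean(u, previous))
        ranked.append(current)
        remaining.remove(current)
    return ranked


def local_maxima(ranked, density):
    """Prototypes: samples whose density exceeds that of both list neighbours."""
    values = [density[u] for u in ranked]
    prototypes = []
    for i, v in enumerate(values):
        left_ok = i == 0 or v > values[i - 1]
        right_ok = i == len(values) - 1 or v > values[i + 1]
        if left_ok and right_ok:
            prototypes.append(ranked[i])
    return prototypes


def cloud_centers(samples, prototypes):
    """Attach each sample to its nearest prototype and average each cloud."""
    clouds = {p: [] for p in prototypes}
    for x in samples:
        nearest = min(prototypes, key=lambda p: sq_euclidean(x, p))
        clouds[nearest].append(x)
    return [
        tuple(sum(col) / len(members) for col in zip(*members))
        for members in clouds.values() if members
    ]
```

With one-dimensional samples at 0, 0.8, 2, 3, 10 and 11 and illustrative densities 0.9, 0.5, 0.7, 0.2, 0.6 and 0.3, the sketch ranks the samples in that order and keeps 0, 2 and 10 as prototypes.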
During the second phase of training, the classifier continues to update its configuration sample by sample as new data is provided. As in the offline training phase, the sets of fuzzy rules of the different classes are adapted. When the $(K+1)$-th sample of the $c$-th class is provided, the meta-parameters at step $K+1$, including the global mean $\mu_{K+1}$ and the covariance matrix $\Sigma_{K+1}$, are computed by equations 5, 6 and 7.
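The text does not spell out how the resulting rule base labels a new sample. The sketch below assumes the winner-takes-all scheme commonly used with 0-order prototype-based rules: the firing strength of a class is the highest similarity exp(-d²) between the sample and any prototype of that class, and the class with the strongest rule wins. The similarity kernel, the squared Euclidean distance, the function names and the example prototypes are all assumptions for illustration.

```python
import math


def sq_euclidean(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def firing_strength(x, prototypes):
    """Strength of one class rule: similarity to its closest prototype."""
    return max(math.exp(-sq_euclidean(x, p)) for p in prototypes)


def classify(x, rule_base):
    """Return the label whose rule fires most strongly for sample x.

    rule_base maps each class label to the list of prototypes of that class.
    """
    return max(rule_base, key=lambda label: firing_strength(x, rule_base[label]))


# Hypothetical two-feature prototypes, not taken from the paper's dataset.
example_rules = {
    "Healthy": [(0.0, 0.0)],
    "Prediabetic": [(2.5, 2.5)],
    "Diabetic": [(5.0, 5.0), (6.0, 5.0)],
}
```

Because each rule depends only on the prototypes of its own class, adding or updating prototypes of one class during the online stage leaves the rules of the other classes untouched.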

IV. ANALYSIS AND EXPERIMENTAL RESULTS
For this study, the dataset was collected from local hospitals. It has 32 features and 1071 instances, and the class variable takes three values: Diabetic, Prediabetic and Healthy. Not all of the features are useful for diabetes classification. ACO is used to select the important features, with an objective function implemented using an ANN, so that a near-optimal feature subset is obtained. The features selected by this ACO-ANN hybrid model are used to develop the self-organizing classifier. The proposed method uses Euclidean distance and cosine similarity for the implementation of the classifier [10]. The dataset was divided into three equal parts. The classifier conducts offline learning on the first part, treated as static data. It then conducts online learning on the second part, treated as streaming data. The third part is reserved for validation and testing on unseen data. ACO selected Family History, Eating Fruits/Vegetables, PPG, FPG, Age, Feeling Hungry, Exercise, Frequent Urination, Itchy Skin and Gender as the important features for diabetes detection. The values of the ACO algorithm parameters are given in Table I. The confusion matrices on the validation data for Euclidean distance and cosine similarity are shown in Fig. 2 and Fig. 3, respectively. The accuracy obtained in diabetes detection is 86.27% with Euclidean distance and 80.67% with cosine similarity.
We obtained overall values to estimate the performance of the proposed classifier; they are given in Table II. We computed accuracy, error, sensitivity, specificity, precision, false positive rate, F1 score, Matthews correlation coefficient and kappa to measure the performance. Users of the developed classifier rely on its predictions, and in practice the probability of a correct prediction alone is not sufficient. Hence a set of statistical measures is computed to describe the performance of the developed classifier in various respects. The following four terms are needed to compute the performance metrics:
• True positive (TP) - The data rows belong to the positive class and have been correctly predicted.
• False positive (FP) - The data rows belong to the negative class and have been incorrectly predicted as positive.
• True negative (TN) - The data rows belong to the negative class and have been correctly predicted.
• False negative (FN) - The data rows belong to the positive class and have been incorrectly predicted as negative.
The measures below quantify the performance of the prediction model and are used in medicine to measure the accuracy of an analytical procedure.
6) False positive rate: $\mathrm{FPR} = 1 - \mathrm{Specificity} = \frac{FP}{FP + TN}$
7) F1_score: It combines precision and recall using the harmonic mean. The harmonic mean is used because precision and recall are both expressed as proportions between zero and one. The worst F1 score is zero and the best is one.
Cohen's kappa coefficient is a statistical measure of inter-rater reliability. It is used here to evaluate classification accuracy, normalized against the baseline of random chance on the dataset. In multi-class and imbalanced-class problems, Cohen's kappa provides a good measure. Any kappa value less than 0.60 indicates inadequate agreement among the raters, and less confidence should be placed in the study results.
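The measures listed in this section can be derived from a multi-class confusion matrix by treating each class in turn as the positive class (one-vs-rest). The sketch below uses the standard textbook definitions; the convention that rows are actual classes and columns are predicted classes, the function names and the small example matrix are assumptions for illustration and are not the paper's results.

```python
import math


def class_counts(cm, c):
    """TP, FP, TN, FN for class index c (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[c][c]
    fn = sum(cm[c]) - tp
    fp = sum(row[c] for row in cm) - tp
    tn = total - tp - fn - fp
    return tp, fp, tn, fn


def class_metrics(cm, c):
    """One-vs-rest measures for class c; assumes no denominator is zero."""
    tp, fp, tn, fn = class_counts(cm, c)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "fpr": fp / (fp + tn),  # equals 1 - specificity
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


def accuracy(cm):
    """Fraction of samples on the diagonal; the error rate is 1 - accuracy."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = accuracy(cm)
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Illustrative 3-class matrix (not taken from Table III).
example_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
```

For this illustrative matrix the overall accuracy is 23/30 and kappa is 0.65, which shows how kappa discounts the one third of agreement that would be expected by chance with three equally frequent classes.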
The multi-class confusion matrix output of the developed predictive model, with Euclidean distance and cosine similarity as the distance measures, is given in Table III.

V. CONCLUSION
In this work, we used a metaheuristic approach for feature selection. ACO was implemented to determine the features that predict diabetes with higher accuracy. The proposed SOF classifier does not depend on prior assumptions about the data generation model. Instead, it learns from the static offline data and derives the prototypes of the trained classifier; in the next stage, it conducts online learning from streaming data. The approach supports various types of similarity and distance measures. We used Euclidean distance and cosine similarity to evaluate the performance of the developed model. The observed accuracy was 86.27% with Euclidean distance and 80.67% with cosine similarity. A person classified as prediabetic by the developed predictive model should consult a doctor for further treatment. Early intervention has been shown to delay, and in some cases prevent, the progression from prediabetes to diabetes. The proposed method can thus help prevent individuals from developing T2DM.
As future work, the developed model will be extended with more meta-parameters in the construction of the stage 1 classifier and applied to other classification problems. Applying first-order fuzzy rules in the SOF classifier would increase the degrees of freedom and may therefore enhance the performance of the developed model. Other distance metrics, such as Manhattan, Minkowski and Hamming distance, will be used to implement the classifier and compared with the existing results.