Population based Optimized and Condensed Fuzzy Deep Belief Network for Credit Card Fraudulent Detection

In this information era, advances in technology are accompanied by a high risk of financial fraud, a continually increasing menace in online transactions. Credit card fraud identification is a tough challenge because of two important issues: the behavior profile of credit card users changes constantly, and credit card datasets are skewed. The factors that most affect the detection of fraudulent credit card transactions are the data sampling model, the features chosen during feature selection and the detection approach applied. To overcome these issues, instead of relying on certainty theory, this paper deploys three empowered models for intelligent detection of fraudulent transactions. The uncertainty theory of intuitionistic fuzzy sets is used to determine the significant features that influence the detection process. The relevancy between the dependent and independent features of the credit card dataset is maximized using the grades of membership and non-membership of each feature. Intuitionistic fuzzy mutual information, together with entropy, selects the features with the highest information score as the significant feature subset. The proposed model is a Fuzzy Deep Belief Network enriched with Sea Turtle Foraging for credit card fraudulent detection (EFDBN-STFA). The fuzzy deep belief network handles the complex patterns of credit card transactions by analyzing the dataset with stacked restricted Boltzmann machines. The weights assigned to the hidden nodes are fine-tuned by the sea turtle foraging algorithm using its fitness measure, which improves the detection accuracy of the FDBN.
Simulation results on two different credit card datasets demonstrate the efficacy of EFDBN-STFA: with its ability to handle the hesitation factor and its metaheuristic optimization, it achieves a higher detection rate with fewer false alarms than other existing detection models.

Keywords: Credit card fraudulent; uncertainty; intuitionistic fuzzy; fuzzy deep belief network; sea turtle foraging


I. INTRODUCTION
In modern days, the use of the internet for commercial transactions is increasing exponentially because of its availability and flexibility. Credit cards are very useful to business people for both online and offline transactions [1]. However, credit card fraud has become a significant issue for the financial sector, banks and card issuers, and its detection has become an important and interesting research topic for the scientific community. Handling such a voluminous transaction system requires a highly sophisticated security system that analyzes transactions and detects fraudulent ones quickly. This calls for modern technology such as machine learning, data mining and artificial intelligence to make credit card fraud detection more accurate. Fraud detection has become an important activity for reducing the impact of fraudulent transactions [2]. From the perspective of the learning process, however, credit card fraud detection is very challenging because of class imbalance.
Generally, classification models are deployed to examine all approved transactions and raise alerts for the most suspicious ones. Professionals investigate the alerts and contact the cardholders to determine whether each alerted transaction is genuine or fraudulent [3]. This feedback is supplied to the classification system during the training phase so that it can update itself and detect fraud more accurately. It is also essential to highlight the major difference between user behavior analysis and fraud analysis. Fraud detection models extract the signature of fraud patterns, which greatly assists the testing process. The ultimate goal of this work is to reduce false detections of fraudulent transactions. This is achieved by developing a Population based Optimized and Condensed Fuzzy Deep Belief Network for Credit Card Fraudulent Detection.
This paper is organized as follows: Section I explains the importance of fraud detection, and Section II highlights related work. Section III describes the methodology of the proposed work. Section IV presents the conceptual framework used, and Section V gives the results and their discussion, followed by the conclusion in Section VI.

II. RELATED WORK
This section discusses existing work on credit card fraud detection using machine learning algorithms and mining approaches. One such approach analyzes the past transaction history of customers. The behaviour patterns of transactions are extracted, and cardholders with the same patterns are clustered depending on their transaction amounts. A sliding window concept is used, and transactions are aggregated to distinguish fraudulent from genuine transactions.
Imane et al. [5] presented a comparative study in which various machine learning approaches are deployed for credit card fraud detection. The authors mainly focused on investigating neural network performance. They stated that the study aims to guide researchers in choosing the best approaches for credit card fraud detection.
Wen-Fang et al. [6] presented an outlier mining model to accurately forecast fraudulent credit card transactions. A distance summing algorithm is used to capture the variation between normal and fraudulent transactions. The outlier approach is mainly used to detect anomalous transactions.
Maniraj et al. [7] devised a recognition model to check whether a new incoming transaction is normal or fraudulent. They performed preprocessing and analyzed PCA-transformed credit card transaction data. They deployed an isolation forest model and local outlier detection to identify multiple types of anomalies.
Andrea et al. [8] contributed three different approaches to discover fraudulent transactions. They handled the class imbalance problem by designing a novel learning policy. They worked with a real-world dataset, taking concept drift and verification into account.
Navneet et al. [9] investigated fraudulent transactions in the banking sector and analyzed vulnerabilities in online transactions. Their work explores possible ways to prevent fraudulent transactions by developing a graph database and finding the patterns of fraudulent transactions.
Salvatore et al. [10] stated that understanding meta-learning policies greatly influences the fraud catching rate and deduction rate. They rebalanced the skewed distribution so that training uses balanced data, which results in better classification.
Dheepa and Dhanapal [11] developed a behavioral classification model using support vector machines. Features are extracted to determine the significant behavioral transaction patterns. Any transaction that conflicts with these patterns is treated as suspicious and examined further to discover fraud.
Chuang et al. [12] presented a mining model that uses web services for online bank transactions. The banks participating in this scheme share their knowledge about fraud patterns in a heterogeneous environment, and the distributed system further improves fraud detection ability and reduces financial loss.
Tao Guo et al. [13] developed a neural network model to discover customers' behavior patterns. The significant task is to discover any deviation from the usual transaction pattern. This is achieved by training the neural network on the dataset and computing a confidence value for each transaction. Credit card transactions with a low confidence value are treated as fraudulent.
Suvasini Panigrahi et al. [14] designed a fusion model comprising four components: Dempster-Shafer theory (DST), a rule-based filter, transaction history and a Bayesian model. The rule-based filter is used to determine fraudulent transactions. DST is used to compute the belief value of each transaction based on its evidence.
III. METHODOLOGY OF POPULATION BASED OPTIMIZED AND CONDENSED FUZZY DEEP BELIEF NETWORK FOR CREDIT CARD FRAUDULENT DETECTION
The proposed work, shown in "Fig. 1", aims to overcome ambiguity, vagueness and uncertainty in credit card fraud detection. Two different datasets are used: one collected from the Kaggle repository and another collected from a case study of a specific bank. Existing models use neural networks, support vector machines, random forests and other conventional classifiers for fraud detection. Most of them fail to handle the vagueness and inconsistency that often arise in real-world datasets when a transaction cannot be definitively labeled as either fraudulent or normal. To overcome this problem, the proposed model works in two stages. The first stage reduces the redundancy among the features involved in fraud detection and increases the relevancy between features and class. This is achieved by adopting intuitionistic fuzzy mutual information for feature subset selection, whose goal is to choose the most significant attributes for prediction. In the second stage, the pattern of credit card transactions is analyzed in depth by a fuzzy deep belief network (FDBN), which is fine-tuned by the sea turtle foraging algorithm that optimizes the weight assignment in the FDBN.

A. Dataset Description
This work uses two different credit card datasets for fraudulent transaction detection. The first dataset comprises 284,807 transactions [15]. Its input variables are the result of a PCA transformation and are denoted V1, V2, ..., V28. The remaining variables, time and amount, are not transformed, and class is the target variable. The second credit card dataset comprises 30,000 transactions with 25 features, including the class variable [16]. The features of this dataset are limit value, gender, education, marital status, age, and payment and billing details.

B. Normalized Intuitionistic Fuzzy Mutual Information
A search procedure that selects a subset of features that strongly influence the classification process is known as feature subset generation. The method applies a subset evaluation function to assess the current subset; if the current subset performs better than the previous one, the previous subset is replaced with the current subset [17]. This is a cyclic process that repeats subset generation and evaluation until a termination condition is met. The termination condition relies on either the evaluation function or the generation function. In the former case, the iteration terminates when the insertion or deletion of an attribute does not produce a better subset. In the latter case, it terminates when a predefined number of attributes has been selected or a specified number of iterations has been completed.
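The generate-and-evaluate loop described above can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes discrete features and uses crisp (non-fuzzy) mutual information, normalized by the smaller entropy, as a stand-in evaluation function. The names `forward_select` and `normalized_mi` and the stopping rule `score <= 0` are assumptions made for the example.

```python
import numpy as np

def entropy(x):
    """Shannon entropy (in bits) of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y):
    """Crisp mutual information I(x; y) = H(x) + H(y) - H(x, y)."""
    joint = np.array([f"{a}|{b}" for a, b in zip(x, y)])
    return entropy(x) + entropy(y) - entropy(joint)

def normalized_mi(x, y):
    """Mutual information divided by the smaller of the two entropies."""
    denom = min(entropy(x), entropy(y))
    return mutual_information(x, y) / denom if denom > 0 else 0.0

def forward_select(X, y, max_features):
    """Greedy subset generation: add the best-scoring feature in each round."""
    selected, remaining = [], list(range(X.shape[1]))
    # Generation-based stop: a predefined number of features has been selected.
    while remaining and len(selected) < max_features:
        def score(f):
            relevance = normalized_mi(X[:, f], y)
            if not selected:
                return relevance
            redundancy = np.mean([normalized_mi(X[:, f], X[:, s]) for s in selected])
            return relevance - redundancy
        best = max(remaining, key=score)
        # Evaluation-based stop: no candidate improves the current subset.
        if score(best) <= 0:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, `forward_select(X, y, max_features=5)` returns the column indices of at most five features; replacing `normalized_mi` with a fuzzy variant would leave the loop itself unchanged.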
This research develops a normalized intuitionistic fuzzy feature subset selection scheme, which starts with an empty feature subset E. Features are then selected one at a time so that each selection maximizes the evaluation criterion, and the selected feature is added to E. The feature subset is evaluated using the minimum-redundancy, maximum-relevancy principle. The relevancy of each feature is measured by the intuitionistic fuzzy mutual information (IFMI) between the feature $fr_t$ and the class variable $fr_{cl}$. The redundancy of a feature is evaluated by computing the IFMI between $fr_t$ and the previously selected features in E. To overcome the bias towards multivalued features, this work uses normalized intuitionistic fuzzy mutual information (NIFMI). The NIFMI between two features $fr_s$ and $fr_t$ is the ratio of their intuitionistic fuzzy mutual information $IFMI(fr_s; fr_t)$ to the minimum of their entropies $H(fr_s)$ and $H(fr_t)$. The normalized intuitionistic fuzzy mutual information is thus defined as:

$$\mathrm{NIFMI}(fr_s, fr_t) = \frac{\mathrm{IFMI}(fr_s; fr_t)}{\min\{H(fr_s), H(fr_t)\}} \tag{1}$$

Let F be the list of attributes of a dataset, represented as $F = \{fr_1, fr_2, fr_3, fr_4, \ldots, fr_n\}$, where n denotes the number of attributes. Let C and D be two intuitionistic fuzzy sets defined on Y. The intuitionistic fuzzy membership value of the r-th feature for the i-th class is represented as $\mu_{i,r}$, its degree of non-membership as $\nu_{i,r}$ and its degree of hesitation as $\pi_{i,r}$. The membership value $\mu_{i,r}$, the non-membership value $\nu_{i,r}$ and the indeterminacy value $\pi_{i,r}$ are computed as

[...]

where q is the intuitionistic fuzzification coefficient, $\varepsilon > 0$ is used to avoid singularity, and $\sigma$ denotes the standard deviation used in the distance calculation [18]. The barred quantity signifies the mean value of the attribute over the instances belonging to class i. The radius of the data is denoted d and is the maximum of this distance norm, $d = \max(\lVert \cdot \rVert)$. The intuitionistic fuzzy entropy (IFE) of the fuzzy sets C and D is computed as follows:

IFE(C, D) = ∑ [...]

where the summand refers to the membership of each instance of the credit card dataset towards the normal transaction class, as in "Fig. 2".

A deep belief network (DBN) is a kind of deep neural network composed of multiple layers of belief networks. In this model each layer is a Restricted Boltzmann Machine (RBM), and the RBMs are stacked on one another to construct the deep belief network. A DBN combines two types of networks: belief networks and restricted Boltzmann machines [19]. A belief network is composed of layers of stochastic binary units with weighted connections. The network is a directed acyclic graph, which makes it possible to observe the kind of data the belief network believes in. It adjusts the weights between these units so that the network produces the appropriate result. Each binary unit in a belief network is in either state 0 or state 1.
The first step in training a DBN is to learn a layer of features from the visible units with the contrastive divergence (CD) method. Next, the activations of the previously trained features are treated as visible units, and features of features are learned in a second layer. Finally, the entire DBN is trained once the final hidden layer has finished its learning process. A greedy layer-wise approach is used for training the DBN: each RBM trained with CD converges to a local optimum, and the next stacked RBM takes those trained values as its input and searches for its own local optimum. Because every layer is consistently driven towards a local optimum, the network as a whole is expected to approach a global optimum.
As shown in "Fig. 3", a restricted Boltzmann machine is a stochastic neural network that consists of binary units with undirected edges between the visible and hidden units. The probability distribution over the visible and hidden units is defined in terms of an energy function. In the standard RBM form, these functions are:

$$E(Vs, Hd) = -\sum_{i} va_i\, Vs_i - \sum_{j} hb_j\, Hd_j - \sum_{i}\sum_{j} Vs_i\, wt_{i,j}\, Hd_j$$

$$Prob(Vs, Hd) = \frac{\exp(-E(Vs, Hd))}{\sum_{Vs', Hd'} \exp(-E(Vs', Hd'))}$$

where Vs refers to the visible nodes and Hd to the hidden nodes, va denotes the visible bias, hb signifies the hidden bias of the final layer, and wt refers to the weight between the previous layer and the present layer.
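As in logistic regression, the conditional probabilities $Prob(Vs_i = 1 \mid Hd)$ and $Prob(Hd_j = 1 \mid Vs)$ are given by a sigmoid. When a hidden vector $Hd = (Hd_1, \ldots, Hd_j, \ldots, Hd_m)$ is known, the activation probability of the i-th visible unit can be computed as follows:

$$Prob(Vs_i = 1 \mid Hd) = \sigma\Big(va_i + \sum_{j=1}^{m} wt_{i,j}\, Hd_j\Big)$$

Similarly, when a visible vector $Vs = (Vs_1, \ldots, Vs_i, \ldots, Vs_n)$ is known, the activation probability of the j-th hidden unit can be formulated as:

$$Prob(Hd_j = 1 \mid Vs) = \sigma\Big(hb_j + \sum_{i=1}^{n} wt_{i,j}\, Vs_i\Big)$$

where $\sigma(x) = 1/(1 + e^{-x})$ is the sigmoid function, $wt_{i,j}$ is the link weight between the i-th visible unit and the j-th hidden unit, $va_i$ is the visible bias of the i-th visible unit, and $hb_j$ is the hidden bias of the j-th hidden unit.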
The training objective is to maximize the joint probability of the set of training inputs, as given in "equation (16)":

$$\max_{wt,\, va,\, hb} \prod_{vs \in Vs} Prob(vs) \tag{16}$$

where Vs is the set of all training inputs of the dataset.
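The sigmoid activation probabilities and the greedy layer-wise training described in this section can be illustrated with a small NumPy sketch. It covers a standard binary RBM trained with one-step contrastive divergence and does not include the fuzzy extension of the proposed model. The class name `RBM`, the helper `train_stack`, the learning rate and the initialization are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary restricted Boltzmann machine trained with one-step contrastive divergence."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.wt = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.va = np.zeros(n_visible)   # visible bias
        self.hb = np.zeros(n_hidden)    # hidden bias

    def prob_hidden(self, vs):
        # Prob(Hd_j = 1 | Vs) = sigmoid(hb_j + sum_i wt_ij * Vs_i)
        return sigmoid(self.hb + vs @ self.wt)

    def prob_visible(self, hd):
        # Prob(Vs_i = 1 | Hd) = sigmoid(va_i + sum_j wt_ij * Hd_j)
        return sigmoid(self.va + hd @ self.wt.T)

    def cd1_step(self, vs, lr=0.1):
        """One CD-1 update on a batch; returns the mean squared reconstruction error."""
        ph0 = self.prob_hidden(vs)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        pv1 = self.prob_visible(h0)                            # reconstruction
        ph1 = self.prob_hidden(pv1)
        n = vs.shape[0]
        self.wt += lr * (vs.T @ ph0 - pv1.T @ ph1) / n
        self.va += lr * (vs - pv1).mean(axis=0)
        self.hb += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((vs - pv1) ** 2))

def train_stack(data, layer_sizes, epochs=50):
    """Greedy layer-wise training: each RBM learns from the previous layer's activations."""
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.prob_hidden(x)   # activations become the next layer's visible units
    return rbms
```

Calling `train_stack(data, [4, 3])` on a binary matrix `data` yields two stacked RBMs; a supervised output layer and weight fine-tuning would be added on top of this stack.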

D. Sea Turtle Foraging Algorithm
This algorithm is inspired by the food searching behaviour of sea turtles. The sea turtle senses an odor known as dimethyl sulfide that comes from its food sources, and it moves towards the food source that gives out the strongest odor [20]. Ocean currents also help the turtles move. The artificial foraging process of the sea turtle is described in the following steps:

[...] (17)
where i = 1 to N indexes the turtles and D refers to the dimension of the search space, which is continuous.
3. Generate the initial velocities of the turtles randomly, $Vel_i(0) = [\ldots]$, with the velocity kept within the predefined boundaries as follows:
[...]
$Vel_{min} = -Vel_{max}$ (19)
where TUB and TUL are the upper bound and the lower bound of the D-dimensional search space of the turtle, and the constant in the bound takes a value between 0 and 1.
4. Generate the initial positions of the M food sources randomly:
[...]
where j = 1 to M indexes the food sources, each located in the D-dimensional search space.
5. Pass the initial position of each food source to the objective function and evaluate it to obtain the fitness value of that food source.
6. Pass the position of each turtle to the objective function and evaluate it to obtain the fitness value of that turtle. The turtle with the highest fitness value is recorded as
[...]
in terms of turtle i's fitness value at time t.
7. Update the velocity of each turtle as shown:
[...]
where the two position terms refer to the turtle position at time t and at time t-1.
8. Compute the ocean current velocities by
[...]
9. Add the turtle velocity to the ocean current velocity to obtain the combined velocity:
[...]
10. If the turtle's fitness value is less than that of the food source, the contribution of the food source (CFS) is represented as
[...]
in terms of the fitness value of the j-th food source.
11. Determine the distance between the turtle and the food source using the formula
$Dist_{ij} = \lVert Pos_i - Fs_j \rVert$ (26)
12. Compute the odor level of food source j as sensed by turtle i:
$C_{ij}(t) = CFS_j \cdot \exp(-[\ldots])$
Here a time-dependent term indicates the level of fading of the odor over time, a constant is set equal to 1, and T refers to the number of iterations after which the odor has completely disappeared.
13. Find the best food source for turtle i, which is the one with the highest value of $C_{ij}$ among all available food sources.
14. Update the position of each turtle as follows:
[...]
15. Stop the process if the maximum number of iterations is reached; otherwise go to step 6.
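The overall flow of the steps above (random initialization within bounds, fitness evaluation, distance to each food source, attraction to the strongest odor, bounded velocity update) can be sketched as a simplified population-based search. This is not the sea turtle foraging algorithm itself, since several of its update equations are not reproduced here. The velocity bound `delta * (upper - lower)`, the Gaussian odor model with linear fading, the random ocean-current term and the food-source replacement rule are all assumptions made for illustration, and `fitness` is assumed to be positive and maximized.

```python
import numpy as np

def turtle_search(fitness, dim, lower, upper, n_turtles=20, n_food=10,
                  max_iter=100, delta=0.1, seed=0):
    """Maximize `fitness` with a simplified turtle/food-source population search."""
    rng = np.random.default_rng(seed)
    vel_max = delta * (upper - lower)   # assumed bound; delta lies between 0 and 1
    vel_min = -vel_max                  # Vel_min = -Vel_max, as in (19)

    pos = rng.uniform(lower, upper, (n_turtles, dim))      # turtle positions
    vel = rng.uniform(vel_min, vel_max, (n_turtles, dim))  # turtle velocities
    food = rng.uniform(lower, upper, (n_food, dim))        # food source positions
    food_fit = np.array([fitness(f) for f in food])

    best_pos, best_fit = pos[0].copy(), -np.inf
    for t in range(max_iter):
        fit = np.array([fitness(p) for p in pos])
        top = int(np.argmax(fit))
        if fit[top] > best_fit:
            best_fit, best_pos = float(fit[top]), pos[top].copy()

        # Assumption: the best turtle replaces the weakest food source if it is fitter.
        worst = int(np.argmin(food_fit))
        if fit[top] > food_fit[worst]:
            food[worst], food_fit[worst] = pos[top].copy(), fit[top]

        # Dist_ij = ||Pos_i - Fs_j||, as in (26)
        dist = np.linalg.norm(pos[:, None, :] - food[None, :, :], axis=2)

        # Assumed odor model: fitness-weighted Gaussian decay whose reach fades over time.
        reach = (1.0 - t / max_iter) * (upper - lower)
        odor = food_fit[None, :] * np.exp(-(dist ** 2) / (2.0 * reach ** 2))
        target = food[np.argmax(odor, axis=1)]   # strongest-smelling source per turtle

        # Assumed ocean current: a small random drift added to the velocity.
        current = 0.1 * rng.uniform(vel_min, vel_max, pos.shape)
        vel = np.clip(vel + rng.random(pos.shape) * (target - pos) + current,
                      vel_min, vel_max)
        pos = np.clip(pos + vel, lower, upper)
    return best_pos, best_fit
```

To tune network weights with such a search, `fitness` would map a flattened weight vector to a validation score; here any positive function of a position vector can be used.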

IV. CONCEPTUAL FRAMEWORK
The proposed EFDBN-STFA credit card fraud detection model is implemented in Python. Performance analysis is carried out on two different credit card datasets. The detection models used for comparison are Aggrandized Random Forest (RF), Aggrandized Kernel based Support Vector Machine (SVM) and Artificial Neural Network (ANN). The evaluation metrics used to examine the performance of the detection models are accuracy, precision and recall.
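The three measures named above reduce to counts of true and false positives and negatives. A minimal sketch in their standard confusion-matrix form follows, assuming binary labels with 1 for fraudulent and 0 for genuine transactions; the function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = fraud and 0 = genuine."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def precision(y_true, y_pred):
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

On a heavily imbalanced dataset, accuracy alone can be high even when few frauds are caught, which is why precision and recall are reported alongside it.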
Accuracy: It is defined as the ratio of correctly predicted transactions to the total number of credit card transactions. This measure gives a quick indication of how well a model is trained.

$$Acc = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision: This metric is defined as the ratio of correctly predicted fraudulent transactions to the total number of transactions predicted as fraudulent.

$$Prcs = \frac{TP}{TP + FP}$$

Recall: This is the ratio of correctly predicted fraudulent transactions to the actual number of fraudulent transactions in the credit card dataset.

$$Rcl = \frac{TP}{TP + FN}$$

where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, with fraud taken as the positive class.

V. RESULTS AND DISCUSSIONS
By applying the normalized intuitionistic fuzzy mutual information feature selection method, the five most important features of each dataset are selected and then used in the subsequent steps of the proposed system, the extended fuzzy deep belief network enriched with the sea turtle foraging algorithm (EFDBN-STFA). This work extends previous work on the aggrandized random forest (RF) and the aggrandized kernel based SVM (AKSVM+FPSO) for detecting fraudulent credit card transactions, leading to prediction based on the behavior patterns of the user with the important features.
"Fig. 4(a)" and "Fig. 4(b)" portray the importance of the features that maximize the relevancy between features and class and minimize the redundancy among features. By applying intuitionistic fuzzy mutual information (IFMI), the figures display the top five features of the two credit card datasets. IFMI handles the problem of inconsistency in determining the potential features for credit card fraud detection. The features with the highest scores are used for prediction, and this reduced feature subset makes credit card fraud detection more accurate.
From "Fig. 5" it is observed that the proposed EFDBN-STFA produces the highest accuracy, 96.2% for dataset 1 and 95.3% for dataset 2. The other three classification models produce lower accuracy because they fail to handle uncertainty in classifying credit card transactions as either genuine or fraudulent. Because intuitionistic fuzzy mutual information is used to determine the relevancy between features and class, redundancy is greatly reduced and the proposed EFDBN-STFA uses only the attributes with the highest information scores. It therefore achieves better accuracy than the other three models.
"Fig. 6" illustrates the performance of the four credit card fraud detection models in terms of precision on the two credit card datasets. By fine-tuning the weights and biases assigned to the hidden nodes of the fuzzy deep belief network, the proposed model surpasses the performance of the other conventional detection models. Instead of assigning the weights randomly, the proposed model uses the sea turtle foraging algorithm to optimize the weight assignment of the deep belief network more precisely.
"Fig. 7" compares the recall of the four credit card fraud detection models, namely RF, SVM, ANN and EFDBN-STFA. The proposed EFDBN-STFA correctly classifies the highest percentage of the relevant results. Because real-world credit card datasets cannot be interpreted appropriately using crisp values, the fuzzy model, which treats them as linguistic terms using membership grades, greatly influences the fraud detection process. IFMI identifies the significant features that are used as input to the EFDBN for credit card fraud detection. The sea turtle foraging mechanism provides population-based metaheuristic search, so the weights assigned to the deep belief network are fine-tuned by optimization, and the enriched model thus achieves the highest recall value compared to the other models.

VI. CONCLUSION
The ultimate motive of this research is to detect fraudulent credit card transactions, which occur continuously. This work developed an enriched fraud detection model that handles the vagueness and complexity in determining transaction patterns by focusing on three different dimensions. First, the significant features of the voluminous credit card datasets are selected using the intuitionistic fuzzy mutual information of the features. Second, with the reduced feature set, fraud detection is carried out by the fuzzy deep belief network, which gains deep knowledge of the datasets and of the relationships among the features using stacked restricted Boltzmann machines. Third, instead of assigning the weights in a chaotic manner, the weights assigned to the hidden layers are fine-tuned by sea turtle foraging based optimization, so that the observed results agree acceptably with the expected results. The performance of EFDBN-STFA is evaluated on two different credit card datasets whose values lie in entirely different domains. The results demonstrate the strength of EFDBN-STFA for credit card fraud detection in the presence of uncertainty, with a higher detection rate of fraudulent transactions compared to the existing models. The proposed model helps to restrict fraud during transactions and promotes the prevention of fraud in the future.