An Optimized Kernel MSVM Machine Learning-based Model for Churn Analysis

Customer churn is a significant issue in any industry because of the variety of services, clients, and commodities involved. A massive amount of data is being generated by e-commerce services and tools. Data analytics and machine learning-based approaches have been applied to churn analysis (CA) to understand the reasons for customer churn (CC) and to design a profitable and practical customer retention program. These approaches mainly focus on customer profiling, CC classification, and the detection of features that affect churn. However, there is no specific technique for determining how much a prospective customer is inclined to pay, whether or not they churn. In this paper, an Optimized Kernel MSVM classification model is proposed to predict and classify churn. The MSVM algorithm is used for classification, while kernel PCA and the ALO optimizer are used for feature extraction and selection. The proposed model has been implemented on a telecommunication-sector customer churn database to demonstrate its generalization ability. The Optimized Kernel MSVM model achieved a maximum accuracy of 91.05% and a maximum AUC of 85%, and reduced the RMSE score to 2.838. The implementation shows that churn detection and classification can be examined at the same time while maintaining the highest overall accuracy and AUC.


I. INTRODUCTION
Identifying the causes of client loss, evaluating customer retention, and recovering clients have become essential concerns for many businesses. Organizations devote considerable study and effort to preventing client loss in addition to acquiring new clients. Owing to increasing green technologies, a growing number of users, and valuation solutions, the unregulated and rapid development of this area has resulted in increased failure due to deception and technological challenges. As a result, creating new methodological approaches has become a necessity. This issue has prompted numerous studies in India's communications industry, which is experiencing massive customer losses. Churn analysis is one application of information retrieval that is broadly employed across all sectors. Organizations need to develop initiatives to enhance customer satisfaction and formulate plans for improved customer retention by assessing which customers are likely to switch providers. This research aims to determine how a telecommunication business loses customers.
In addition to examining causes, the research also studies which kinds of clients are lost [1]. There are different ways to define churn. The two most important are contractual churn and non-contractual churn. Contractual churn occurs when a person does not extend their contract after the termination period has passed. Such churn starts when the customer loses interest in the products and reaches a point where reintegration is no longer feasible [2]. It is most commonly found when users close their savings accounts or switch from one wireless provider to another. Non-contractual churn is the second case. In a non-contractual scenario, users can generally leave the service at any time, without fixed timelines. The customer operations team first defines a churning state, after which any customer who meets that criterion is identified as a churned customer. The time over which the user's behavior changes is used to make this decision. The individual is considered a churned client whenever the duration of idleness or changed behavior exceeds the limit.
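The non-contractual rule described above can be illustrated with a minimal sketch. The 90-day limit, the customer identifiers, and the dates are hypothetical values chosen for illustration; they do not come from the study.

```python
from datetime import date


def is_churned(last_activity: date, today: date, threshold_days: int) -> bool:
    """Flag a customer as churned once the idle period exceeds the limit."""
    return (today - last_activity).days > threshold_days


# Hypothetical reference date and last-activity dates (illustration only).
today = date(2022, 3, 1)
last_seen = {
    "A": date(2022, 2, 20),   # recently active
    "B": date(2021, 11, 15),  # idle for more than 90 days
}

# Assumed inactivity limit of 90 days, set by the operations team.
churned = {cid: is_churned(seen, today, threshold_days=90)
           for cid, seen in last_seen.items()}
print(churned)
```

A customer whose idle period equals the limit exactly is not yet flagged, since the text requires the duration to surpass the limit.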
The threshold is the interval set as the boundary of the inactive period in this procedure [3]. Churn prediction usually has two stages: (i) feature selection, which identifies the subset of characteristics that best distinguishes churned from non-churned users, and (ii) churn forecasting. The customer churn rate is a helpful way to track how many clients are lost. Telecommunications firms frequently lose valued customers and, as a result, revenue to competitors. The telecommunications business has seen significant transformations in recent decades, including the growth of different products, scientific innovations, and increased competition. Therefore, customer churn forecasting in the telecommunication sector has become critical for industry players, who need to protect existing customer loyalty, maintain their competitive advantage, and enhance relationships with customers [5]. One of the most difficult challenges in the telecommunication sector is retaining customers with an elevated churn probability. Customers often choose to churn because of the increased number of telecommunications companies and competitive pressures. As a result, telecommunications companies have realized the value of retaining customers rather than obtaining new ones [16]. A variety of factors influences customer churn. Prepaid clients, unlike post-paid subscribers, are not constrained by contractual arrangements; therefore, they frequently churn for the most trivial reasons. As a result, predicting the customer churn rate is challenging [27]. Another factor is client loyalty, which is influenced by the provider's service and product performance. Customers may switch to a competitor with wider coverage and higher transmission quality because of broadband service and transmission reliability problems [17] [18].
Inadequate or unsatisfactory responses to complaints and invoicing issues are other variables that increase clients' likelihood of moving to the competition. Clients may also transfer to competitors because of shipping expenses, insufficient functionality, and obsolete equipment. Customers frequently compare suppliers and switch to whichever they believe offers a significantly better price. Even without acquiring new clients, a telecommunications business can do well by providing better care to established customers. The typical churn rate among telecommunication companies is around two per cent globally, resulting in a massive yearly loss of around $1 trillion.
To address the issue mentioned above, which motivates this research, the corporation must accurately forecast customers' behavior. There are two strategies for managing the churn rate: (a) reactive and (b) proactive. In the reactive strategy, the corporation waits for a cancellation request from the user and then offers attractive plans to retain them. In the proactive strategy, the likelihood of churning is predicted, and clients are offered alternatives accordingly. This is a supervised learning [15] problem in which churners and non-churners are differentiated. Machine learning, which comprises regression analysis, support vector machines (SVM), random forest (RF), naive Bayes (NB), logistic regression (LR), and others, has proven to be an extremely effective way to address this challenge by making predictions from previously gathered information. Pre-processing and feature selection play an essential part in improving classification accuracy in machine learning techniques. Researchers have devised many feature selection methods that help reduce dimensionality, computational time, and overfitting. From the supplied input, these methods identify the features relevant for predicting churn [4]. This research aims to group different customers and identify the factors contributing to each group's churn.
Furthermore, this study aims to design a churn prediction model using the Optimized Kernel MSVM (OKMSVM) model and apply it to predict churning customers. The study evaluates the available information to establish which variables are associated with churn prediction. The variables gathered are examined, and the system is refined for the final model. This study also seeks to determine the cost of churn by calculating the overall cost of clients who have churned to date and how much revenue loss can be avoided if customer defection monitoring is improved. Following churn prediction, customer groups determine how retention strategies are provided.
The organization of this work is as follows: The existing techniques of churn analysis are reviewed in Section II. Section III presents the existing issues and the problem statement. Sections IV and V describe the proposed work with a flow chart and the result analysis with a detailed explanation. Section VI presents the conclusion and future work in churn analysis.

II. RELATED WORK
In this section, existing methods for churn prediction are analyzed, and comparison tables are shown for better analysis. Jain et al. (2021) [6] proposed a multi-attribute strategic planning system combined with machine learning techniques. The suggested strategy was named the worker's churn forecasting and retaining approach. A two-stage methodology was used for categorizing employees by building an employee achievement significance model. The first contribution of the suggested methodology was an improved entropy-based approach for assigning weighting factors to employee achievements. Furthermore, an enhanced methodology (CatBoost) was implemented to assess the value of employee achievements and their class-based classification. The CatBoost method was then used to forecast employee churn by classification. Finally, based on the forecast findings and attribute rating, the authors presented a retention strategy. Sarac, F., et al. (2021) [7] designed a two-level churn approach to evaluate whether a client would churn and to determine how much the customer would pay for services. A support vector machine was employed for the classification component, and the recurring monthly cost was forecasted using a machine learning-based support vector regression methodology. An unsupervised feature selection technique, the multi-cluster attribute selection approach, was used to pick the most relevant features for both tasks. For uniformity, the same attribute selection strategy was employed in both tasks to determine its effectiveness. The proposed technique was then tested on the IBM Telco Customer Churn dataset, which includes over 7000 clients, to validate its relevance and generalization capabilities. Lalwani, et al., (2021) [8] proposed a machine learning-based approach consisting of six steps.
Data pre-processing was the first step, and feature analysis was carried out during the second step. The third step used a gravitation methodology for essential feature selection. The input was separated into two sections, training data and testing data, in an 80/20 proportion. On the training data, the most common prediction methods, such as LR (Logistic Regression), SVM (Support Vector Machine), and DT (Decision Tree), were implemented. Boosting and ensemble approaches were used for efficient predictive performance. Furthermore, K-fold cross-validation was performed over the training data for hyperparameter tuning and to minimize model overfitting. Lastly, the confusion matrix and AUC curve were used to examine the test dataset outcomes. The Adaboost classification model was reported with 81.71 percent accuracy. Bayrak, A. T., et al., (2020) [9] proposed a churn estimation model with the help of an advanced learning technique. The designed model was based on the LSTM (long short-term memory) technique. Clients' information was organized in a particular sequence in the customer information architecture. A long short-term memory design was built on this sequential information to determine users' churn phases and was compared with the existing classification approaches. Under the stated assumptions, the suggested model was successful and stood out from the related research. Jain et al., (2020) [10] designed a framework for determining customer attrition; the proposed methodology used two machine-learning methods: logistic boost and logistic regression. The testing was performed using the WEKA ML (machine learning) tool and an actual dataset from the Orange firm in the United States. Various assessment methods were used to display the results. Ullah, et al., (2019) [11] proposed a study that revealed churn characteristics, which were crucial in identifying the root causes of churn.
By identifying the main churn drivers from user information, CRM might promote efficiency, make suitable offers to potential churn clients associated with particular behavioral patterns, and drastically improve the corporation's advertising campaigns. The area under the receiver operating characteristic curve, recall, accuracy, precision, and F-measure of the suggested churn estimation method were all examined. The findings demonstrate that by employing the RF (random forest) method and k-means clustering, the suggested method obtained significant results in churn classification and customer profiling. Alboukaey, et al., (2020) [19] suggested a daily churn forecasting model rather than quarterly churn forecasting, based on the user's daily dynamic behavior rather than quarterly behavior. The authors expressed the daily behavior of customers as multidimensional data and suggested four models based on this description to forecast the daily churn of customers. A deep learning framework was suggested by Seymen, et al., (2020) [20] to determine whether commercial customers would churn in the future. The framework was validated against regression analysis and convolutional neural network approaches, both of which were commonly applied in churn estimation analyses. Recall, AUC, and precision were used to evaluate the algorithms' outcomes. The analysis indicates that the trained model outperformed the other approaches in terms of prediction and classification. Hu, X., et al., (2020) [21] designed a machine learning-based framework based on an integrated approach that combined a neural network and a decision tree. This work developed a composite churn prediction model and tested its performance using statistical results. Ahmed, et al., (2019) [22] proposed a telecom churn prediction approach that integrates ensemble stacking and uplift-based techniques.
Traditional performance measures and expense (cost) heuristics were used in the assessments, and expense heuristics received the most attention. The proposed methodology showed a high level of agreement between performance metrics and business objectives, making the method suited to the majority of cost-sensitive operations. Yu, et al., (2018) [23] proposed a machine learning methodology for telephony customer churn estimation based on a back propagation (BP) network tuned by particle classification optimization, that periodically performs P. [26] used a machine learning technique to anticipate when customers were poised to churn and to examine that churn. Such estimations were then applied to search through unstructured or semi-structured user input log files for explanations of why and how the user could be churning. In Table I, various existing methods for churn prediction are compared. The comparison is based on the implemented methods, comparison techniques, and existing issues.
The existing methods of churn analysis, with their proposed parameters and comparison parameters, are depicted in Table II. The conclusions of the current churn analysis methods and their future enhancements are also described in Table II.

III. PROBLEM STATEMENT
Conventional customer churn estimation is based on the perspective of corporate administrators, who use an inductive approach: executives anticipate churn among existing clients by relying on the attributes of churned users.
Even so, expertise may be unreliable. For a difficult problem in particular, no helpful guidance can be provided on the basis of expertise alone. Because a company's resources are limited, funds should be invested first in recapturing the clients with the highest churn probability.
The traditional way of predicting which customers are likely to churn and which are much less inclined to churn is ineffective.
As a consequence, if a company wants to make a sound churn forecast, it must employ numerical algorithms and "machines" to detect the connection between statistical features and customer churn, determine whether clients are churning, and provide the churn probability [12].
In recent years, churn prediction has been a significant concern in the telecommunication sector. To address this issue, telecom carriers must identify such clients before they churn. As a result, creating a classifier that accurately predicts churn is critical. This classifier should identify clients who are likely to churn in the coming years, allowing the carriers to respond quickly with relevant deals and discounts.
Machine learning techniques for classification, such as k-nearest neighbour, decision trees, logistic regression, neural networks, Naive Bayes, and others, are the most popular approaches for this objective. In addition, studies should concentrate on uncovering new features that are the most successful in forecasting client attrition [13]. Customer churn can be caused by various factors: costs, individual characteristics, information and service, facilitating conditions, economic indicators, promotional strategies, and competitors' market participation. The best approach to recovering churned clients and lowering the churn rate is to determine the cause(s) of churn.
According to a brief overview of recent work on the factors that influence churn, academic investigation of the determinants of customer churn in the telecommunications industry covers three main components: first, usage factors such as call duration and usage share; second, customer variables such as personally identifiable information, maturity level, client revenue, and customer satisfaction; and third, corporation-related factors.

IV. RESEARCH METHODOLOGY
In this paper, a new model is proposed to analyze churn. Fig. 1 shows the various modules of the proposed model for performing churn analysis.
The initial input consists of different data elements collected from the dataset, which serve as labelled experiences for the machine learning model. These labelled data sources help to create unique patterns for churn analysis. The modules of the proposed model process the dataset and enable the model to make further predictions.
The first phase of the proposed model pre-processes the uploaded dataset. This step cleans the input data: it removes unwanted elements, removes or replaces string elements, removes entries that cannot be processed, and so on.
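The cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the authors' script: the column names are assumed to follow the public Telco churn CSV, the three sample rows are invented, and the choice to drop (rather than impute) rows with a blank numeric field is an assumption.

```python
def preprocess(rows):
    """Clean raw records: drop unusable rows and turn strings into numbers."""
    cleaned = []
    for row in rows:
        total = row["TotalCharges"].strip()
        if not total:
            # Entry that cannot be processed (blank numeric field): skip it.
            continue
        cleaned.append({
            # The identifier column is an unwanted element and is left out.
            "tenure": int(row["tenure"]),
            "MonthlyCharges": float(row["MonthlyCharges"]),
            "TotalCharges": float(total),
            # String elements are replaced by numeric codes.
            "Partner": 1 if row["Partner"] == "Yes" else 0,
            "Churn": 1 if row["Churn"] == "Yes" else 0,
        })
    return cleaned


# Invented sample rows for illustration only.
raw_rows = [
    {"customerID": "0001", "tenure": "12", "MonthlyCharges": "29.85",
     "TotalCharges": "358.20", "Partner": "Yes", "Churn": "No"},
    {"customerID": "0002", "tenure": "0", "MonthlyCharges": "52.55",
     "TotalCharges": " ", "Partner": "No", "Churn": "No"},
    {"customerID": "0003", "tenure": "2", "MonthlyCharges": "70.70",
     "TotalCharges": "151.65", "Partner": "No", "Churn": "Yes"},
]

clean_rows = preprocess(raw_rows)
print(len(clean_rows), "usable rows")
```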
Features are extracted from the pre-processed data to build unique feature patterns for the training categories. Each feature pattern is a set of numbers in an M×N matrix that forms a unique combination for a particular case.
In the proposed architecture, feature extraction is handled by the KPCA algorithm, which extracts the features into a matrix. It reduces the dimensionality of the data without much loss of information. Unlike linear methods, it can be applied to datasets that are not linearly separable in the input space.
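For reference, the standard kernel PCA computation can be summarized as below. This is the textbook formulation rather than a description of the authors' exact configuration; the RBF kernel and its parameter $\gamma$ are shown only as an assumed example of a kernel function.

```latex
% Kernel matrix over the n training samples x_1, ..., x_n
K_{ij} = k(x_i, x_j), \qquad
\text{e.g. } k(x, y) = \exp\!\left(-\gamma \lVert x - y \rVert^2\right)

% Centering in feature space (\mathbf{1}_n is the n-by-n matrix with all entries 1/n)
\tilde{K} = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n

% Eigenproblem; each eigenvector is scaled so that
% \lambda_k \, \alpha^{(k)\top} \alpha^{(k)} = 1
\tilde{K} \alpha^{(k)} = \lambda_k \alpha^{(k)}

% k-th extracted feature of a sample x, using the centered kernel \tilde{k}
z_k(x) = \sum_{i=1}^{n} \alpha^{(k)}_i \, \tilde{k}(x_i, x)
```

Keeping only the components with the largest eigenvalues $\lambda_k$ gives the reduced-dimensionality feature matrix.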
KPCA uses a kernel function (KF) to project the data into a high-dimensional feature space in which it becomes linearly separable. The extracted features are then processed by the Ant Lion Optimizer (ALO). ALO is the optimization module of the proposed architecture and helps reduce the error probability of the feature set. A training set with a smaller error probability yields a more accurate trained model. ALO is an iterative process that searches for the best-cost solutions for the input feature patterns.
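The iterative best-cost search can be sketched with a deliberately simplified, ALO-style loop. This is not the authors' implementation and omits details of the full algorithm: the boundary-shrinking schedule, the rule that an ant replaces the antlion at the same index, the 0.5 threshold that turns positions into a feature mask, and the toy cost function are all assumptions made for illustration.

```python
import random


def mask(position):
    """Turn a position in [0, 1]^d into a keep/drop feature mask."""
    return [x > 0.5 for x in position]


def random_walk(steps, rng):
    """Cumulative sum of +/-1 steps starting at 0 (length steps + 1)."""
    walk = [0]
    for _ in range(steps):
        walk.append(walk[-1] + (1 if rng.random() > 0.5 else -1))
    return walk


def walk_value(walk, t, low, high):
    """Min-max normalise the walk into [low, high] and read it at step t."""
    a, b = min(walk), max(walk)
    if a == b:
        return (low + high) / 2
    return (walk[t] - a) * (high - low) / (b - a) + low


def roulette(costs, rng):
    """Pick an index with probability proportional to 1 / (cost + eps)."""
    weights = [1.0 / (c + 1e-9) for c in costs]
    return rng.choices(range(len(costs)), weights=weights, k=1)[0]


def alo_select(cost, dim, n_agents=8, n_iter=30, seed=0):
    """Simplified ALO-style search; returns (mask, cost, elite-cost history)."""
    rng = random.Random(seed)
    antlions = [[rng.random() for _ in range(dim)] for _ in range(n_agents)]
    costs = [cost(mask(p)) for p in antlions]
    best = min(range(n_agents), key=costs.__getitem__)
    elite, elite_cost = antlions[best][:], costs[best]
    history = [elite_cost]
    for t in range(1, n_iter + 1):
        shrink = 1.0 + 10.0 * t / n_iter  # assumed schedule: bounds tighten
        half = 0.5 / shrink
        for i in range(n_agents):
            chosen = antlions[roulette(costs, rng)]
            ant = []
            for d in range(dim):
                vals = []
                # One walk around the selected antlion, one around the elite.
                for centre in (chosen[d], elite[d]):
                    low, high = max(0.0, centre - half), min(1.0, centre + half)
                    walk = random_walk(n_iter, rng)
                    vals.append(walk_value(walk, t, low, high))
                ant.append(sum(vals) / 2)
            c = cost(mask(ant))
            if c < costs[i]:          # a fitter ant replaces the antlion
                antlions[i], costs[i] = ant, c
            if c < elite_cost:        # keep the best solution seen so far
                elite, elite_cost = ant[:], c
        history.append(elite_cost)
    return mask(elite), elite_cost, history


# Toy cost (hypothetical): disagreements with a known "relevant" mask.
TARGET = [True, False, True, False, False, True]


def toy_cost(m):
    return sum(a != b for a, b in zip(m, TARGET))


best_mask, best_cost, history = alo_select(toy_cost, dim=len(TARGET))
print(best_mask, best_cost)
```

In practice the cost function would be a validation error of the classifier trained on the selected features; the toy cost above only makes the sketch self-contained.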
Once all the cases are processed, the data is passed to the initialization module of the prediction model. This module processes the optimized data elements and builds various subsets of the actual dataset. These subsets are used to train, validate, and test the prediction model.
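A minimal sketch of building the three subsets is shown below. The 70/15/15 proportions and the fixed seed are assumptions for illustration; the text does not state the ratios used at this stage.

```python
import random


def split_dataset(samples, train_pct=70, val_pct=15, seed=42):
    """Shuffle once, then cut into train / validation / test subsets."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_train = len(idx) * train_pct // 100
    n_val = len(idx) * val_pct // 100

    def pick(ids):
        return [samples[i] for i in ids]

    return (pick(idx[:n_train]),
            pick(idx[n_train:n_train + n_val]),
            pick(idx[n_train + n_val:]))


# Placeholder samples standing in for optimized feature patterns.
samples = list(range(100))
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))
```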
The training module in this phase is run on the training subset and the labels for those patterns. It produces the MSVM model and stores it on a secondary storage device so that it can be loaded for the test phase.
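Storing a trained model and loading it again for testing can be sketched as follows. The dictionary of parameters is a hypothetical stand-in for a trained MSVM, and the file name is invented; only the save-and-reload pattern is the point.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for trained model parameters (illustration only).
model = {
    "classes": ["No", "Yes"],
    "weights": [[0.4, -1.2], [-0.4, 1.2]],
    "bias": [0.1, -0.1],
}

# Store the model on disk after training.
model_path = os.path.join(tempfile.mkdtemp(), "okmsvm_model.pkl")
with open(model_path, "wb") as fh:
    pickle.dump(model, fh)

# Load it back at the start of the test phase.
with open(model_path, "rb") as fh:
    restored = pickle.load(fh)

print(restored["classes"])
```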
The testing phase of the churn analysis predicts the outcomes for the given input dataset. The test module loads the test subset and the trained model and then proceeds with the prediction method of MSVM. This method is a promising approach for predicting online datasets because MSVMs use a risk-minimization rule that bounds the error.
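The risk-minimization rule can be made concrete with the standard soft-margin SVM objective. The formulation below is the textbook one and rests on two assumptions not stated in the text: that MSVM denotes a multi-class SVM, and that classes are combined with a one-vs-rest scheme. Here $\phi$ is the feature map, $C$ the regularization constant, and $\xi_i$ the slack variables.

```latex
% Soft-margin objective for one binary classifier
\min_{w,\, b,\, \xi} \;\; \tfrac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i \left( w^\top \phi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0

% One-vs-rest prediction over the classes c
\hat{y}(x) = \arg\max_{c} \left( w_c^\top \phi(x) + b_c \right)
```

The first term controls model capacity and the second penalizes training errors, which is the trade-off behind the error bound mentioned above.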
After the prediction module, various performance parameters are used to measure the performance of the proposed architecture. The computed performance metrics are used to build the comparison sets and validate the contribution of the proposed model.

V. SIMULATION RESULT ANALYSIS
This section presents the dataset description, the simulation tool, and the performance metrics, such as accuracy rate and AUC.

A. Dataset Description
The proposed work uses the "WA_Fn-UseC-Telco-Customer-Churn" dataset [14]. The raw data consists of 7043 rows (users) and 21 columns (properties). In the regression and classification tasks, column 21 is used as the target: the churn column holds the target values.

B. Arithmetic Formulas
For the classification task with the Optimized Kernel MSVM, three parameters are used to evaluate churn classification (CC): AUC, accuracy rate (%), and RMSE. The area under the curve may be understood as an aggregate measure of classification performance across all possible classification thresholds. The accuracy rate may be expressed as (1).

For the prediction method with the Optimized Kernel MSVM algorithm, RMSE (Root Mean Square Error) is used to measure the error in the estimated monthly charge. It may be expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n1} \sum_{i=1}^{n1} \left( y1_i - y''1_i \right)^2} \tag{2}$$

For the classification method with the Optimized Kernel MSVM algorithm, the accuracy rate is the ratio of the correctly labeled churn values to the complete dataset. It may be defined as

$$\mathrm{Accuracy} = \frac{tp + tn}{tp + tn + fp + fn} \tag{3}$$

Here, tn = True Negative, fp = False Positive, tp = True Positive, fn = False Negative, n1 is the number of all samples in the telecom dataset, and $y1_i$ and $y''1_i$ are the intended and expected values, respectively.

The database is divided into two parts, a training part and a testing part, so that the approaches can be assessed independently. The training part is used to train the models, while the testing part is used to evaluate the models' performance. A 10-fold cross-validation methodology is adopted for both the churn classification (CC) and forecasting tasks. The data is divided into ten subgroups with an equal amount of data in each. One subgroup is held out to evaluate the model, while the remaining subgroups are used for training. This procedure is repeated until every subgroup has been used for evaluation once and the training phase has been completed.
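The accuracy rate, the RMSE, and the 10-fold partitioning described above can be sketched in plain Python. The function names are illustrative, the confusion-matrix counts and value lists in the usage lines are made-up inputs, and the folds are taken in order without shuffling, which is an assumption.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Share of correctly labeled samples among all samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def rmse(actual, predicted):
    """Root mean square error between intended and expected values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx); each fold serves as the test set once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Made-up inputs for illustration.
print(accuracy(tp=50, tn=40, fp=5, fn=5))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))

# Fold sizes for a dataset with 7043 rows.
folds = list(k_fold_indices(7043, k=10))
print([len(test) for _, test in folds])
```

When the row count is not divisible by ten, the first few folds receive one extra sample so that every row is tested exactly once.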

C. Implementation Results
This section describes the results of the churn analysis. The research work was implemented as a script in the Python language and is organized into two modules, a train module and a test module. The training module is created using Python. A user interface is provided for human-machine interaction so that the customer can easily click and see the output. The research predicts the test-set classes using the trained Optimized Kernel MSVM model and evaluates the accuracy rate and AUC score shown in Fig. 2. Generally, the classification and regression methods will perform better than the existing model, as the model is able to learn patterns from the data. The proposed parameters are defined in Table III and shown in Fig. 2, 3 Initially, the proposed model uses the train samples to construct a telecommunication-sector customer churn analysis model with kernel PCA (principal component analysis) feature extraction. Table III defines the proposed parameters of various churn prediction methods. The proposed work applies FE (feature extraction) using the KPCA model on all data to extract the feature values of the train samples and test data, and then uses the train samples with the optimized, reduced-dimensionality feature sets to build the OKMSVM (Optimized Kernel MSVM) model.

VI. CONCLUSION AND FUTURE WORK
This paper proposed a hybrid classification model named the OKMSVM model. The proposed model is designed for CCA (customer churn analysis) in the telecommunication sector. To obtain informative numerical features, feature extraction is handled by the KPCA algorithm in the proposed architecture. The extracted features are then processed by the Ant Lion Optimizer (ALO). ALO is the optimization module of the proposed architecture and helps reduce the error probability of the feature set. A training set with a smaller error probability yields a more accurate trained model. ALO is an iterative process that searches for the best-cost solutions for the input feature patterns. Once all the cases are processed, the data is passed to the initialization module of the prediction model. This module processes the optimized data elements and builds various subsets of the actual dataset. These subsets are used to train, validate, and test the prediction model. The training module is run on the training subset and the labels for those patterns. It produces the MSVM model and stores it on a secondary storage device so that it can be loaded for the test phase. The test module loads the test subset and the trained model and then proceeds with the prediction method of MSVM. Compared with existing models, the proposed model achieves the maximum accuracy rate and AUC and an optimized root mean square error. In future, researchers can design novel ML-based and SC (soft computing) techniques that are more reliable for enhancing performance metrics. Furthermore, a forecasting framework will be developed to identify individuals likely to churn. Approaches such as regression analysis, decision trees, and neural networks will be used to build multiple forecasting models. Generalization ability can also be assessed by applying various FS (feature selection) and classification techniques to other databases.