Customer Churn Prediction Model and Identifying Features to Increase Customer Retention based on User Generated Content

Customer churn is a problem for most companies because revenue is lost whenever a customer switches from one service provider to another, as happens in the telecom sector. To address this problem we propose two main approaches: the first identifies the main factors that affect customer churn, and the second detects customers who have a high probability of churning by analyzing social media. For the first approach we built a dataset through questionnaires and analyzed it using machine learning algorithms: Deep Learning, Logistic Regression, and Naïve Bayes. The second approach is a customer churn prediction model that analyzes customers' opinions through their user-generated content (UGC), such as comments, posts, messages, and product or service reviews. To analyze the UGC we used sentiment analysis to find the text polarity (negative/positive). The results show that the algorithms had the same accuracy but differed in how they ranked the attributes according to their weights in the decision.
Keywords—Customer churn; telecom sector; churn prediction; sentiment analysis; machine learning; customer retention


I. INTRODUCTION
With the enormous increase in the number of customers using the communication sector and in the number of companies [1], competition between companies has intensified [2,3]. Each company tries to survive this competition through several strategies [4]. The main strategies are: 1) upselling to existing customers, 2) increasing the retention period of their customers, and 3) acquiring new customers. Companies seek to retain their customers because retained customers are a source of profit, and it is cheaper to keep a customer than to acquire a new one. Each company therefore tries to keep its customers by making them more loyal. Customers are great ambassadors in the market [5], as they can advertise the company's product or service.
This free advertising costs nothing beyond high product quality and good after-sale service.
To preserve its revenue, each company should predict which customers are likely to leave, and it must do so early [6].
Many studies have confirmed that machine learning is effective at prediction because it learns from past situations or previous data [7,8,9].
Customer churn happens when customers are not satisfied with a service provided by a company, which results in customers switching to another service provider. Customers have different reasons for churning, so they should not all be treated in the same way. There is a need for a model that predicts churning customers and provides a retention strategy that depends on their churn factors [7].
Because vast numbers of people use social media to express their opinions, whether by text, emotion, picture, or video, we use sentiment analysis to analyze and classify every comment as positive, neutral, or negative [10]. We then forward negative comments to the customer retention department so that it can try to retain the customer.
In this paper, we propose two approaches that help companies keep their customers: identifying the top reasons for churn, and detecting customers who are likely to churn before they take the churn action.
The rest of this paper is organized as follows: Section 2 presents the literature review. The background of the technique is presented in Section 3. The proposed model for churn detection from social media is briefly explained in Section 4. Experimental results and analysis are discussed in Section 5. We conclude the proposed study in Section 6.

II. LITERATURE REVIEW
This section presents the literature related to customer retention and its prediction methods, the factors related to customer retention, and the use of social media to gather users' opinions to enhance retention. Don Jyh-Fu Jeng and Thomas Bailey used a hybrid multiple criteria decision-making (MCDM) method to inspect a customer retention framework, and they found that the most common response was to look at pricing and customer service [11]. Ali Tamaddoni Jahromi, Stanislav Stakhovych, and Michael Ewing used models for churn prediction in a B2B context to increase the profitability of retention campaigns; they applied a boosting model, logistic regression, and cost-sensitive and simple decision trees in their tests [12]. Nitish Varshney and S.K. Gupta used social media analytics to gather users' opinions through Twitter. The tweets were classified into three categories using a lexicon-based classifier, and association rule mining was applied to find the dominant churn factor [13]. J. Vijaya, E. Sivasankar, and S. Gayathri proposed that ensemble classification techniques with hybrid fuzzy clustering provide higher accuracy and better performance than single classifiers and clustering [14]. Amin, A., Al-Obeidat, F., Shah, B., Adnan, A., Loo, J., and Anwar, S. used a distance factor in the classifier decision [5]. Hossain, M.A., Chowdhury, M.R., and Jahan, N. supported the importance of customer satisfaction in building a relationship between buyer and seller, using a model restricted to four constructs (price, network, customer care, and brand image) to explain customer satisfaction [15]. Adnan, Sajid, Awais, M. Nawaz, K. Alawfi, Amir, and Kaizhu proposed a practical approach to classify, predict, and extract important decision rules related to customer churn using an intelligent rule-based decision-making technique based on rough set theory (RST).
Experiments were carried out to evaluate the performance of RST using the Exhaustive Algorithm (EA), Genetic Algorithm (GA), Covering Algorithm (CA), and the LEM2 algorithm (LA). The results show that RST based on GA is the most efficient technique for extracting implicit knowledge in the form of decision rules [16]. J. Vijaya and E. Sivasankar used RST with other techniques such as Bagging, Random Subspace, and Boosting. Boosting achieved the highest accuracy of 93.73%. They found that ensemble classification techniques work better, with a classification accuracy of 95.13%, than any single model [17]. Abhishek and Ratnesh trained four machine learning models (Logistic Regression, Support Vector Machine, Random Forest, and Gradient Boosted Tree) and found that the Gradient Boosted Tree performed best, Random Forest and Logistic Regression were average, and SVM underperformed the other models [18].
Most previous studies used classical machine learning techniques; for that reason, we used a more recent technique, deep learning, to identify churn factors accurately. We also used the power of social media for early detection of customer churn through sentiment analysis, which is used in many areas of business analytics.
In Section 3, we will discuss the algorithms we used in our study and sentiment analysis technique.

III. BACKGROUND OF TECHNIQUE
Machine learning offers many algorithms; we discuss only the three algorithms used in our study, along with the sentiment analysis technique.

A. Machine Learning Techniques-Classification Methods
Many approaches have been applied to predict churn in the telecom sector; most of them use machine learning and data mining. The techniques considered for customer churn prediction are:

a) Deep Learning Algorithm
Deep learning is a subset of machine learning based on neural networks that permit a machine to train itself to perform a task [19].

b) Naïve Bayes
The Naïve Bayes classifier, also known as simple Bayes or independence Bayes, is a simple probabilistic classifier. This method assumes independence between the input variables, but it performs well even under conditions that might be considered suboptimal for the algorithm [20,21].
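
To make the independence assumption concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier. The feature names, the toy records, and the use of Laplace smoothing are illustrative assumptions; they are not the survey data or the implementation used in this study.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for record, label in zip(records, labels):
        for feature, value in record.items():
            value_counts[label][feature][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values


def predict_naive_bayes(model, record):
    """Return the label with the highest log-posterior under independence."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for feature, value in record.items():
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = value_counts[label][feature][value] + 1
            denominator = count + len(feature_values[feature])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the questionnaire.
toy_records = [
    {"line_type": "personal", "complained": "yes"},
    {"line_type": "personal", "complained": "yes"},
    {"line_type": "business", "complained": "no"},
    {"line_type": "business", "complained": "no"},
]
toy_labels = ["churn", "churn", "stay", "stay"]
nb_model = train_naive_bayes(toy_records, toy_labels)
```

Each feature contributes its own factor to the score, which is exactly where the independence assumption enters: no term depends on more than one input variable.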

c) Logistic Regression
Logistic Regression is a traditional machine learning algorithm that originated in statistics. It is used for classification problems, as it models the relationship between the predictor variables and the output variable [21,22].
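
As a hedged illustration of how logistic regression relates predictors to a binary outcome, the sketch below fits a one-feature model with plain batch gradient descent. The toy data, the learning rate, and the number of epochs are assumptions chosen for the example and do not reflect the settings used in this study.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(features, weights, bias):
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = churn_probability(x, weights, bias) - y
            grad_b += error
            for j in range(n_features):
                grad_w[j] += error * x[j]
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical toy data: one binary predictor (1 = customer reported a problem).
toy_rows = [[1.0], [1.0], [1.0], [0.0], [0.0], [0.0]]
toy_churn = [1, 1, 0, 0, 0, 1]
lr_weights, lr_bias = train_logistic_regression(toy_rows, toy_churn)
```

In this toy set two of the three customers with a problem churned and one of the three without a problem churned, so the fitted probabilities should approach those two proportions.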

B. Sentiment Analysis
Sentiment analysis, also known as opinion mining, is one of the most important techniques used in social media. This technique extracts the opinions that internet users express in several forms (such as emotions, texts, pictures, and videos), analyzes their orientation, and classifies them into positive, neutral, or negative sentiments [23,24].

A. Customer Churn Prediction Model
This paper makes two main contributions: the first is a model for customer churn prediction that analyzes user-generated content, and the second is the identification of the main attributes that help the retention department keep its customers and prevent them from churning. The customer churn prediction model using UGC is shown in Fig. 1. The proposed model consists of the following steps:
Step 1: The user creates user-generated content; this content could be a post, an opinion, or a comment.
Step 2: An English treebank applies text preprocessing, stemming, and lemmatization to the English text to extract the essential words in their basic form.
Step 3: Sentiment analysis classification: the extracted English words enter the classification process, which measures the polarity of each word; each text is then classified according to its similarity with each class (positive, negative, neutral).
Step 4: The user's comment is classified as positive or negative; if the comment is negative and the churn probability is high, an alarm is sent to the customer retention department with the user's ID so that it can contact the customer and try to retain them.
Step 5: The output of the proposed model is the sentimentally classified English text.
For example, consider the comment "I hate this company."
The model divides this sentence into four parts and then classifies each word as positive, negative, or neutral. In the first example there is one negative word and no positive word, so the comment is classified as negative. In the second example there is one positive word and no negative word, so the comment is classified as positive. In the last example there are two positive and two negative words, and the comment is classified as negative. Classification is performed on both the words and the sentence so that the polarity is identified more accurately.
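
The word-level and sentence-level classification described above can be sketched as follows. This is a simplified illustration only: the tiny lexicon, the tokenizer, and the `retention_alert` helper are hypothetical, stemming and lemmatization are omitted, and the churn-probability condition of Step 4 is not modeled because the text does not define how it is computed. A tie between positive and negative counts is resolved as negative, mirroring the two-positive/two-negative example.

```python
import re

# Hypothetical mini-lexicon; a real system would use a full sentiment lexicon.
POSITIVE_WORDS = {"love", "great", "good", "fast", "thanks"}
NEGATIVE_WORDS = {"hate", "bad", "slow", "worst", "terrible"}


def word_polarity(word):
    """Classify a single word as positive, negative, or neutral."""
    if word in POSITIVE_WORDS:
        return "positive"
    if word in NEGATIVE_WORDS:
        return "negative"
    return "neutral"


def classify_comment(text):
    """Classify a comment from the polarity counts of its words."""
    words = re.findall(r"[a-z']+", text.lower())
    polarities = [word_polarity(w) for w in words]
    positives = polarities.count("positive")
    negatives = polarities.count("negative")
    if positives == 0 and negatives == 0:
        label = "neutral"
    elif negatives >= positives:
        label = "negative"  # ties count as negative
    else:
        label = "positive"
    return label, polarities


def retention_alert(user_id, text):
    """Return an alert record for negative comments, otherwise None."""
    label, _ = classify_comment(text)
    if label == "negative":
        return {"user_id": user_id, "comment": text, "sentiment": label}
    return None
```

For the comment "I hate this company." the sketch yields four word-level labels (neutral, negative, neutral, neutral) and a negative sentence-level label, so `retention_alert` would return a record carrying the user's ID.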

B. Identifying Customer Churn Factors
We created a questionnaire and distributed it among customers to build a dataset that can be analyzed by machine learning algorithms. The dataset covers customers of different telecom companies and includes most of the known factors that may later affect their decision to churn or not.

Description of the survey's attributes
Gender: gender of the customer.
Age: age of the customer.
The company of cellular communication service: name of the company the customer subscribes to.
For how many years a customer in the company: number of years the customer has been with the company.
Customer's line type: line type, whether personal, business, or corporate.
Use to make a complaint: the channel the customer used to make a complaint.
A problem happened with the company: the problem the customer experienced.
The company tried to retain the customer: the company's response.
Churn: whether the customer ports out.
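
Because the survey attributes are categorical, they need a numeric representation before most algorithms can use them. The sketch below shows one common option, one-hot encoding into "attribute = value" indicator columns. Whether this exact encoding was used in the study is an assumption, and the sample responses are hypothetical.

```python
def one_hot_encode(responses):
    """Turn categorical answers into 'attribute = value' indicator columns."""
    columns = sorted(
        {f"{attribute} = {value}" for row in responses for attribute, value in row.items()}
    )
    encoded = []
    for row in responses:
        present = {f"{attribute} = {value}" for attribute, value in row.items()}
        encoded.append([1 if column in present else 0 for column in columns])
    return columns, encoded


# Hypothetical responses; attribute names follow the survey description.
sample_responses = [
    {"Gender": "Female", "Customer's line type": "Personal", "Churn": "Yes"},
    {"Gender": "Male", "Customer's line type": "Business", "Churn": "No"},
]
feature_names, feature_rows = one_hot_encode(sample_responses)
```

Each respondent becomes a row of 0/1 values, one per observed attribute value, which can then be passed to a classifier or compared against the churn label.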
In the next section, we will present experimental results and analysis of proposed solutions.

V. EXPERIMENTAL RESULTS AND DISCUSSION
In this section, we present the experiments and results of the proposed study. Our work is divided into two experiments. The first experiment analyzed users' comments from social media through sentiment analysis. The second experiment analyzed the dataset using Naïve Bayes, Logistic Regression, and Deep Learning. We then used the correlation coefficient to find the factor that most affects the churn decision.
www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 5, 2020

A. Experiments
First, we targeted online social media users' comments from the Vodafone UK page. We collected 352 comments over six months from both male and female users, most of them aged 20-35, and used sentiment analysis to classify each comment as positive, negative, or neutral. Then, we analyzed our survey with three algorithms (DL, NB, LR) to find the dominant churn factor.

B. Results of the Experiment
Here are the results of both customers' comments and dataset analysis.

a) Results of Comments
Some of the comments we collected from social media (i.e., the Vodafone UK Facebook page) are shown in Table III.
[Attribute-weight table fragment: The company tried to retain you, 0.0394301]

c) Result of Correlation
After running the algorithms, we computed, for each algorithm, the correlation between each attribute and the label attribute (the churn decision, yes or no). This gives the weight of each attribute and identifies the value with the highest weight, that is, the one that most affects customers' decisions.
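
One way to obtain such correlation-based weights is sketched below, assuming each attribute value is represented as a 0/1 indicator column and the churn label as 0/1. The use of the Pearson coefficient, the absolute value, and the toy columns are assumptions made for illustration; the exact weighting procedure of the study may differ.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return covariance / (spread_x * spread_y)


def rank_by_weight(indicator_columns, churn_labels):
    """Sort attribute values by absolute correlation with the churn label."""
    weights = {
        name: abs(pearson_correlation(column, churn_labels))
        for name, column in indicator_columns.items()
    }
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Hypothetical indicator columns (1 = respondent gave that answer).
toy_columns = {
    "problem = credit withdrawn": [1, 1, 0, 0],
    "gender = female": [1, 0, 1, 0],
}
toy_churn_labels = [1, 1, 0, 0]
ranking = rank_by_weight(toy_columns, toy_churn_labels)
```

In this toy case the first column matches the churn label exactly and ranks first, while the second column is uncorrelated with churn and receives a weight of zero.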

C. Experiment Results Summary
Three main points summarize our results:

a) UGC Analysis
The result showed the sentiment of each comment: the model divided each sentence into parts and then classified the sentence as positive, negative, or neutral, with a percentage.

b) Dataset Analysis
The attribute weights differed from one algorithm to another. In Naïve Bayes, the user's line type has the highest weight; in Logistic Regression, gender has the highest weight; and in Deep Learning, the customer's age has the highest weight.
The number of years the customer has been with the company has the lowest weight in Naïve Bayes. In contrast, the customer's line type has the lowest weight in Logistic Regression, and in Deep Learning the company's attempt to retain the customer has the lowest weight.

c) Correlation Analysis
The correlation results show which attribute values have a strong effect on the customer's decision to churn or not. The most influential value of the 'A problem happened for you with the company' attribute was 'Withdraw from the credit without using it' in the Naïve Bayes, Deep Learning, and Logistic Regression correlations, which means it has the highest weight in customers' decisions. After it, the most effective way to retain customers was offering free minutes. All algorithms produced the same order of values.

VI. CONCLUSION
The importance of this research comes from the importance of customer churn in the telecom sector. Reducing churn helps companies make more profit, since retained customers are among the most important sources of income for the telecom sector. Hence, we built a model to analyze the behavior of customers and predict which customers are likely to churn. In this study, we used Deep Learning, Naïve Bayes, and Logistic Regression to identify the most important factors that affect the churn decision. We then computed the correlation for these algorithms to find the attribute value with the highest weight in the decision. Moreover, we used sentiment analysis to analyze and classify customers' opinions in the UGC. Based on the survey and the comments, a customer at risk of churning can be referred, by customer ID, to the customer retention department so that it can try to retain them.

Attribute Weight
A problem happened for you with the company = Withdraw from the credit without using it 0.1673900
The company tried to retain you = Offered minutes for free 0.1556922
A problem happened for you with the company = The internet was cut 0.1321921
The company tried to retain you = Offered more internet megabits 0.1321921
A problem happened for you with the company = Problem in payment plan 0.1096181
A problem happened for you with the company = The internet was very slowly 0.1096181
The company tried to retain you = Offered free gigabits 0.1096181
The company tried to retain you = Reduce the invoice 0.1096181
The company tried to retain you = By offers 0.1068224
age = 36-45 0.1027885
The company tried to retain you = Solve the problem 0.0924705
The company tried to retain you = Return back the credit 0.0850286
The company tried to retain you = Gave me new Sim card 0.0518476
The company tried to retain you = Offered a better plan of payment 0.0518476
The company tried to retain you = Offered an internet modem for free 0.0518476
A problem happened for you with the company = Way of payment for internet service 0.0364868
The company tried to retain you = Come back my flex 0.0364868
The company tried to retain you = Corrected the invoice 0.0364868
The company tried to retain you = I didn't mention the problem 0.0364868
The company tried to retain you = Made a discount on the new package and return the amount 0.0364868
The company tried to retain you = Not enough time 0.0364868
The company tried to retain you = Regular maintenance 0.0364868
The company tried to retain you = Went to another store 0.0364868

The company tried to retain you = Gave me new Sim card 0.051848
The company tried to retain you = Offered a better plan of payment 0.051848
The company tried to retain you = Offered an internet modem for free 0.051848
A problem happened for you with the company = Way of payment for internet service 0.036487
The company tried to retain you = Come back my flex 0.036487
The company tried to retain you = Corrected the invoice 0.036487
The company tried to retain you = I didn't mention the problem 0.036487
The company tried to retain you = Made a discount on the new package and return the amount 0.036487
The company tried to retain you = Not enough time 0.036487
The company tried to retain you = Regular maintenance 0.036487
The company tried to retain you = Went to another store 0.036487

The company tried to retain you = Justified the reason and apologized 0.06381
The company tried to retain you = Offered a free bundle 0.06381
The company tried to retain you = Offered to pay only half of my bill for 6 month 0.06381
A problem happened for you with the company = Errors in the invoice 0.05247
A problem happened for you with the company = Extra money charged 0.05185
A problem happened for you with the company = Problem in the Sim card 0.05185
A problem happened for you with the company = The package was ended without inform me 0.05185
The company tried to retain you = Brought my money back 0.05185
The company tried to retain you = Gave me new Sim card 0.05185
The company tried to retain you = Offered a better plan of payment 0.05185
The company tried to retain you = Offered an internet modem for free 0.05185
A problem happened for you with the company = Way of payment for internet service 0.03649
The company tried to retain you = Come back my flex 0.03649
The company tried to retain you = Corrected the invoice 0.03649
The company tried to retain you = I didn't mention the problem 0.03649
The company tried to retain you = Made a discount on the new package and return the amount 0.03649
The company tried to retain you = Not enough time 0.03649
The company tried to retain you = Regular maintenance 0.03649
The company tried to retain you = Went to another store 0.03649
Your line type = Business 0.03288
use to make complaint = Customer Service Representative 0.02800
The company tried to retain you = Nothing happened 0.01251
Your line type = Personal 0.01071
A problem happened for you with the company = Problem in mobile data 0.01071
For how many years are you a customer of the company = From 5 to 10 years 0.01046
Gender = Female 0.00654