Application of K-means Algorithm for Efficient Customer Segmentation: a Strategy for Targeted Customer Services

The emergence of many business competitors has engendered severe rivalry among businesses in gaining new customers and retaining old ones. Consequently, exceptional customer service has become essential, regardless of the size of the business. Furthermore, the ability of a business to understand each of its customers' needs gives it greater leverage in providing targeted customer services and developing customised marketing programs. This understanding is made possible through systematic customer segmentation, in which each segment comprises customers who share similar market characteristics. The ideas of Big data and machine learning have fuelled wide adoption of automated approaches to customer segmentation in preference to traditional market analyses, which are often inefficient, especially when the number of customers is very large. In this paper, the k-Means clustering algorithm is applied for this purpose. A MATLAB program of the k-Means algorithm was developed (available in the appendix) and trained using a z-score normalised two-feature dataset of 100 training patterns acquired from a retail business. The features are the average amount of goods purchased by a customer per month and the average number of customer visits per month. From the dataset, four customer clusters or segments were identified with 95% accuracy.


I. INTRODUCTION
Over the years, the increase in competition amongst businesses and the availability of large historical data repositories have prompted the widespread application of data mining techniques to uncover valuable and strategic information buried in organisations' databases. Data mining is the process of extracting meaningful information from a dataset and presenting it in a human-understandable format for the purpose of decision support. Data mining techniques intersect areas such as statistics, artificial intelligence, machine learning and database systems. The applications of data mining include, but are not limited to, bioinformatics, weather forecasting, fraud detection, financial analysis and customer segmentation. The thrust of this paper is to identify customer segments in a retail business using a data mining approach. Customer segmentation is the subdivision of a business's customer base into groups called customer segments, such that each segment consists of customers who share similar market characteristics. This segmentation is based on factors that can directly or indirectly influence the market or business, such as product preferences or expectations, locations, behaviours and so on. The benefits of customer segmentation include, inter alia, the ability of a business to customise market programs that will be suitable for each of its customer segments; business decision support in risky situations such as credit relationships with its customers; identification of the products associated with each segment and how to manage the forces of demand and supply; unravelling latent dependencies and associations amongst customers, amongst products, or between customers and products, of which the business may not be aware; the ability to predict customer defection and which customers are most likely to defect; and raising further market research questions as well as providing directions to finding the solutions.
Clustering has proven efficient in discovering subtle but tactical patterns or relationships buried within a repository of unlabelled data. This form of learning is classified as unsupervised learning. Clustering algorithms include the k-Means algorithm, the Self-Organising Map (SOM) and so on. These algorithms, without any prior knowledge of the dataset, are capable of identifying clusters therein by repeated comparison of the input patterns until stable clusters in the training examples are achieved, based on the clustering criterion or criteria. Each cluster contains data points that are very similar to one another but differ considerably from the data points of other clusters. Clustering has immense applications in pattern recognition, image analysis, bioinformatics and so on. In this paper, the k-Means clustering algorithm has been applied to customer segmentation. A MATLAB program (Appendix) of the k-Means algorithm was developed, and the training was realised using a z-score normalised two-feature dataset of 100 training patterns acquired from a retail business. The two features considered in the clustering are the average amount of goods purchased by a customer per month and the average number of customer visits per month. After several iterations, four stable clusters or customer segments were identified and labelled thus: High-Buyers-Regular-Visitors (HBRV), High-Buyers-Irregular-Visitors (HBIV), Low-Buyers-Regular-Visitors (LBRV) and Low-Buyers-Irregular-Visitors (LBIV). Furthermore, the cluster of any input pattern that was not in the training set can be extrapolated by normalising the pattern and computing its similarity to each of the cluster centroids. The pattern is then assigned to the cluster with which it has the closest similarity.
www.ijarai.thesai.org

A. Customer Segmentation
Over the years, the commercial world has become more competitive; organisations therefore have to satisfy the needs and wants of their customers and attract new customers in order to enhance their businesses [1]. Identifying and satisfying the needs and wants of each customer in a business is a very complex task, because customers may differ in their needs, wants, demography, geography, tastes and preferences, behaviours and so on. As such, it is a wrong practice to treat all customers equally in business. This challenge has motivated the adoption of customer segmentation or market segmentation, in which the customers are subdivided into smaller groups or segments wherein the members of each segment show similar market behaviours or characteristics. According to [2], customer segmentation is a strategy of dividing the market into homogeneous groups. [3] posits that "the purpose of segmentation is the concentration of marketing energy and force on subdivision (or market segment) to gain a competitive advantage within the segment. It's analogous to the military principle of concentration of force to overwhelm an enemy." Customer or market segmentation includes geographic segmentation, demographic segmentation, media segmentation, price segmentation, psychographic or lifestyle segmentation, distribution segmentation and time segmentation [3].

B. Big Data
Recently, research in Big data has gained momentum. [4] defines Big data as "the word describing the large volume of both structured and unstructured data, which cannot be analyzed using traditional techniques and algorithm." According to [5], "the amount of data in our world has been exploding. Companies capture trillions of bytes of information about their customers, suppliers, and operations, and millions of networked sensors are being embedded in the physical world in devices such as mobile phones and automobiles, sensing, creating, and communicating data." Big data has demonstrated the capacity to improve predictions, save money, boost efficiency and enhance decision-making in fields as disparate as traffic control, weather forecasting, disaster prevention, finance, fraud control, business transactions, national security, education and health care [6]. Big data is mainly characterised by three V's, namely volume, variety and velocity. Two further V's, veracity and value, bring the total to five [4]. Volume refers to the vast amount of data, in zettabytes or brontobytes, being generated per minute; velocity refers to the speed at which new data is created or at which existing data moves around; variety refers to the different types of data; veracity describes the degree of messiness or trustworthiness of the data; and value refers to the worth of the information that can be mined from the data. The last V, value, is what makes Big data and data mining interesting to businesses and organisations.

C. Clustering and k-Means Algorithm
According to [7], clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). [8] opined that clustering algorithms generate clusters whose data objects are similar with respect to some characteristics. Clustering is extensively used in many areas such as pattern recognition, computer science, medicine and machine learning. [6] states that "formally cluster structure is represented as a set of subset $C = C_1, \ldots, C_k$ of $S$, such that $S = \bigcup_{i=1}^{k} C_i$ and $C_i \cap C_j = \emptyset$ for $i \neq j$. Consequently, instances in S belong to exactly one and only one subset". Clustering algorithms have been classified into hierarchical and partitional clustering algorithms. Hierarchical clustering algorithms create clusters based on some hierarchy. They are based on the idea of objects being more related to nearby objects than to objects farther away [6]. Hierarchical clustering can be top-down or bottom-up: the top-down approach is referred to as divisive, while the bottom-up approach is known as agglomerative. Partitional clustering algorithms create various partitions and then evaluate them by some criterion. The k-Means algorithm is one of the most popular partitional clustering algorithms [4]. It is a centroid-based algorithm in which each data point is placed in exactly one of K non-overlapping clusters, where K is selected before the algorithm is run. The algorithm seeks the partition that minimises the within-cluster sum of squares:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_p \in S_i} \lVert x_p - \mu_i \rVert^2 \qquad (1)$$
where $\mu_i$ is the centroid or mean of the data points in cluster $S_i$.
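The within-cluster sum of squares can be made concrete with a few lines of code. The following is a minimal Python sketch, not the authors' MATLAB program; the function names `centroid` and `wcss` and the toy partitions `good` and `bad` are hypothetical and chosen only for illustration.

```python
def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def wcss(clusters):
    """Within-cluster sum of squares of a partition.

    `clusters` is a list of clusters; each cluster is a list of points.
    """
    total = 0.0
    for points in clusters:
        mu = centroid(points)
        # Add the squared Euclidean distance of every member to its centroid.
        total += sum(sum((x - m) ** 2 for x, m in zip(p, mu)) for p in points)
    return total


# Hypothetical toy data: the same four points partitioned in two ways.
good = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]
bad = [[[0.0, 0.0], [10.0, 0.0]], [[0.0, 2.0], [10.0, 2.0]]]

print(wcss(good))  # 4.0
print(wcss(bad))   # 100.0
```

The partition that groups nearby points together has the smaller value, which is the sense in which k-Means prefers it.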

III. METHODOLOGY
The data used in this paper was collected from a mega retail business outfit that has many branches in Akwa Ibom state, Nigeria. The dataset consists of 2 attributes and 100 tuples, representing 100 selected customers. The two attributes are the average amount of goods purchased by a customer per month and the average number of customer visits per month. Four steps were adopted in realising an accurate result: feature normalisation, followed by the centroids initialisation step, the assignment step and the updating step, which are the three major generic steps of the k-Means algorithm. In the z-score normalisation of equation (2), $x'$ is the normalised value of $x$ in feature vector $f$, $\mu_f$ is the mean of feature vector $f$, and $\sigma_f$ is the standard deviation of feature vector $f$.

B. Centroids Initialisation
The initial centroids, or means, are chosen first. Figure 1 presents the initialisation of the cluster centres. Four cluster centres, shown in different shapes, were selected using the Forgy method. In the Forgy method of initialisation, k data points (in this case k = 4) are randomly selected as the cluster centroids.
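Forgy initialisation as described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' MATLAB code; the function name `forgy_init`, the `seed` parameter and the sample `points` are assumptions introduced here.

```python
import random


def forgy_init(points, k, seed=None):
    """Pick k distinct data points at random to serve as initial centroids."""
    rng = random.Random(seed)  # a seed is used only to make the sketch repeatable
    return [list(p) for p in rng.sample(points, k)]


# Hypothetical normalised data points (two features each).
points = [
    [-1.2, -0.9], [-0.8, -1.1], [1.1, 0.9], [0.9, 1.3],
    [-1.0, 1.0], [-0.7, 1.2], [1.2, -1.0], [0.8, -0.8],
]
initial = forgy_init(points, k=4, seed=42)
print(initial)
```

Because the centroids are drawn at random, different runs may converge to different clusterings; fixing the seed is one simple way to make an experiment reproducible.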

C. Assignment Stage
In the assignment stage, each data point is assigned to the cluster whose centroid yields the least within-cluster sum of squares compared with the other clusters. That is, the squared Euclidean norm of each data point from each of the current centroids is computed, and the data point is then assigned to the cluster that gives the minimum squared Euclidean norm. This is expressed mathematically in equation (3):
$$S_i^{(t)} = \left\{ x_p : \lVert x_p - \mu_i^{(t)} \rVert^2 \le \lVert x_p - \mu_j^{(t)} \rVert^2 \ \ \forall j,\ 1 \le j \le k \right\} \qquad (3)$$
where each data point $x_p$ is assigned to only one cluster or set $S_i^{(t)}$ at iteration t.
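The assignment rule can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `squared_distance` and `assign` and the sample values are hypothetical, and ties are broken in favour of the lowest cluster index so that each point lands in exactly one cluster.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


# Hypothetical data: two points near (0, 0) and two near (5, 5).
points = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
labels = assign(points, centroids)
print(labels)  # [0, 0, 1, 1]
```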

D. Updating Stage
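After each iteration, a new centroid is computed for each cluster as the mean of all the data points present in the cluster, as shown in equation (4):
$$\mu_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_p \in S_i^{(t)}} x_p \qquad (4)$$
where $\mu_i^{(t+1)}$ is the updated centroid.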
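A short Python sketch of the updating stage follows. It is an illustration rather than the authors' code; the function name `update_centroids` and the sample data are hypothetical. Keeping the previous centroid when a cluster has no members is an assumption made here, since the text does not say how empty clusters are handled.

```python
def update_centroids(points, labels, old_centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    new_centroids = []
    for i, old in enumerate(old_centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(old))
            continue
        # zip(*members) groups the values feature by feature.
        new_centroids.append([sum(col) / len(members) for col in zip(*members)])
    return new_centroids


# Hypothetical data: two clusters of two points each.
points = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
labels = [0, 0, 1, 1]
updated = update_centroids(points, labels, [[0.0, 0.0], [10.0, 10.0]])
print(updated)  # [[1.0, 0.0], [10.0, 11.0]]
```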
Fig. 2 presents the positions of the centroids and the updated assignment of their cluster members after the 30th iteration. The members of each cluster are drawn with the same shape as their cluster centroid. Table I shows the changes in the cluster centroids from the initialisation stage (0th iteration) to the 5th iteration.

IV. RESULTS AND DISCUSSION
The k-Means clustering algorithm converged after 100 iterations. That is, the cluster centroids became stable. Figure 3 shows the graph of the converged data points and centroids. After this, the k-Means algorithm was able to cluster almost all of the data points correctly. The centroids or the cluster vectors after convergence are: Each of the clusters represents a customer segment. From Figure 3, the data points at the top right-hand corner represent HBRV; the data points at the top left-hand corner represent HBIV; the data points at the lower right-hand corner represent LBRV; while those at the lower left-hand corner represent LBIV. This is clearly shown in Table II.

V. PERFORMANCE EVALUATION
The purity measure was used to assess the extent to which a cluster contains a single class of data points. The purity of each cluster is computed with equation (5). The total purity of the whole clustering, i.e. considering all the clusters, is given by equation (6).
where D is the total number of data points being classified.
The confusion matrix is presented in Table III. Since the total purity is 0.95 (from row 6, column 6 of Table III), the clustering algorithm was 95% accurate in performing the customer segmentation.
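Under the usual definition of purity, which the description above appears to follow, the purity of a cluster is the fraction of its members that belong to its most common class, and the total purity is the size-weighted average of the cluster purities (equivalently, the sum of the per-cluster majority counts divided by D). The Python sketch below computes both from a confusion matrix. The function name `purity` and the counts are hypothetical and are not the values of Table III.

```python
def purity(confusion):
    """Per-cluster and total purity from a confusion matrix.

    confusion[j][i] is the number of points of true class i placed in cluster j.
    """
    per_cluster = [max(row) / sum(row) for row in confusion]
    total_points = sum(sum(row) for row in confusion)  # D in the text
    total = sum(max(row) for row in confusion) / total_points
    return per_cluster, total


# Hypothetical 2-cluster, 2-class example (not the paper's data).
confusion = [
    [9, 1],  # cluster 0: 9 points of class 0, 1 point of class 1
    [2, 8],  # cluster 1: 2 points of class 0, 8 points of class 1
]
per_cluster, total = purity(confusion)
print(per_cluster, total)
```

In this toy example the two clusters have purities of 0.9 and 0.8, and 17 of the 20 points sit in their cluster's majority class.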

VI. CONCLUSIONS
This paper has presented a MATLAB implementation of the k-Means clustering algorithm for customer segmentation based on data collected from a mega retail business outfit that has many branches in Akwa Ibom state, Nigeria. The algorithm has a purity measure of 0.95, indicating 95% accurate segmentation of the customers. Insight into its customer segmentation offers the business the following advantages: the ability to customise market programs that will be suitable for each of its customer segments; business decision support in risky situations such as credit relationships with its customers; identification of the products associated with each segment and how to manage the forces of demand and supply; unravelling latent dependencies and associations amongst customers, amongst products, or between customers and products, of which the business may not be aware; the ability to predict customer defection and which customers are most likely to defect; and raising further market research questions as well as providing directions to finding the solutions.
The k-Means algorithm works thus: given a set of d-dimensional training input vectors $\{x_1, x_2, \ldots, x_n\}$, the k-Means clustering algorithm partitions the n training examples into k sets of data points or clusters $S = \{S_1, S_2, \ldots, S_k\}$, where $k \le n$, such that the within-cluster sum of squares is minimised.
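The description above can be put together as one loop: initialise, assign, update, and stop once the assignments no longer change. The Python sketch below is a compact illustration of that loop and is not the MATLAB program from the appendix. The function name `kmeans`, the `max_iter` and `seed` parameters, the stopping rule and the toy data are all assumptions made for this example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-Means with Forgy initialisation; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda i: sum((x - c) ** 2 for x, c in zip(p, centroids[i])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update: each centroid becomes the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points.
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, labels = kmeans(data, k=2)
print(labels)
```

On data this well separated, the first three points end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initialisation.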
A. Feature Normalisation
This is a data preparation stage. Feature normalisation adjusts all the data elements to a common scale in order to improve the performance of the clustering algorithm. Each data point is thereby rescaled to a range of about -2 to +2. Normalisation techniques include min-max, decimal scaling and z-score. The z-score normalisation technique was used to normalise the features before running the k-Means algorithm on the dataset. Equation (2) gives the formula for normalisation using the z-score technique.
$$x' = \frac{x - \mu_f}{\sigma_f} \qquad (2)$$
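The z-score step can be sketched in Python as follows. This is an illustrative sketch, not the authors' MATLAB code; the function name `zscore_columns` and the sample rows are hypothetical. The text does not state whether the sample or the population standard deviation was used; the sample form (dividing by n - 1) is assumed here.

```python
import statistics


def zscore_columns(rows):
    """Z-score normalise each feature (column) of a list of rows.

    Returns the normalised rows together with the per-feature means and
    standard deviations.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stds = [statistics.stdev(col) for col in columns]
    normalised = [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return normalised, means, stds


# Hypothetical rows: (average amount purchased, average visits per month).
rows = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0]]
normalised, means, stds = zscore_columns(rows)
print(normalised)
```

Returning the means and standard deviations lets a pattern that was not in the training set be normalised with the same statistics before it is compared with the cluster centroids.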

Fig. 1. The initialisation stage of the k-Means algorithm

Fig. 2. Positions of the centroids and their cluster members after the 30th iteration

TABLE I. INITIALISATION AND UPDATING OF THE CLUSTER VECTORS OR CENTROIDS

TABLE II. DESCRIPTION OF EACH CLUSTER IN TERMS OF THE CUSTOMER SEGMENT

TABLE III. CONFUSION MATRIX