Self-Organizing Map based Wallboards to Interpret Sudden Call Hikes in Contact Centers

In a contact center, it is necessary to foresee and investigate any disturbance to the usual daily call pattern. An abnormal call pattern may result from a sudden change in the organization's external environment. Waiting for a methodical analysis before meeting customers' demand may delay queuing customers, so a fast and reliable method is required to predict and explain any unwanted event. Conclusions cannot be drawn from a single dimension such as total call count, because the total call count may increase in the same way for a failure in any service. This research mainly focuses on reasoning about multidimensional events based on historical records. In contrast to traditional wallboards, our approach is capable of clustering and predicting disturbances to normal call patterns based on historical knowledge by considering many dimensions, such as the queue statistics of many service queues. Our approach showed improved results over traditional wallboards equipped with 2D or 3D graphs.

Keywords—Multidimensional data; visualization; contact centers; self-organizing map; clustering


I. INTRODUCTION
A contact center is a very sensitive customer touch point within an organization [1]. In other words, any sudden incident in the external environment of an organization is reflected immediately in its contact center. A contact center is a combination of technology and people who are waiting to provide the information that callers require. Neither a customer care officer nor a customer alone is able to identify a sudden change in the outside world of an organization. Examples of such incidents are a regional electricity failure for a national electricity provider, or damage to a common telephone cable caused by heavy rain for a national telephone provider. In both cases, customers may be unhappy about the service interruption and will try to complain about it, resulting in a huge demand on the relevant contact center.
This creates a need for a software mediator that can extract information from both callers and customer care officers, analyze it against historical knowledge and present it to the supervisors, who can decide on any deviation from normal operational procedures. Such an operational deviation is mandatory in a sudden event because contact centers are not designed to absorb abnormal conditions, owing to the associated capital investment. The financially beneficial strategy is to maintain a separate approach for sudden incidents.
To gain a competitive advantage over competitors in the same field, good insight is required about when to switch strategies without diluting customer satisfaction. Our research focuses on introducing wallboards capable of visualizing multidimensional data along with clustering, to support real-time decision making with the support of historical knowledge.

II. RELATED WORK
As large amounts of unstructured data are increasingly generated and accumulated, there is a growing tendency to excavate the knowledge behind the data. While analysts are interested in user-friendly interactive systems for knowledge extraction, managers such as those working in a contact center or help desk are more interested in wallboard-type [2] information systems that alert them in a timely manner to any deviation from their predictions or organizational directions. This motivates the current research interest in visualizing dynamic data. Galkin and others examined a pipeline approach to visualize and analyze dynamic data in order to facilitate anomaly detection, clustering, trend analysis and variation analysis [3]. Their research explained that analyzing complex multidimensional data brings deep insight to data science and related industries. This can be accomplished by using trending technologies such as artificial intelligence, machine learning and neural networks [4], [5]. In the past, several research efforts have addressed time-dependent multidimensional data visualization [6], [7], and one common observation in that area is the need for a domain expert to interpret the result projected onto a 2D or 3D coordinate system. Steven's approach uses a progressive technique that renders the incoming data progressively. In other words, the framework initially constructs a geometry to represent a summary of the past and then updates a dynamic scalar field on top of that geometry [8]. That research tried to address the challenge of handling and representing dynamic scalars in an efficient and user-friendly manner. Mashima and others presented a map-based metaphor for visualizing dynamic data [9], [10].
A point (peg) on the map represents a multidimensional data instance, and its distance from the other points on the map reflects how dissimilar they are. In other words, a greater distance means less similar points. The researchers suggested that animations can be used as user interactions to present dynamic scalars successfully in a view. Wallboard-type applications can be equipped with periodic updates or animations to project multidimensional data with dynamic behavior onto a 2D screen.
Throughout history, data visualization has been a trending topic because of the need to know "what is it hiding". People have tried to interpret hidden knowledge in a human-understandable way [11]. A good visual representation is easily understandable and has features such as effectiveness, accuracy, robustness and ease of use [12], [13]. Historical efforts to visualize multidimensional data include techniques such as the scatter plot matrix. The increasing velocity, volume and veracity of data collections opened new research areas such as virtual reality for data visualization, augmented reality for data visualization and the effective use of interaction methods for data visualization [14]-[16]. A wallboard is a display placed in a public area to convey information of interest. It can be equipped with effective interaction techniques.
208 | P a g e www.ijacsa.thesai.org
Another important branch of visualization is dimensionality reduction, with techniques such as Principal Component Analysis (PCA) [17], Radial Coordinate visualization (RadViz) [18], self-organizing maps (SOM) [19], Ridge and Lasso Regression [20], [21] and Singular Value Decomposition (SVD) [22]. The techniques in the two paragraphs above motivated us to build a hybrid visualization model that combines characteristics of multidimensional visualization models with those of dimensionality reduction techniques. Among the dimensionality reduction techniques, self-organizing maps showed better clustering performance than other clustering techniques [23].
Interactivity can be added to a concept by using technologies such as JavaScript, Ajax, HTML, Hadoop [24] and NodeJS. Together, these technologies provide continuous connectivity between frontend and backend, enabling real-time or near-real-time updates in the frontend.
Since visualization is for humans, human factors must be considered in visualization [25], [26]. The ability to operate in a low-resource environment is a significant achievement as ever more varied types of data are generated. This fact led the research to design and implement a distributed architecture for information processing.

III. WALLBOARD
This section demonstrates the wallboard types commonly available with both open source and commercial contact center products. Asterisk is an example of a free and open source contact center. Such products may not provide wallboards identical to those in the figures (Fig. 1 and Fig. 2), but they maintain the same concept. Because of market competition, both open source and commercial contact center products tend to offer more informative dashboards or information visualization modules. At present, pattern recognition is a lacking but in-demand area of information visualization for contact center solutions.

A. Customer Information Flow of a General Contact Center
Generally, a contact center consists of the following major logical modules to cater to demanding customer requirements.
• Greeting message: This is the first message a customer hears. Example: Good morning.
• Skill: A skill is an attribute assigned to an agent. Agent skills can be regarded as the ability of an agent to handle a specific call that requires one of those skills.
In relation to a contact center, a skill can be thought of as a specific customer need or requirement, or perhaps a business need of the contact center. A contact center defines skills based on the needs of its customers and of the contact center itself.
• Interactive Voice Response (IVR): It provides a list of steps that process calls in a user-defined manner. The steps in an IVR can send calls to a skill, play announcements and music, disconnect calls, give a busy signal to calls, or route calls to other destinations.
• Queue: A queue is a holding area for calls that are waiting to be answered in order. Different calls in a queue may have different priority levels, in which case calls with a higher priority are answered first.
• Agent terminal: This is the place where agents can pick up and answer customers' calls. Depending on the organization's requirements, calls may be delivered automatically to the relevant agent's terminal.
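The queue behavior described in the list above (answer in arrival order, unless a call has a higher priority) can be sketched with a binary heap. This is only an illustration: the class name `CallQueue`, the caller identifiers and the numeric priority scale are hypothetical and are not taken from any specific contact center product.

```python
import heapq
import itertools


class CallQueue:
    """Holding area for waiting calls: highest priority first, then arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, caller_id, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), caller_id))

    def next_call(self):
        """Return the caller that an agent should answer next, or None if empty."""
        if not self._heap:
            return None
        _, _, caller_id = heapq.heappop(self._heap)
        return caller_id

    def __len__(self):
        return len(self._heap)


queue = CallQueue()
queue.enqueue("caller-1")
queue.enqueue("caller-2", priority=5)  # hypothetical high-priority caller
queue.enqueue("caller-3")
answer_order = [queue.next_call() for _ in range(3)]
print(answer_order)  # ['caller-2', 'caller-1', 'caller-3']
```

The arrival counter keeps calls of equal priority in first-come, first-served order.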

B. Placement of the Analyzer
As the diagram (Fig. 3) shows, information flows through the logical modules explained above. Skill is embedded in the agent who uses the agent terminal, so it is not drawn separately in the diagram. The information flow to be processed was selected based on the following factors (Table I).
• Relevance to the presented analysis
• Richness of the information
• Experience of contact center management
• Experience of number management teams (soft switch, etc.).
According to the number management team, the central signal handling of phone calls introduces errors when numbers are grouped into geographical regions by number level. For example, number levels 11111xxxxx - 11119xxxxx belong to town A and number levels 11121xxxxx - 11129xxxxx belong to town B. Towns A and B belong to region X, since 111xxxxxxx is assigned to region X. Because signals are managed centrally (central switch), a number belonging to town A can be assigned to region Y. This configuration dilutes the ability to localize a regional failure caused by a common event in a particular region, such as heavy rain.
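The number-level example above can be illustrated with a small prefix lookup. The tables follow the town A, town B and region X example in the text; the override table and the sample number `1111300042` are hypothetical and only stand in for a reassignment made at the central switch.

```python
# Number levels from the example: 11111xxxxx-11119xxxxx -> town A,
# 11121xxxxx-11129xxxxx -> town B, 111xxxxxxx -> region X.
TOWN_BY_PREFIX = {}
for digit in range(1, 10):
    TOWN_BY_PREFIX[f"1111{digit}"] = "A"
    TOWN_BY_PREFIX[f"1112{digit}"] = "B"
REGION_BY_PREFIX = {"111": "X"}

# Hypothetical reassignment held by the central switch (not real data).
SWITCH_OVERRIDES = {"1111300042": "Y"}


def town_of(number):
    """Town implied by the 5-digit number level, or None if unknown."""
    return TOWN_BY_PREFIX.get(number[:5])


def region_by_level(number):
    """Region implied by the 3-digit number level alone."""
    return REGION_BY_PREFIX.get(number[:3])


def region_by_switch(number):
    """Region actually served, once central switch assignments are applied."""
    return SWITCH_OVERRIDES.get(number, region_by_level(number))


number = "1111300042"
print(town_of(number), region_by_level(number), region_by_switch(number))  # A X Y
```

The mismatch between the last two values is the kind of error that makes number levels unreliable for localizing a regional failure.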
According to the experience of contact center management, the contact center is flooded with calls when there is a failure in a region. In such a situation, callers cannot be expected to behave patiently, and all callers may have to wait longer to reach agents. The most interesting observation is that callers cannot identify the source of the failure; they report what they experience, and therefore select the required service based on their own perspective. In this research, queue statistics are the most significant information to analyze, over the other available information, when a sudden call hike is experienced. Based on this explanation, we selected the information flow between the queue and the agent terminal for analysis.

C. Work Flow of the Proposed Software
The workflow (Fig. 4) was implemented in our solution to visualize results. As described in the figure, the software periodically calculates the SOM and visualizes its results. On top of the visualized SOM content, it places a 3D graphic of the present reading. This is a periodic process with an appropriate delay between updates of the visualized content or graphics.
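The periodic process described above can be sketched as a simple loop. The function and parameter names (`run_wallboard`, `load_history`, `read_current`, `train_som`, `render`) are hypothetical placeholders for the corresponding modules, and the stubs below only make the sketch executable; they are not the actual implementation.

```python
import time


def run_wallboard(load_history, read_current, train_som, render,
                  delay_seconds=60.0, max_cycles=None, sleep=time.sleep):
    """Repeat: recompute the SOM, draw it, overlay the present reading, wait."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        som = train_som(load_history())  # periodic SOM calculation
        render(som, read_current())      # SOM view with the current reading on top
        cycles += 1
        sleep(delay_seconds)             # delay before the next refresh
    return cycles


# Stub collaborators, used only to exercise the loop.
frames = []
completed = run_wallboard(
    load_history=lambda: [[1, 2], [3, 4]],
    read_current=lambda: [2, 3],
    train_som=lambda history: {"trained_on": len(history)},
    render=lambda som, reading: frames.append((som, reading)),
    delay_seconds=0.0,
    max_cycles=3,
)
print(completed, len(frames))  # 3 3
```

Leaving `max_cycles` as `None` keeps the loop running indefinitely, which is the behavior expected of a wallboard.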

D. Architecture of the Proposed Software
The architecture (Fig. 5) consists of two main modules, the visualization layer and the "SOM algorithm implementation" module, which are described in detail in the following subsections. The "DB interface" was designed to insert data originating from the contact center. The architecture was implemented using multiple technologies in combination, such as C#, PHP, MySQL, JavaScript, Three.js, WebGL and Java.

A. "SOM Algorithm Implementation" Module
The Kohonen Self-Organizing Map follows a two-layer approach consisting of an input layer (top layer of the diagram (Fig. 6)) and an output layer (bottom layer of the diagram (Fig. 6)). The neurons of both layers have the same number of dimensions, such as $(x_1, x_2, x_3, \ldots, x_n)$ and $(w_1, w_2, w_3, \ldots, w_n)$. The weights of the output layer's neurons are updated by using a neighborhood function and the input layer's weights. As per the algorithm, input layer neurons are selected randomly and iteratively to update the output layer's neurons via the neighborhood function. Once an input layer neuron has been selected, the matching output layer neuron is selected as per the function below.
Here, i and j are the output layer indexes (row number and column number), and the closest node is the winning node, identified by its own pair of indexes. Secondly, the equations below explain the neighborhood function that is used to update the neighbors, and its calculations.
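For reference, the two steps described above can be written in the standard Kohonen form. The symbols $x$ (selected input vector), $w_{ij}$ (weight vector of output node $(i, j)$), $(c_1, c_2)$ (winning node's indexes), $\alpha(t)$ (learning rate) and $\sigma(t)$ (neighborhood radius) are notation assumed for this sketch, and the Gaussian neighborhood is one common choice rather than a confirmed detail of the implementation:

$$(c_1, c_2) = \arg\min_{i,j} \lVert x - w_{ij} \rVert$$
$$h_{ij}(t) = \exp\left(-\frac{(i - c_1)^2 + (j - c_2)^2}{2\sigma(t)^2}\right)$$
$$w_{ij}(t+1) = w_{ij}(t) + \alpha(t)\, h_{ij}(t)\, \bigl(x - w_{ij}(t)\bigr)$$

A minimal pure-Python sketch of the same procedure follows. The grid size, the exponential decay schedule and the sample values are illustrative assumptions, not the settings used in this research.

```python
import math
import random


def find_winner(weights, x):
    """Return (c1, c2): row and column of the output node closest to x."""
    best, best_dist = (0, 0), float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(x, w))
            if dist < best_dist:
                best, best_dist = (i, j), dist
    return best


def train_som(samples, rows, cols, iterations=2000, alpha0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map; samples are equal-length sequences scaled to [0, 1]."""
    rng = random.Random(seed)
    dims = len(samples[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    weights = [[[rng.random() for _ in range(dims)] for _ in range(cols)]
               for _ in range(rows)]
    for t in range(iterations):
        decay = math.exp(-t / iterations)        # assumed decay schedule
        alpha, sigma = alpha0 * decay, sigma0 * decay
        x = rng.choice(samples)                  # random input selection
        c1, c2 = find_winner(weights, x)
        for i in range(rows):
            for j in range(cols):
                # Gaussian neighborhood centered on the winning node.
                h = math.exp(-((i - c1) ** 2 + (j - c2) ** 2) / (2 * sigma ** 2))
                w = weights[i][j]
                for k in range(dims):
                    w[k] += alpha * h * (x[k] - w[k])
    return weights


# Toy two-dimensional samples forming two loose groups (made-up values).
low = [(0.10, 0.10), (0.15, 0.20), (0.20, 0.10)]
high = [(0.90, 0.80), (0.85, 0.90), (0.80, 0.95)]
som = train_som(low + high, rows=4, cols=4)
low_node = find_winner(som, low[0])
high_node = find_winner(som, high[0])
print(low_node, high_node)
```

With well-separated groups such as these, the two winning nodes would normally fall in different parts of the grid, which is what the cluster coloring relies on.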

B. Visualization Module
A Cartesian or spherical coordinate system can handle at most three variables at a time. We combined the two coordinate systems in our novel framework to enable plotting more than three variables simultaneously. The advantages of this technique over other available 3D visualization approaches are as follows.
• Consistency of all the plotted parameters/dimensions
• Ability to compare and contrast its own dimensions
This is elaborated in detail in the results and analysis section using screenshots of the novel framework. In other words, this is the module that visualizes the output of the "SOM algorithm implementation" module. It is independent software: if it is not integrated with an application that pumps data, data can be uploaded to the visualization module using data files.
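One way to picture the combination of the two coordinate systems is a sphere placed at a Cartesian position, with one bar per dimension pointing outward at its own angle. The sketch below is an assumption for illustration only: the function name `bar_endpoints`, the even spacing of bars around the sphere's equator and the scaling by `max_value` are not taken from the framework itself.

```python
import math


def bar_endpoints(center, radius, values, max_value, max_bar_length=1.0):
    """Return one (start, end) segment per dimension around a sphere.

    The sphere sits at a Cartesian position; each bar direction is given by
    an angle (assumed: evenly spaced azimuths on the equator), and the bar
    length is proportional to that dimension's magnitude.
    """
    cx, cy, cz = center
    count = len(values)
    bars = []
    for k, value in enumerate(values):
        theta = 2 * math.pi * k / count
        length = max_bar_length * value / max_value if max_value else 0.0
        ux, uy = math.cos(theta), math.sin(theta)
        start = (cx + radius * ux, cy + radius * uy, cz)
        end = (cx + (radius + length) * ux, cy + (radius + length) * uy, cz)
        bars.append((start, end))
    return bars


# Made-up reading with four dimensions.
bars = bar_endpoints(center=(1.0, 2.0, 3.0), radius=0.5,
                     values=[10, 0, 5, 20], max_value=20, max_bar_length=1.0)
for start, end in bars:
    print(round(math.dist(start, end), 3))
```

Because every bar shares the same scale, the dimensions of one instance can be compared directly with each other and with those of neighboring instances.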

C. Communication between Modules
This layer facilitates communication between the visualization layer (web application) and the "SOM algorithm implementation" module. As the diagram (Fig. 7) shows, the layer is composed of a TCP socket server and a TCP socket client, built as independent executables with separate configuration files. These executables exchange metadata and results between the two modules. Transmission of the preprocessed data and of the SOM algorithm's output is excluded from the communication layer protocol; these files are placed in a shared location or can be transmitted using FTP, depending on the requirement.
As per the diagram (Fig. 8), the contact center's Queue module can be integrated with the "DB interface" to pump queue statistics into the database for visualization and decision making. When a call successfully reaches an agent, the collected queue statistics are sent to the database via the "DB interface". The visualization module can then be loaded in a preferred web browser to view the output. It can be used as a wallboard to continuously monitor and analyze call patterns.
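The metadata exchange between the socket server and the socket client can be sketched as length-prefixed JSON messages. The message fields, the 4-byte length prefix and the file path are hypothetical, since the wire protocol is not specified here. For brevity the sketch uses a connected socket pair instead of a listening TCP server, but the same two functions work on any connected stream socket.

```python
import json
import socket
import struct


def send_message(sock, message):
    """Send a dict as UTF-8 JSON preceded by a 4-byte big-endian length."""
    payload = json.dumps(message).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed JSON message and return it as a dict."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


# Stand-in for the two executables: only metadata travels over the socket,
# while the SOM output file itself stays in the shared location.
som_side, web_side = socket.socketpair()
send_message(som_side, {"type": "som_ready",
                        "path": "shared/som_output.csv",  # hypothetical path
                        "rows": 10, "cols": 10})
received = recv_message(web_side)
send_message(web_side, {"type": "ack"})
reply = recv_message(som_side)
som_side.close()
web_side.close()
print(received["type"], reply["type"])  # som_ready ack
```

Keeping bulk data out of the socket protocol, as described above, means the messages stay small and the framing can remain this simple.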

VII. DATASET AND DIMENSIONS
This section explains the dataset, which consists of queue statistics, and its structure in detail. It has two main tables (Table II and Table III). The answered call load pattern table contains calls successfully answered by contact center agents, while the abandoned call load pattern table contains calls that were not answered by a contact center agent. The answered call load pattern table has columns such as:
• (0-5): the number of callers who waited in a queue for 0 to 5 seconds before being answered by a contact center agent in a given month.
• (6-10): the number of callers who waited in a queue for 6 to 10 seconds before being answered by a contact center agent in a given month.
• (11-15): the number of callers who waited in a queue for 11 to 15 seconds before being answered by a contact center agent in a given month.
• (>60): the number of callers who waited in a queue for more than 60 seconds before being answered by a contact center agent in a given month.
In addition to monthly cumulative call counts, cumulative figures can be calculated for 15-minute, hourly and daily durations. Our proposed software can accommodate all of these cumulative figures. We used daily and monthly cumulative figures to verify our approach and calculate its accuracy.
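The construction of these columns from raw call records can be sketched as follows. Only the (0-5), (6-10), (11-15) and (>60) columns are named above; the intermediate 5-second bins, the whole-second waiting times, the function names and the sample records are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Assumed bin edges: 5-second steps up to 60 seconds, then an open-ended bin.
UPPER_BOUNDS = list(range(5, 61, 5))


def wait_bucket(wait_seconds):
    """Map a whole-second waiting time to its column label."""
    lower = 0
    for upper in UPPER_BOUNDS:
        if wait_seconds <= upper:
            return f"({lower}-{upper})"
        lower = upper + 1
    return "(>60)"


def period_key(timestamp, granularity):
    """Label of the cumulative period that a call falls into."""
    if granularity == "monthly":
        return timestamp.strftime("%Y-%m")
    if granularity == "daily":
        return timestamp.strftime("%Y-%m-%d")
    if granularity == "hourly":
        return timestamp.strftime("%Y-%m-%d %H:00")
    if granularity == "15min":
        floored = timestamp.replace(minute=timestamp.minute - timestamp.minute % 15)
        return floored.strftime("%Y-%m-%d %H:%M")
    raise ValueError(f"unknown granularity: {granularity}")


def load_pattern(calls, granularity="monthly"):
    """Count calls per period and waiting-time column."""
    table = defaultdict(Counter)
    for timestamp, wait_seconds in calls:
        table[period_key(timestamp, granularity)][wait_bucket(wait_seconds)] += 1
    return dict(table)


# Made-up call records: (time the call entered the queue, seconds waited).
calls = [
    (datetime(2019, 5, 1, 9, 3), 4),
    (datetime(2019, 5, 1, 9, 20), 7),
    (datetime(2019, 5, 2, 14, 0), 75),
    (datetime(2019, 6, 1, 8, 0), 12),
]
monthly = load_pattern(calls, "monthly")
quarter_hourly = load_pattern(calls, "15min")
print(monthly["2019-05"])
```

The same function serves the answered and the abandoned tables; only the set of call records passed in differs.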
There may be many reasons for a sudden call hike in a contact center, such as:
• Credit control action. The operator disconnects, in bulk, customers who did not pay their bills. In such a situation, customers may try to reach the contact center to ask for information about their bills.
• Regional service failure. In such a situation, a large number of affected customers will try to complain about the service interruption via the contact center.
• Natural event such as heavy rain. Events of this kind may lead to service interruptions because of damage to outside infrastructure and delayed maintenance.
We chose to analyze rainy seasons in relation to call hikes. This selection was made solely to evaluate our approach's accuracy; the approach can be used with any of the reasons mentioned above. Rain is a seasonal event, and it does not start and end on the same days of a month in two different years. Monthly cumulative call counts therefore provide better accuracy, which is more than 95%.
The same dataset can be used with different durations for the cumulative figures to detect any reason of interest that contributes to a sudden call hike.
In contrast to the answered call load pattern table, the abandoned call load pattern table has columns such as:
• (0-5): the number of callers who waited in a queue for 0 to 5 seconds before the call was dropped in a given month.
• (6-10): the number of callers who waited in a queue for 6 to 10 seconds before the call was dropped in a given month.
• (11-15): the number of callers who waited in a queue for 11 to 15 seconds before the call was dropped in a given month.
• (>60): the number of callers who waited in a queue for more than 60 seconds before the call was dropped in a given month.
Rain is a seasonal event and has a long-term impact on a contact center. In our analysis, it was confirmed that no additional agents were available to flatten sudden hikes. In such a scenario, there is no considerable variation in the values of the answered table, while there is considerable variation in the values of the abandoned table. This is explained in detail in the results and analysis section. Considering more dimensions would require high processing power for both calculations and graphics. In this research, we selected the abandoned table's columns as the dimensions of our multidimensional data instances.
By reducing the duration used for the cumulative value calculations explained previously, dashboards can be converted into real-time sudden call hike indicators. Our proposed approach is independent of the dataset: it is generalized, and we used this dataset only to verify the approach and calculate its accuracy, which is more than 95%. Nevertheless, domain knowledge is mandatory for selecting the most appropriate dimensions and durations for an analysis.

VIII. RESULTS AND ANALYSIS
In this section, we describe the results obtained with our novel approach, which was described together with the dataset in the preceding sections. We randomly selected queue statistics from the historical data DB for rainy months and normal months. The randomly selected data (the multidimensional data arrays explained in the previous section) were categorized using the SOM algorithm and visualized as shown in Fig. 9 (blue and green clusters for rainy and normal months).
The workflow of the proposed software was explained in a previous section. According to the workflow, the next step is to place the current value, which is a multidimensional data instance, on top of the visualized SOM. The three white spheres represent three tests of the proposed application, and the conclusions drawn were verified against actual records. A sample of the tests conducted to verify the results shown in the visualization layer is presented in Table IV. An accuracy of more than 95% was achieved in detecting rainy conditions that may cause sudden call hikes. Although this testing was conducted to calculate the accuracy and verify the functionality of the proposed software, the approach is generalized for use with any other dataset and scenario.
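The step of placing a current reading on the trained map and checking the conclusion against actual records can be sketched as a nearest-node lookup followed by a simple accuracy count. The node weights, labels and test readings below are toy values invented for illustration; they are not the values in Table IV, and the resulting figure says nothing about the accuracy reported above.

```python
def nearest_node(nodes, reading):
    """Return the labeled map node whose weights are closest to the reading."""
    return min(
        nodes,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(node["weights"], reading)),
    )


def classify(nodes, reading):
    """Conclusion drawn for a reading: the cluster label of its nearest node."""
    return nearest_node(nodes, reading)["label"]


def accuracy(nodes, tests):
    """Share of test readings whose conclusion matches the actual record."""
    correct = sum(1 for reading, actual in tests if classify(nodes, reading) == actual)
    return correct / len(tests)


# Toy map nodes with three scaled dimensions each (hypothetical values).
nodes = [
    {"weights": (0.1, 0.1, 0.2), "label": "normal"},
    {"weights": (0.2, 0.1, 0.1), "label": "normal"},
    {"weights": (0.7, 0.8, 0.9), "label": "rainy"},
    {"weights": (0.8, 0.9, 0.8), "label": "rainy"},
]
# Toy test readings paired with the actual condition (hypothetical values).
tests = [
    ((0.75, 0.85, 0.90), "rainy"),
    ((0.15, 0.10, 0.15), "normal"),
    ((0.60, 0.70, 0.70), "rainy"),
    ((0.30, 0.20, 0.20), "rainy"),  # a light-rain month that looks normal
]
print(accuracy(nodes, tests))  # 0.75
```

The last toy case shows how a reading close to the normal cluster is counted as a miss when the actual record says otherwise.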
Each white sphere carries yellow bars on its perimeter that represent the magnitudes of its own dimensions. The visualization layer of our concept was devised to visualize multidimensional data with the help of SOM-based categorization. Table V compares our novel visualization concept with groups of existing multidimensional visualization techniques.
As explained in the latter part of the section DATASET AND DIMENSIONS, there is no considerable variation in the values of the answered table, while there is considerable variation in the values of the abandoned table. As Fig. 10 shows, abandoned call counts fluctuate much more than answered call counts. This justifies our selection of the abandoned table's columns as dimensions. This behavior occurred mainly because no extra contact center staff were available to cater to the sudden demand. By combining two or more visualization and dimensionality reduction techniques, it is possible to obtain more features than by relying on a single visualization or dimensionality reduction technique. This can be regarded as the main finding of this research.

IX. CONCLUSION
Our approach showed improved results in predicting causes compared with traditional wallboards equipped with statistical analysis. The SOM-based approach shows better results in analyzing multidimensional data than traditional wallboards that show charts with two or three axes. The results depended heavily on the correct selection of dimensions. In contrast to traditional wallboards, a single wrong dimension selection may dilute the quality of the overall prediction. A reliable process for selecting dimensions is left as future work for this research.
ACKNOWLEDGMENT
I want to thank the University of Colombo School of Computing, Colombo, Sri Lanka for guiding me to successfully complete my research project.