Development of a Decision Support System for Handling Health Insurance Deduction

Abstract: Effective hospital management involves such activities as monitoring the flow of medication, controlling treatment, and billing for the patient's treatment. A major challenge between insurance companies and hospitals lies in the way medical treatment expenses for insured patients are reimbursed. In some cases, insurance deductions lead to a loss of revenue for hospitals. This paper proposes a framework for handling insurance deductions that integrates three major methodologies: Decision Support Systems, Data Mining, and Multiple Criteria Decision Making. To exemplify the practical utility of the framework, it is applied to hospital services and insurance deductions extracted from 200,000 documents from 150 hospitals in Iran. To classify the kinds of services, decision trees are developed to mine hidden rules in the data, which are then modified on the basis of some performance measures. The rules are then extracted and ranked using the TOPSIS method. The results show that the proposed framework is capable of effectively providing objective and comprehensive assessments of insurance deductions.


I. INTRODUCTION
Hospital management involves a most complex decision making process that has to deal with huge arrangements related to such administrative and medical operations as identifying patients, processing healthcare benefits for the inpatient, supporting administrative functions, facilitating payments for the services, and assisting insurance providers in their quest for in-depth records of actual treatments provided. In such complex management systems, past medical histories, problems, demographics, laboratory data, and basic information are incorporated into one single system in order to accelerate clinical studies and drug administration to patients [1,2].
In Iran, a plan was approved in 1985 for the autonomous management of hospitals, in which hospital costs are reimbursed from their own revenues. This made the financial management of hospitals more complicated than ever before. A majority of hospital revenues are reclaimed as per contracts with insurance companies which provide insurance policies to patients for hospital care and services [3].
A big challenge facing hospital managers is their transactions with insurance companies, which are expected to reimburse hospitals for the costs of medical care and services provided to insured patients. In many cases, insurance companies do not completely reimburse the expenses despite their contractual obligations. The total costs the companies avoided paying amounted to about 10 percent of hospital revenues in 2000. Consequently, hospitals sometimes have to make up for the resulting budget deficits by increasing the portion of the costs covered by the patient [3].
In this study, the term "health insurance deduction" is used to refer to the money not reimbursed by insurance companies for medical services provided by hospitals despite the contractual arrangements. Health insurance deductions happen mostly as the result of:
• Lack of proper documentation on the services provided by hospitals;
• Failure on the part of hospitals to submit full documents;
• Mismatch between the diagnosis-related group (DRG) system and the true costs; and
• Additional services provided by hospitals, such as drugs outside the insurance obligations, surgical services, unrelated diagnoses by doctors, and unrelated clinical tests.
Although deductions could originate from different sources and for different reasons, this paper focuses only on hospital services and insurance obligations. For instance, insurance companies are obliged to reimburse the costs of delivery. In practice, if a mother is required to be hospitalized for more than 5 days, the costs for the extra days are not covered by the insurance companies. As another example, in appendix surgery, insurance companies generally reimburse a certain amount of the cost that excludes the charges exacted under "difficulty of surgery" [3].
The objective of this paper is to develop a DSS with a methodologically comprehensive and easy-to-use framework for the financial management of hospitals to handle the health insurance deduction problem. The proposed framework is then validated through a case study of 200,000 insurance deduction documents over the period 2009-2010 from 150 different hospitals in Iran.
The rest of the paper is organized as follows. Section 2 provides a brief review of the literature. Section 3 briefly describes the decision support system, data mining, and multiple criteria decision making methods used as the main methods, along with the decision tree and TOPSIS methods employed in the case study. Section 4 presents the integrated framework proposed in this study. Section 5 describes a specific application of the proposed framework. Finally, the paper concludes with results and suggestions in Section 6.
www.ijacsa.thesai.org

II. REVIEW OF THE LITERATURE
There are a variety of systems that can potentially support clinical decisions. Even Medline and similar healthcare literature databases can support clinical decisions. Decision support systems (DSS) have long been incorporated into healthcare information systems, but they have usually supported retrospective analyses of financial and administrative data [4,5].
Basole et al. [6] developed a health advisor system, a web-based game using organizational simulation in which players are tasked to manage people through the healthcare system by using various information, costs, and quality of care trade-offs, with scores based on health outcomes and costs incurred. Gillies et al. [7] determined the items that different stakeholder groups view as important for inclusion in a DSS for clinical trial participation, with a view to using these as a framework for developing decision support tools in this context. North et al. [8] studied the research efforts in clinical DSS to compare triage documentation quality. Martínez-Pérez et al. [9] analyzed a sample of applications in order to draw conclusions and put forth recommendations about mobile clinical DSS. Mobile clinical DSS applications and their inclusion in clinical practices have risen over the last few years. The authors found that the interface or its ease of use would impoverish the experience of the users if developers did not design them carefully enough.
Data Mining (DM) has been the most important tool used since 1990 for knowledge discovery from large databases. Recently, sophisticated DM approaches have been proposed for similar retrospective analyses of both administrative and clinical data [10,11]. The use of DM to facilitate decision support provides a new approach to problem solving by discovering patterns and relationships hidden in the data, giving rise to an inductive approach to DSS. Roumani et al. [12] compared the performance of several common DM methods (logistic regression, discriminant analysis, Classification and Regression Tree (CART) models, C5, and Support Vector Machines (SVM)) in predicting the discharge status of patients from an Intensive Care Unit (ICU). The non-expert users who tried the system obtained useful information about the treatment of brain tumors. Zandi [13] developed a bi-level interactive DSS to identify DM-oriented Electronic Health Record (EHR) architectures. The bi-level Interactive Simple Additive Weighting Model was then used to help medical decision makers reach a consensus on a DM-oriented EHR architecture. Bashir et al. [14] investigated the effectiveness of an ensemble classifier for computer-aided breast cancer diagnosis. A novel combination of five heterogeneous classifiers, namely Naïve Bayes, Decision Tree using Gini Index, Decision Tree using information gain, Support Vector Machine, and Memory-based Learner, was used to build the ensemble framework.
Remarkable progress has been made during the past 40 years in the Multiple Criteria Decision Making (MCDM) method, so that it has nowadays developed into a mature discipline [15]. Recently, researchers have employed this method in a variety of areas including DM. Narci et al. [16] analyzed the effect of competition on technical efficiency through Data Envelopment Analysis (DEA) with five outputs and five inputs for the hospital industry in Turkey. Kusi-Sarpong et al. [17] introduced a comprehensive framework for green supply chain practices in the mining industry and presented a multiple criteria evaluation of green supply programs using a novel multiple criteria approach that integrates rough set theory elements and the fuzzy Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS). Aghdaei et al. [18] identified the synergies of DM and MADM and presented a wide range of interactions between these two fields from a new perspective. They provided an example of the integrated approach in supplier clustering and ranking.
Clearly, incorporation of DM and MCDM in decision support issues yields more powerful DSS since it offers more options for analysis, uses expert knowledge, and improves upon the process of analysis and evaluation [19].

III. METHODOLOGY
In this section, the methods used in the proposed framework and its implementation, namely decision support systems, data mining, decision trees, multiple criteria decision making, and TOPSIS, are briefly described.

A. Decision Support System
A Decision Support System (DSS) is a computerized application serving organizational and business decision makers in their decision making process. The system is capable of extracting and collecting useful information from documents, business models, and raw data. It can even help solve problems and make useful decisions. The system is typically used for strategic and tactical decisions of a reasonably low frequency and high potential consequences for the upper-level management. The use of this system pays generously in the long run due to the short time taken for thinking through and modeling the problem [4,5]. The three fundamental components of the DSS are as follows [20]:
• Database Management System (DBMS). The DBMS serves as a data bank for the DSS. It stores large quantities of data relevant to the class of problems for which the DSS has been designed and provides logical data structures through which the users interact.
• Model-base Management System (MBMS). The role of the MBMS is analogous to that of a DBMS. Its primary function is to keep specific models used in a DSS independent from the applications that use them.
• Dialog Generation and Management System (DGMS). The main product of an interaction with a DSS through the DGMS is insight. As DSS users are often managers who are not computer-trained, the DSS needs to be equipped with intuitive and easy-to-use interfaces.

B. Data Mining
Data mining (DM) is a popular technique for searching for and extracting interesting (i.e., non-trivial, implicit, previously unknown, and potentially useful) and unusual patterns from data sources. DM problems are often solved by using a mosaic of different approaches drawn from computer science, including multi-dimensional databases, machine learning, soft computing, and data visualization. Use is also made of statistics in terms of hypothesis testing, clustering, classification, and regression techniques [10,11].

1) Decision Trees:
A popular DM technique is the induction of decision trees. A decision tree (DT) is a machine learning technique used in classification, clustering, and prediction tasks. There are different tree-growing algorithms for generating a DT, such as C5.0, C&R trees, CHAID, and Quest [10,11]. A DT starts from the root node, which tests the best attribute. A branch is then generated for each value of that attribute, and each branch leads to a new node. To choose the best attribute according to the selection criteria, an entropy-based definition of the information gain is used to select the test attribute within the node. The entropy characterizes the purity of a sample set. Suppose $S$ is a set of data samples. We assume that the class label attribute has $m$ different values, defining $m$ different classes $C_i$ ($i = 1, \ldots, m$), and that $S_i$ is the number of samples in class $C_i$. The expected information needed to classify a sample is given by (1):

$$I(S_1, \ldots, S_m) = -\sum_{i=1}^{m} P_i \log_2 P_i \tag{1}$$

where $P_i$ is the probability of any sample belonging to $C_i$, which is estimated using $S_i/|S|$.

Suppose attribute $A$ has $v$ different values $\{a_1, \ldots, a_v\}$. Attribute $A$ can be used to divide $S$ into subsets $\{S_1, \ldots, S_v\}$, where $S_j$ contains the samples in $S$ that have the value $a_j$ for $A$. If $A$ is selected as the test attribute, these subsets correspond to the branches grown from the node containing $S$. Let $S_{ij}$ be the number of samples of class $C_i$ in subset $S_j$. The entropy, or expected information, of the division by $A$ is given by (2):

$$E(A) = \sum_{j=1}^{v} \frac{S_{1j} + \cdots + S_{mj}}{|S|}\, I(S_{1j}, \ldots, S_{mj}) \tag{2}$$

where the term $(S_{1j} + \cdots + S_{mj})/|S|$ is the weight of the $j$-th subset and is equal to the number of samples in the subset divided by the total number of samples in $S$. For a given subset $S_j$, the expected information is given by (3):

$$I(S_{1j}, \ldots, S_{mj}) = -\sum_{i=1}^{m} P_{ij} \log_2 P_{ij} \tag{3}$$

where $P_{ij} = S_{ij}/|S_j|$ is the probability that a sample of $S_j$ belongs to class $C_i$. The information gained by branching on $A$ is given by (4):

$$\mathrm{Gain}(A) = I(S_1, \ldots, S_m) - E(A) \tag{4}$$
In other words, Gain(A) is the expected reduction in entropy obtained by knowing the value of attribute A. The smaller the entropy of the division, the higher the corresponding information gain and the purer the resulting subsets. Therefore, the DT selects as the test attribute the attribute with the highest information gain. A node is created and labeled with that attribute, a branch is created for each value of the attribute, and the samples are divided accordingly.
The DT contains leaves, which indicate the values of the classification variable, and decision nodes, which specify the test to be carried out. For each outcome of a test, a leaf or a decision node is assigned until all the branches end in the leaves of the tree [21,22].
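
The entropy-based attribute selection described above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the record fields (`service`, `hospital_type`, `deducted`) and the toy data are hypothetical and are not taken from the case study, and the functions implement the plain information gain rather than the full C5.0 algorithm.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Expected information of a list of class labels (base-2 entropy)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the whole set minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    # Group the class labels by the value each record takes for the attribute.
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    # Weight each subset's entropy by its share of the samples.
    expected = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return base - expected


# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"service": "medication", "hospital_type": "public", "deducted": "yes"},
    {"service": "medication", "hospital_type": "private", "deducted": "yes"},
    {"service": "surgery", "hospital_type": "public", "deducted": "no"},
    {"service": "surgery", "hospital_type": "private", "deducted": "no"},
]

gain_service = information_gain(records, "service", "deducted")
gain_hospital = information_gain(records, "hospital_type", "deducted")
```

In this toy sample, `service` separates the two classes perfectly (a gain of 1 bit), whereas `hospital_type` carries no information (a gain of 0), so `service` would be chosen as the test attribute.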

C. Multiple Criteria Decision Making
Multiple Criteria Decision Making (MCDM) is a subdiscipline of operations research that explicitly considers multiple criteria in decision-making environments. MCDM is concerned with structuring and solving decision and planning problems involving multiple criteria. In general, multiple criteria problems can be divided into two categories: Multiple Attribute Decision Making (MADM) and Multiple Objective Decision Making (MODM) problems. Typically, there is no unique optimal solution for such problems and it is necessary to use the decision maker's preferences to differentiate between solutions [15,23].

1) TOPSIS:
The Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) method is a popular approach to MADM that has been widely used in the literature. Presented by Hwang and Yoon [23], it consists of the following steps [24,25].
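Step 1: The decision matrix is normalized through the application of (5):

$$r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{l=1}^{n} x_{lj}^{2}}} \tag{5}$$

where $x_{ij}$ is the score of alternative $i$ ($i = 1, \ldots, n$) on criterion $j$ ($j = 1, \ldots, k$).

Step 2: A weighted normalized decision matrix is obtained by multiplying the normalized matrix by the weights of the criteria, (6):

$$v_{ij} = w_j\, r_{ij} \tag{6}$$

where $w_j$ is the weight of criterion $j$.

Step 3: The positive ideal solution (PIS, maximum value) and the negative ideal solution (NIS, minimum value) are determined by (7):

$$A^{*} = \{v_1^{*}, \ldots, v_k^{*}\},\ v_j^{*} = \max_i v_{ij}; \qquad A^{-} = \{v_1^{-}, \ldots, v_k^{-}\},\ v_j^{-} = \min_i v_{ij} \tag{7}$$

Step 4: The distance of each alternative from PIS and NIS is calculated using (8):

$$d_i^{*} = \sqrt{\sum_{j=1}^{k} \left(v_{ij} - v_j^{*}\right)^{2}}, \qquad d_i^{-} = \sqrt{\sum_{j=1}^{k} \left(v_{ij} - v_j^{-}\right)^{2}} \tag{8}$$

Step 5: The closeness coefficient for each alternative ($CC_i$) is calculated by applying (9):

$$CC_i = \frac{d_i^{-}}{d_i^{*} + d_i^{-}} \tag{9}$$

Step 6: At the end of the analysis, the ranking of alternatives is made possible by comparing $CC_i$ values.

IV. THE PROPOSED FRAMEWORK
In this section, the proposed decision making framework for health insurance deduction handling is presented in detail.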
To implement the integrated framework, an expert committee is first called in to extract a comprehensive list of healthcare services for patients in different cases, facilitated payments for the services, and an in-depth record of the actual treatments processed. Fig. 1 shows the deployment diagram, which integrates DSS, DM, and MCDM to make powerful, reliable, and efficient decisions in insurance deduction handling. To facilitate the operations, the steps have been classified into four modules. Detailed descriptions of the modules and their steps are presented below.

A. Data Management Module
The hospital document system usually uses a computer system with a set of programs to track and store all the documents and instructions related to the health system [1-3]. These documents are usually provided by the hospital discharge, accounting, and statistical agencies and should be considered as longitudinal registration data. The complete architecture of the data registration is shown in Fig. 2. In general, there are three categories of data for integration in hospital documents (Table 1):
• Demographic Data,
• Clinical Data, and
• Financial and Administrative Data.
In the data selection step, not all the measured items should be selected from the database; unusable variables need to be discarded to save time and space, since they may also yield wrong results which could be misleading to final users.
In the data preparation stage, the data are pre-processed and cleaned for analysis. Examples of this are:
• Integrating the coding policy, such as DRG or ICD10-CM [26][27][28][29],
• Transforming some variables, such as the text data from the initial description of the pathology tests [30], and
• Dealing with missing and outlier data [30].
Data inspection is the final step in the data management module, in which the structure of the prepared data set is checked against the needs of the analysis and the tools it requires.

B. Data Analysis Module
The objective of the data analysis module is to help hospital managers and insurance providers determine the characteristics of the relevant situations and predict future cases of insurance deductions by analyzing the available cases through a combination of DM and MCDM functions.
To overcome the existing problems, this module employs the data thus far prepared for:
• Classifying bill service data to predict actual treatment costs;
• Discriminating diseases to determine treatment costs;
• Using association rules and contingency tables of treatment costs and demographic data to study their possible relationships; and
• Using cluster analysis and frequent patterns to extract the patterns of causes of insurance deductions.

Moreover, financial and clinical analysis may be used:
• To study the relationships among the tests for specific diseases prescribed by different physicians using association rules and frequent pattern recognition. This makes it possible to distinguish efficient from non-efficient tests, the results of which can be used for cost management and determination of the rate of unrelated diagnoses by each physician.
• To classify all types of services offered in order to identify the necessary orders and supplies such as drugs, visits by physicians and specialists, and pathology tests to support administrative functions;
• To evaluate the priority of development activities in hospitals based on prioritized utility functions; and
• To predict total hospital expenditures for different seasons and months using temporal mining and time series analysis.

C. Evaluation Module
Depending on the type of analysis required, use will be typically made of statistical criteria, training and test datasets, cross validations, or the like for the evaluation of the results obtained.
Furthermore, the proposed framework uses MCDM techniques and decision maker opinions for evaluation. For instance, MADM methods such as AHP, ANP, ELECTRE, and TOPSIS could be employed to evaluate and rank the results. Programming and genetic algorithms will be more efficient when using a scoring system for performance and optimization, as in the assessment of insurance deductions.

D. User Interface Module
The user interface module should present a comprehensive view of the decision making process depending on the requirements put forth by managers and administrators. Poor usability is one of the core barriers to adoption of a system, acting as a deterrent to routine use of the DSS.
Generally, the following points should be considered in the design of the interface for the hospital document management system regarding insurance deduction handling:
• Monitoring the data collection process and its integration;
• Monitoring each step of the data management module;
• The possibility for employing different DM and MCDM methods for each type of data depending on the objectives of data analysis and inspection;
• Presenting the results in accordance with administrators' needs and requirements;
• The possibility for evaluation of the results obtained from the data analysis module including MCDM methods; and
• The possibility for sensitivity analysis and evaluation of several scenarios by decision makers.

V. IMPLEMENTATION
In this section, the efficiency of the proposed framework is investigated by using it to predict the services most likely to lead to insurance deductions in different hospitals. For this purpose, the information from 200,000 documents for patients hospitalized in 150 different hospitals over the period from 2009 to 2010 was integrated to create 97,532 records.
In the data selection step, different types of hospitals were considered. Also, the documents were chosen using the International Classification of Diseases (ICD), ignoring emergencies and accident cases.
In addition, transformation and normalization were used in the data preparation process. As most of the records included very low deductions, biases of the model were avoided by setting the rate of deduction (ROD) to zero whenever it was less than 3 percent. The selected features after data inspection are presented in Table 2. The focus here was on the data analysis module. Given the goal of decision making, the Decision Tree (DT) was exploited to predict insurance deductions from types of hospital services [10,11]. In this case, the C5.0, C&R trees, CHAID, and Quest algorithms were applied and a 10-fold cross validation was used. Also, for estimating the performance of the predictive models, the records of 2009 (about 53,795 cases) were used as the training dataset while those of 2010 were used as the validation dataset. The results obtained are reported in Table 4.
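
The two preparation rules stated above, zeroing deduction rates below 3 percent and separating the 2009 and 2010 records, can be sketched in a few lines of Python. This is a minimal illustration, not the preprocessing code used in the study: the record layout (`year`, `service`, `rod`) and the sample values are hypothetical.

```python
ROD_THRESHOLD = 0.03  # rates of deduction below 3 percent are treated as zero


def clean_rod(rod):
    """Return 0.0 for negligible deduction rates, otherwise the rate unchanged."""
    return 0.0 if rod < ROD_THRESHOLD else rod


def split_by_year(records, train_year=2009, validation_year=2010):
    """Split records into a training set and a validation set by year."""
    train = [r for r in records if r["year"] == train_year]
    validation = [r for r in records if r["year"] == validation_year]
    return train, validation


# Hypothetical records; the field names and values are illustrative only.
records = [
    {"year": 2009, "service": "medication", "rod": 0.010},
    {"year": 2009, "service": "laboratory", "rod": 0.120},
    {"year": 2010, "service": "supplies", "rod": 0.029},
    {"year": 2010, "service": "medication", "rod": 0.250},
]

# Apply the threshold without mutating the original records.
prepared = [{**r, "rod": clean_rod(r["rod"])} for r in records]
train, validation = split_by_year(prepared)
```

Splitting by year rather than at random keeps the validation set strictly later in time than the training set, which matches the goal of predicting future deductions from past cases.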

TABLE IV. COMPARISON OF THE RESULTS OBTAINED FROM THE ALGORITHMS USED
As the purpose of this analysis was to extract reliable, useful, and meaningful rules for managers and administrators of hospitals and insurance providers, the huge number of patterns (1721 rules) discovered did not seem sensible or usable. The human brain is reportedly incapable of processing a large number of logical phrases and rules, as it is hard for it to make good sense of them [31,32]. The evaluation step was, therefore, applied to prioritize the rules extracted. In this study, certain important performance measures were initially defined and the TOPSIS method was used to rank the rules that could be extracted. Thus, the following concepts were defined as performance measures:
• Accuracy (ACC): The correct classification rate of the rule based on the test dataset.
• Stability (STAB): A rule should not show great variation in its accuracy rate when applied to different datasets. Thus, one might minimally expect that a rule does not exhibit a great variation between the validation dataset and the training dataset. Then, STAB = min{ACC_t / ACC_v, ACC_v / ACC_t}, where ACC_t and ACC_v are the accuracy rates on the training and validation datasets, respectively.
• Simplicity (SIMP): This limits the number of attributes in a rule.
• Discriminatory Power (DP): The ratio of discriminated cases for the rule. Ideally, one would like to have rules (leaves) that are totally pure (i.e., all the classes except one have a zero probability for each leaf), but in many cases this does not occur, and so the class associated with the rule (leaf) is simply the class with the largest frequency for the given rule based on the training dataset.
• ROD: The ratio of the deduction of the rule to the total amount.
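
As a small illustration of the first two measures, the sketch below computes ACC and STAB for a single rule in Python. The prediction lists are made up for the example, and returning a stability of zero when either accuracy is zero is an assumption added here to avoid a division by zero; the source does not specify that case.

```python
def accuracy(predictions, actuals):
    """Share of cases for which the rule's predicted class matches the actual class."""
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def stability(acc_train, acc_validation):
    """STAB = min(ACC_t / ACC_v, ACC_v / ACC_t); 1.0 means identical accuracy."""
    if acc_train == 0 or acc_validation == 0:
        return 0.0  # assumption: an accuracy of zero is treated as fully unstable
    return min(acc_train / acc_validation, acc_validation / acc_train)


# Hypothetical predictions of one rule on the training and validation datasets.
acc_t = accuracy(["yes", "yes", "no", "no", "yes"], ["yes", "yes", "no", "yes", "yes"])
acc_v = accuracy(["yes", "no", "no", "no"], ["yes", "no", "yes", "yes"])
stab = stability(acc_t, acc_v)
```

Because the minimum of the two ratios is taken, STAB always lies between 0 and 1 and penalizes a rule equally whether it performs better on the training data or on the validation data.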
As already mentioned, the best alternative in the TOPSIS approach is the one nearest to the ideal solution and farthest from the negative ideal solution. Also, it is assumed ... In this table, the columns for the criteria defined are the normalized scores of each rule, d* is the deviation from the ideal alternative, d− is the deviation from the negative ideal alternative, and CC is the relative closeness to the ideal solution. All the rules were then sorted based on the CC column from the TOPSIS calculation and the most important rules were extracted for planning and decision making by managers and administrators of hospitals and insurance providers. Some of the results are presented in Table 6. Using these rules and information, hospital managers can revise their policies for similar cases as to how to reimburse the expenses of their medical services and can negotiate with insurance providers on how to deal with insured patients receiving similar services. Moreover, insured patients can in this way be fully informed about the services covered by insurance companies.
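
The ranking of rules by closeness coefficient can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the computation performed in the study: it uses the common vector normalization form of TOPSIS, equal criterion weights (the weights used in the study are not recoverable from the text), treats all five measures as benefit criteria with SIMP expressed as a score where higher means simpler, and ranks three made-up rules whose scores are not results from the case study. It also assumes that no criterion column is all zeros and that the alternatives are not all identical.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Return the closeness coefficient of each alternative (row) in `matrix`."""
    n_crit = len(weights)
    # Step 1: vector normalization of each criterion column.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    # Step 2: weighted normalized decision matrix.
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 3: positive and negative ideal solutions per criterion.
    cols = list(zip(*v))
    pis = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    nis = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 4: Euclidean distances to both ideal solutions.
    d_pos = [sqrt(sum((x - p) ** 2 for x, p in zip(row, pis))) for row in v]
    d_neg = [sqrt(sum((x - q) ** 2 for x, q in zip(row, nis))) for row in v]
    # Step 5: closeness coefficient.
    return [dn / (dp + dn) for dp, dn in zip(d_pos, d_neg)]


# Hypothetical rule scores in the order ACC, STAB, SIMP, DP, ROD (illustrative only).
rules = {
    "R1": [0.90, 0.95, 0.80, 0.85, 0.30],
    "R2": [0.70, 0.60, 0.50, 0.60, 0.10],
    "R3": [0.80, 0.90, 0.60, 0.70, 0.20],
}
weights = [0.2] * 5      # assumption: equal weights
benefit = [True] * 5     # assumption: higher is better for every criterion

cc = dict(zip(rules, topsis(list(rules.values()), weights, benefit)))
# Step 6: rank the rules by descending closeness coefficient.
ranking = sorted(cc, key=cc.get, reverse=True)
```

With these scores, R1 is best on every criterion and R2 is worst on every criterion, so they coincide with the positive and negative ideal solutions and the ranking is R1, R3, R2. Changing `weights` or flipping an entry of `benefit` to `False` (for a cost criterion) is the natural way to run the kind of sensitivity analysis mentioned for the user interface module.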

VI. CONCLUSIONS
Hospital management is a most complex decision making process that has to deal with extensive arrangements related to financial and administrative processes, medical operations, and patient services. The decision support system is an effective technology that makes it possible to properly respond to such hospital management requirements.
One major challenge commonly arising between insurance companies and hospital managers is the dispute over the reimbursement of medical expenses of insured patients. A majority of hospital revenues are reclaimed as per contracts with insurance companies which provide insurance policies.
The "health insurance deduction" refers to the money not reimbursed by insurance companies for medical services provided by hospitals despite the contractual arrangements. This paper presented an integrated framework for handling health insurance deductions based on DSS, DM, and MCDM methodologies.
Nowadays, decision makers invariably need to use DSS to tackle complex decision making problems. In this area, DM plays an important role in extracting valuable information. Also, MCDM methods deal with such varied areas as choosing the best option among various alternatives and optimizing goals in multi-objective situations.
The proposed framework is capable of enhancing decision making performance, improving the effectiveness of the solutions developed, and enhancing the possibilities for tackling new types of problems not addressed before. Application of the proposed method to a case study yielded objective and comprehensive results which assist hospital managers in negotiating with insurance providers on how to handle insurance deductions.
In future work, we will apply the proposed framework to other aspects of hospital management, medical diagnosis, and possibly other applications.

TABLE I. SOME SELECTED ITEMS IN EACH CATEGORY OF HOSPITAL DOCUMENTS

Table 3 presents the distribution of deductions according to types of services. As can be seen, almost half the insurance deductions belonged to medications, laboratory test charges, and supplies used.

TABLE VI. THE FINAL RESULTS