A Literature Review on Medicine Recommender Systems

Abstract: Medicine recommender systems can assist medical care providers with the selection of an appropriate medication for their patients. The advanced technologies available today can help in developing such recommendation systems, which can lead to more concise decisions. Many existing medicine recommendation systems have been developed based on different algorithms. Thus, it is crucial to understand the state-of-the-art developments of these systems, their advantages and disadvantages, as well as the areas which require more research. In this paper, we conduct a literature review of the existing solutions for medicine recommender systems, describe and compare them based on various features, and present future research directions.


I. INTRODUCTION
Hospitals have access to vast amounts of data about patients and their health parameters. Thus, there is a need for a convenient way for medical professionals to utilize this information effectively. An example would be access to aggregated information from an existing database on a specific problem at the point of care when it is necessary. Moreover, more drugs, tests, and treatment recommendations (e.g., evidence-based medicine or clinical pathways) become available to medical staff every day. Thus, it becomes increasingly difficult for them to decide which treatment to provide to a patient based on her symptoms, test results or previous medical history. On the other hand, all these data can be used to strive for personalized healthcare, which is currently on the rise and predicted to become a major disruptive trend in healthcare in the upcoming years.
Therefore, a recommendation engine for medical use could be employed to fill this gap and support decision making during therapy. Based on a patient's current health status, medical history, current medications, symptoms and past treatments, the engine can look for individuals with similar parameters in the database. In the end, the recommender system would suggest the drugs that were most successful for similar patients. With the help of such a system, the doctor will be able to make a better-informed decision on how to treat a patient. IBM's artificial intelligence machine Watson Health [8] is already able to find a suitable treatment for patients based on other patients' outcomes and evidence-based medicine. IBM claims that 81% of healthcare executives familiar with Watson Health agreed that it has a positive impact on their business. This demonstrates that the use of technology and analytics is becoming increasingly important in healthcare.
In this paper, we review the existing medicine recommendation system solutions and compare them based on various features. The goal is to present the existing solutions to healthcare providers in order to improve the medicine selection process and help select an appropriate medication for patients.
The rest of this paper is organized as follows: The methodology for the literature review is presented in Section II. In Section III, we discuss the findings. Section IV presents the limitations. Finally, Section V concludes the paper and presents the future work.

II. RESEARCH METHOD
We conducted our literature review in several steps, following the guidelines defined in [9]. First, we defined search terms based on population, intervention, outcome of relevance and experimental design. However, we concluded that for our approach the population contains all healthcare facilities. Since this population is so comprehensive and non-specific, we excluded keywords about the population. This resulted in the following major keywords:
• Intervention: medication recommendation system
• Outcome of relevance: system for medication recommendation
• Experimental design: empirical studies, systematic literature reviews, solution descriptions
The intervention and outcome of relevance categories are the same. Therefore, they were included only once. Once this had been agreed on, the search algorithm was constructed. The logical operators AND and OR were used to combine the search terms defined in the previous step. The following synonyms were considered:
To verify the algorithm and the terms used, we conducted a test with some papers we already knew. The test was successful as we could find relevant papers.
www.ijacsa.thesai.org
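As a rough illustration of how such a search string can be assembled, the sketch below joins the synonyms of each keyword group with OR and the groups with AND. The first term of each group is taken from the major keywords of this section; the synonym "drug recommendation system" and the function name `build_query` are assumptions made for this example and are not the terms actually used in the review.

```python
def build_query(groups):
    """Join synonyms with OR inside a group and the groups with AND."""
    clauses = []
    for synonyms in groups:
        quoted = " OR ".join(f'"{term}"' for term in synonyms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Keyword groups; the second synonym in the first group is hypothetical.
keyword_groups = [
    ["medication recommendation system", "drug recommendation system"],
    ["empirical study", "systematic literature review", "solution description"],
]

query = build_query(keyword_groups)
print(query)
```

Each database has its own query syntax, so a string built this way would typically need small adjustments per search engine.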
Afterwards, we chose the databases to search based on available access, which led to five databases:
We also agreed on using Google Scholar as a sixth source because it provides results from a wide variety of databases which we might not have included and thus can lead to a higher number of relevant papers.
Then, we agreed on inclusion and exclusion criteria, which are defined as follows:
• Inclusion criteria:
1) Conference proceedings and journals published after 1999
2) Studies focusing on medicine recommendation systems in general and/or specified for any disease
3) Studies focusing on medicine recommendation systems based on graph databases
• Exclusion criteria:
1) Papers published before 2000
2) Manuscripts written in a language other than English
3) Technical reports and white papers as well as graduation projects, Master's theses and PhD dissertations
4) Textbooks (print and electronic)
5) Studies in other domains of knowledge
Finally, some quality criteria for the papers which met the inclusion criteria were defined to guarantee a selection of high-quality papers only. A scoring system was used. For each of the following criteria met, a paper is assigned one point:
• Logical and reasonable in results and findings regarding the domain of knowledge
• Clearly stated objectives, results and findings regarding the domain of knowledge
• Well-presented and justified arguments
• Reasonably tested and/or applied system
• Well referenced with a minimum of ten sources
Only papers which met all criteria, and thus had 5 points, were included in the final selection.
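The scoring rule can be sketched in a few lines of Python. The criterion keys and the two example papers are hypothetical; only the rule itself (one point per criterion met, inclusion at five points) comes from the text.

```python
QUALITY_CRITERIA = (
    "logical_results",
    "clear_objectives",
    "justified_arguments",
    "tested_or_applied",
    "well_referenced",
)


def quality_score(paper):
    """Return one point for every quality criterion the paper meets."""
    return sum(1 for criterion in QUALITY_CRITERIA if paper.get(criterion, False))


def passes_quality_check(paper):
    """A paper is kept only if it meets all five criteria."""
    return quality_score(paper) == len(QUALITY_CRITERIA)


# Hypothetical assessment of two candidate papers.
paper_a = {name: True for name in QUALITY_CRITERIA}
paper_b = dict(paper_a, well_referenced=False)  # fewer than ten sources

print(quality_score(paper_a), passes_quality_check(paper_a))
print(quality_score(paper_b), passes_quality_check(paper_b))
```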

III. RESULTS AND DISCUSSION
This section describes the final collection of papers in more detail, compares them with each other regarding different parameters, summarizes their approaches and defines research gaps. Table I presents the number of papers for the initial and final phases as well as the rate of included papers in percent. Also, to create more transparency, we included a column per phase for the results of the search with Google Scholar.
As shown in Table I, 52 documents were included initially. After the screening and cross-evaluation as described above, 13 documents remained. The IEEE database has the highest inclusion rate, with 5 of the initial papers being included in the final set (71%). Of the initial 22 papers from the ACM database, 3 remained, leading to an inclusion rate of 14%. Also, it is surprising that none of the papers from ScienceDirect, Elsevier and John Wiley Inc. could be included in the final set of documents. One reason for that might be that publications about recommendation systems are not published in those databases, maybe because the editors of those publications prefer other journals and conferences.
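The inclusion rates are simple ratios of final to initial paper counts, rounded to whole percent. The short check below reproduces them from the counts given in the text. The initial IEEE count of 7 is not stated explicitly; it is inferred here from "5 included" and "71%" and should be read as an assumption.

```python
def inclusion_rate(included, initial):
    """Share of initially retrieved papers kept in the final set, in percent."""
    return round(100 * included / initial)


# Counts stated in the text; the IEEE initial count of 7 is inferred
# from "5 included" and "71%" and is therefore an assumption.
print(inclusion_rate(13, 52))  # all sources combined
print(inclusion_rate(3, 22))   # ACM
print(inclusion_rate(5, 7))    # IEEE (initial count inferred)
```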
Furthermore, for Google Scholar, 11 different publication venues were included in the initial selection, of which 5 were included in the final selection. Five of the initial papers retrieved from Google Scholar (29%) were published by the IEEE. In the final selection, 44% of the papers from Google Scholar were published by the IEEE. This leads to the conclusion that medicine recommendation systems are a widely discussed topic with many specifics, which is also reflected in the different subject areas of the journals, but the IEEE database seems to be the most attractive venue for publication in this area. However, in general, approaches and techniques related to medicine recommendation systems are published in a wide variety of journals.
Specific journals, such as the Journal of Biomedical Semantics, seem to be promising for future literature research in this area, although the results retrieved via the IEEE database met the inclusion and quality criteria by far the most often. Moreover, more than 50% of the documents from Google Scholar were included. This is not surprising since Google Scholar fetches documents from many different databases. Despite the strict quality criteria of our study, 25% of all initially selected papers were included in the final set of papers. This indicates a high quality of the databases searched.

A. Categorization of Approaches
All the papers included in the final selection are categorized and summarized.

1) Ontology and rule-based medicine recommendation systems:
The drug recommendation system GalenOWL [4] is based on the Greek drug guide GALINOS, where doctors can search for a drug and find details on it as well as additional information, such as interactions with other drugs. The paper describes a system that recommends drugs for a patient based on the patient's disease, allergies and known drug interactions for the drugs in the database. To recommend the best-fitting drug, rules for medications and interactions are stored in the system, which is based on ontologies, ICD codes and other information. The application is accessible via the browser.
The drug-drug and drug-disease interaction discovery framework Panacea [5] builds on the GalenOWL approach and uses standardized medical terms and a rich knowledge base, both of which are modeled as rules. The authors used the SKOS vocabulary, an ontology and reasoning engine, and a medical, rule-based reasoning approach. The results show that Panacea is a promising solution but still needs some improvement.
SemMed [14], a medical recommendation engine based on Semantic Web technologies, applies an ontology-based approach. It consists of an inference engine, a rules manager, a support database, and an ontology manager. The core classes "Diseases", "Medicines" and "Allergies" were used to develop rules.
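To give a feel for how rules over diseases, medicines and allergies can narrow down a recommendation, the following sketch applies three hand-written rules in plain Python. It is not the ontology or inference engine of any reviewed system: the class names, the rules and the placeholder knowledge base are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Medicine:
    name: str
    treats: set                      # diseases the medicine is indicated for
    allergens: set = field(default_factory=set)
    interacts_with: set = field(default_factory=set)


@dataclass
class Patient:
    diseases: set
    allergies: set = field(default_factory=set)
    current_medicines: set = field(default_factory=set)


def recommend(patient, medicines):
    """Apply three rules: indication, no allergy conflict, no interaction."""
    suitable = []
    for med in medicines:
        if not (med.treats & patient.diseases):
            continue  # rule 1: must be indicated for one of the diseases
        if med.allergens & patient.allergies:
            continue  # rule 2: must not contain a known allergen
        if med.interacts_with & patient.current_medicines:
            continue  # rule 3: must not interact with current medication
        suitable.append(med.name)
    return suitable


# Entirely made-up knowledge base, for illustration only.
knowledge_base = [
    Medicine("drug_a", treats={"disease_x"}, allergens={"substance_p"}),
    Medicine("drug_b", treats={"disease_x"}, interacts_with={"drug_z"}),
    Medicine("drug_c", treats={"disease_x"}),
    Medicine("drug_d", treats={"disease_y"}),
]
patient = Patient(diseases={"disease_x"}, allergies={"substance_p"},
                  current_medicines={"drug_z"})
print(recommend(patient, knowledge_base))
```

In an ontology-based system, the same constraints would be expressed declaratively and evaluated by a reasoner instead of being hard-coded in a loop.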
Another solution, proposed in [2], utilizes an ontology for anti-diabetic drug recommendations. However, it also includes the Multiple Criteria Decision-Making approach to compute weights and rank the drugs. It mainly utilizes laboratory data, but also considers risk and benefit factors.
IRS-T2D [11] is a drug recommender system which was specifically developed to individualize the treatment of type 2 diabetes mellitus patients. The solution combines rule-based decision making with ontologies and semantic web technologies while taking specific patient information, such as the individual HbA1c target, into consideration.
Chen et al. [3] used the Semantic Web Rule Language to describe the relationship between the rules retrieved from AACEMG. With the rules and the knowledge from the patient ontology and the medicine ontology, an inference is derived utilizing the Java Expert System Shell. The inference is then displayed in the system interface.
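The weighting-and-ranking idea behind the Multiple Criteria Decision-Making step of [2] can be illustrated with a plain weighted sum. This review does not state which MCDM method [2] applies, so the weighted-sum model below is a stand-in, and the criteria, weights and scores are hypothetical.

```python
def rank_by_weighted_sum(scores, weights):
    """Rank alternatives by the weighted sum of their criterion scores.

    scores:  {alternative: {criterion: value in [0, 1], higher is better}}
    weights: {criterion: weight}; weights are normalized to sum to 1.
    """
    total = sum(weights.values())
    normalized = {c: w / total for c, w in weights.items()}
    totals = {
        alt: sum(normalized[c] * values[c] for c in normalized)
        for alt, values in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, already normalized scores; risk is stored as "low_risk"
# (1 - risk) so that a higher value is always better.
scores = {
    "drug_a": {"benefit": 0.9, "low_risk": 0.4},
    "drug_b": {"benefit": 0.6, "low_risk": 0.9},
    "drug_c": {"benefit": 0.5, "low_risk": 0.5},
}
weights = {"benefit": 2, "low_risk": 1}

for name, total in rank_by_weighted_sum(scores, weights):
    print(f"{name}: {total:.3f}")
```

Changing the weights changes the ranking, which is why the way the weights are computed matters as much as the scores themselves.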

2) Data mining and machine learning-based medicine recommendation systems:
The approach proposed by Sun et al. [15] analyzes EMR records to detect typical treatment regimens and quantitatively measures the effectiveness of those regimens for specific patient cohorts. The authors measure the similarity between the treatment records in the EMR, cluster similar ones into treatment regimens based on MapReduce Enhanced Density Peaks based Clustering, extract semantically meaningful information for the doctor and estimate the treatment outcome for a patient cohort for a typical treatment regimen. The results of applying this approach in an empirical study show that both the effective rate and the cure rate of patients increase.
Hamed et al. [7] utilized tweets from Twitter to analyze the well-being of the tweeter and to give recommendations about alternative medicine possibilities. To this end, the authors retrieve the information from the tweets, send the tweeter a questionnaire to get more information about her state and apply a trained C4.5 decision tree algorithm to predict the condition of the user. Based on that, the algorithm can derive a recommendation for an alternative medical product.
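C4.5 chooses the attribute to split on by its gain ratio, i.e. the information gain divided by the split information. The sketch below computes only this split criterion on a tiny, made-up questionnaire data set; it is not a full C4.5 implementation (no tree construction or pruning) and not the trained model of [7]. The attribute names and answers are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    labels = [row[target] for row in rows]
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    weighted = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    gain = entropy(labels) - weighted
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical questionnaire answers with a known condition label.
rows = [
    {"sleep": "poor", "stress": "high", "condition": "unwell"},
    {"sleep": "poor", "stress": "low", "condition": "unwell"},
    {"sleep": "good", "stress": "high", "condition": "well"},
    {"sleep": "good", "stress": "low", "condition": "well"},
]

best = max(["sleep", "stress"], key=lambda a: gain_ratio(rows, a, "condition"))
print(best)
```

In this toy data set, "sleep" separates the two conditions perfectly while "stress" carries no information, so the root split falls on "sleep".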
DiaTrack was developed by Medvedeva et al. [12] as a drug recommendation system for type 2 diabetes and intends to give doctors a dashboard where they can see similar patient cases and their reaction to a drug or other factors. To this end, the system compares the disease patterns of multiple patients and returns the results in a color-coded, easy-to-understand graph.
The approach proposed by Kushwaha et al. [10] describes a drug recommendation system based on semantic web technology and data mining algorithms. Those two methods were combined to first extract semantic data and then apply data mining algorithms to those data. Data mining algorithms were used to individualize the treatment depending on the patient's attributes. The system will not recommend drugs which the patient took before or that would interact with drugs the patient took before.
A hybrid framework to recommend drugs by ranking is proposed in [16]. Practitioners make inquiries and order lab tests. Information about the patient is entered into the system as a new case during the process. The system will process the new data and extract patient features. A diagnosis is made based on the patient's problem. The diagnosis is matched to a specific disease category in the system to determine which symptom-drug classifier to use. Patient features in the new case are put into the classifier to predict which drug cluster or clusters to choose for this patient. Drugs in each cluster will be ranked by the ranking module to form the final recommendation list.
Mahmoud et al. [1] investigated three different algorithms, namely Support Vector Machine (SVM), Back Propagation neural network, and ID3 decision tree, to find out which algorithm is optimal for a drug recommendation framework. The evaluation is based on scalability, accuracy, and efficiency. Since accuracy is the most important criterion for recommending a drug, the SVM algorithm was identified as the most useful algorithm. The next steps are to implement the model along with the data preparation, visualization, and database system modules.
Another drug recommendation system is a cloud-based platform utilizing various algorithms [17]. Using the vector service model, the drug character is formatted according to the description of the drug information. Then a k-means algorithm is applied to cluster drugs. Subsequently, an evaluation using collaborative filtering leads to recommendations. Finally, tensor decomposition is applied to address sparsity and massive data, which are shortcomings of collaborative filtering. This multi-step process helps to make an accurate recommendation.
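The collaborative filtering step of such a pipeline can be sketched as user-based filtering with cosine similarity. The example covers only that single step, not the k-means clustering or the tensor decomposition, and it is not the implementation of [17]. The user names, drug names and ratings are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(target, ratings, top_n=2):
    """Score unseen drugs by similarity-weighted ratings of other users."""
    scores, weights = {}, {}
    for user, user_ratings in ratings.items():
        if user == target:
            continue
        sim = cosine(ratings[target], user_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for drug, rating in user_ratings.items():
            if drug not in ratings[target]:
                scores[drug] = scores.get(drug, 0.0) + sim * rating
                weights[drug] = weights.get(drug, 0.0) + sim
    predicted = {drug: scores[drug] / weights[drug] for drug in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


# Hypothetical effectiveness ratings (1 = poor, 5 = very effective).
ratings = {
    "user_1": {"drug_a": 5, "drug_b": 4},
    "user_2": {"drug_a": 5, "drug_b": 4, "drug_c": 5},
    "user_3": {"drug_a": 1, "drug_d": 5},
    "user_4": {"drug_b": 4, "drug_c": 4, "drug_d": 2},
}
print(recommend("user_1", ratings))
```

With few ratings per user, most similarity values are based on very little overlap, which is the sparsity problem mentioned above.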

B. Characteristics of Approaches
Tables II, III and IV present the results of the literature review. The columns refer to the different dimensions along which we compare the studies. For each table, a short summary and discussion of each column is presented.
In Table II, the column "Disease" describes whether the concept described in the study focuses on a specific disease. "Data storage" summarizes the method applied to store the data. "Interface" refers to the connection of the back-end modules with each other and with the front-end. The column "Data collection" describes the sources from which the data used for testing the approach, if applicable, were gathered.
1) Disease: As shown in Table II, most studies do not focus on a particular disease. This shows that most work in this field attempts to develop a general-purpose recommendation engine. Finding a recommender system that will work for all diseases would be very useful for general practitioners. However, all studies dealing with a specific disease focus on drug recommendation for diabetes. This suggests that this type of disease is relatively important and well suited for a drug recommendation system. Since a highly individualized treatment is required for diabetes, this is also reasonable.
2) Data storage: Data storage is not widely discussed in the studies we reviewed. Five out of 13 papers mention their data storage approach for datasets such as patient data and drug data. This shows that the studies mostly prefer to focus on selected parts of the solution, such as the algorithm. The studies that do include data storage all store their data sets in different ways.
GalenOWL [4] stores data in RDF graphs and utilizes SPARQL queries, whereas the SWRL-based system [3] leverages a software tool called Protégé [6] to store its data. The authors of [17] utilize cloud storage services, and IRS-T2D [11] applies ontologies and semantic web technologies. This shows that there is no standardized approach to storing data, although the data sets comprise similar data from the electronic medical record (EMR).
3) Interface: Little information is provided about the interface of drug recommendation systems. The focus is on the recommendation algorithm. Two studies utilize Protégé for the interface and the Semantic Web Rule Language to show the output of the algorithm. On the other hand, DiaTrack [12] leverages dynamic-service middleware to provide a visualization of the output. This shows that drug recommendation systems are still in development. Generally, it seems that the focus turns to the user experience once the recommendation engine is defined. Moreover, for the studies that did provide information about user interfaces, Protégé is the framework most often applied. Protégé [13] is an open-source ontology editor that provides developers with a user interface to create intelligent systems. Developed by Stanford University, this application is appealing due to its free-to-use license terms, its active community for support and its extensible environment.
In Table III, the column "Data preparation" relates to the steps applied to the raw data so that they fit the algorithm. "Platform/Technology" refers to the technology and/or platform utilized for the implementation.

5) Data preparation:
In Table III, we found that the studies apply different data preparation procedures, for instance regarding data formatting and cleaning. This seems reasonable if we assume that the format of the data differs across the studies. Furthermore, the table shows that most studies utilize data preparation modules to provide an acceptable format for their algorithm module. Studies such as [1], [16] or [2] use normalization techniques to uniformly scale the data across the modules. On the other hand, some authors decided to prepare the data manually. In the case of T-Recs [7], tweets were classified as either relevant or irrelevant. In the study in [15], the most relevant medicines were selected and the treatment records were divided into four periods. Since most studies provide information about data preparation, this aspect appears to be essential when developing a recommendation engine.
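Normalization with minimum and maximum values, as reported for [1], is commonly done by min-max scaling, which maps each value linearly to the range [0, 1]. The sketch below shows this standard technique; the function name and the sample readings are hypothetical, and the exact procedure used in the reviewed studies may differ.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - low) / (high - low) for v in values]


# Hypothetical laboratory readings for one feature.
readings = [5.0, 7.5, 10.0]
print(min_max_normalize(readings))
```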
6) Platform/Technology: With regard to the platform/technology of the recommender systems, three of the studies use online services. CADRE [17] utilizes a cloud platform to give medicine recommendations based on symptoms. The LOD cloud mining study [10] leverages semantic knowledge from the LOD cloud. GalenOWL [4] uses semantic-enabled online services to provide drug-drug and drug-disease interaction discovery. Furthermore, T-Recs [7] utilizes Twitter to monitor tweet sentiment, analyze the tweets, and calculate recommendations. Other studies apply rule-based inference engines such as Pellet and Jena/Drools. In all 13 studies, the technology used to apply the algorithm was different. This shows that researchers do not restrict themselves to one specific software tool, but utilize the various possibilities available. Despite this flexibility, scientists need to consider the costs of the technology to make it reasonable for an average hospital to purchase it. Hence, open-source software that was used to develop the algorithms, such as Protégé, Pellet and the Jena rule engine, seems to be a reasonable and preferable choice.
Table IV compares the studies in terms of the algorithms used and presents future work identified by these studies. The algorithms used in the reviewed studies were described earlier in Section III.

TABLE III

Study | Data preparation | Platform/Technology
Data-driven Automatic Treatment Regimen Development and Recommendation [15] | Yes (select most relevant medicines (138); divide treatment record into 4 periods) | Custom
Panacea, a semantic-enabled drug recommendations discovery framework [5] | Yes (applying the SKOS vocabulary) | Querying instance and knowledge base; rule engines (Jena/Drools rule engine)
SemMed [14] | No information | Inference engine; rules manager; support DB and ontology manager
The recommendation of medicines based on multiple criteria decision making and domain ontology [2] | Yes (normalization of benefit and risk factors) | No information
T-Recs [7] | Yes (manual distinction between relevant and irrelevant tweets; grouping of tweets) | Twitter; tweet sentiment monitor; tweet analysis and computing recommendation
GalenOWL [4] | Yes (ATC, ICD-10, UNII, Substances, Conditions, Indications-Contraindications) | Online service
IRS-T2D [11] | No information | No information
DiaTrack [12] | No information | A standard web-browser front-end for Data Entry, Research, Practice Administration and Site Administration
LOD Cloud Mining for Prognosis Model [10] | Yes (queried with SPARQL) | LODD cloud, queries on drug data with SPARQL 1.1 with Java IDE, database: RDF dump stored in Sesame, app uses the server of Sesame
A framework of hybrid recommender system [16] | Yes (Text Mining Module, Data Normalization, Drug Clustering Module) | No information
A recommendation system based on domain ontology and SWRL for anti-diabetic drugs selection [3] | Yes (inference engine (Pellet) transformed the format acceptable to the recommendation system) | Medicine ontology was created by Protégé; inference engine (Pellet)
An Intelligent Medicine Recommender System Framework [1] | Yes (data normalization using min and max functions; correlation analysis using Chi-Square tests) | No information
CADRE [17] | No information | Cloud

This extensive literature review shows that there are many solutions for drug recommendation systems. Most of them are based on manually constructed ontologies and use sophisticated data mining or machine learning methods. Especially the processes including manual work are very time consuming. Also, none of these approaches utilizes a graph database to model the relationships between patients and to apply an algorithm to this model, although this might be a well-suited approach. Graph databases can model the data in graphs which is a more natural way to store data than any other database offers. Medical institutions usually have many patients who can be illustrated in a graph as a network of patients. Thus, this approach may be superior to the ones discussed earlier in this paper and also addresses the last topic listed for future research. The reasons for that are the unique features of graph databases, such as high consistency and high scalability.

IV. LIMITATIONS
Our literature review has two main limitations, namely the paper selection and the paper content. Out of 52 papers, only 13 were reviewed based on the strict inclusion, exclusion and quality criteria we chose. Along with the strict search criteria, the systematic review included papers from a limited number of databases. However, we used six main databases that are well known.
Some papers offer little detail on the exact implementation and architecture of the solutions built. This made it more difficult to assess which applications were used to build the system. Also, some papers, such as [16], proposed only a theoretical solution on how to recommend a drug but did not implement it. On the other hand, some papers, such as [2], did implement the solution but did not evaluate its performance. Therefore, several questions remain open, such as "how accurate are recommender systems?" and "do they reduce the symptoms patients have?"

V. CONCLUSIONS AND FUTURE WORK
This paper presented a systematic literature review of medicine recommendation engines. We reviewed 13 studies from six different databases that met our strict criteria. These studies can be split into two categories: (i) machine learning and data mining-based approaches, and (ii) ontology and rule-based approaches. The studies were summarized and evaluated across several parameters: diseases, data storage, interface, data collection, data preparation, platform/technology, algorithm, and future work. Most of the studies that did not focus on any disease had less information about data storage, interface, data collection, data preparation, platforms and technology, and customized algorithms.
For future work, our review suggests extending the existing solutions by adding recommendations for the dosage of drugs, as well as building highly scalable solutions. Also, based on the evaluation, we identified that none of the studies we reviewed includes a graph database in its solution for a drug recommendation system. Graph databases such as Neo4j seem to be very suitable for drug recommendation engines because they are highly scalable and consistent, which would address the last of the aforementioned topics for future work. Furthermore, their data model seems promising for recommendation systems due to its network structure and ease of querying. Hence, another direction for future research would be the creation of medicine recommendation engines based on graph databases.
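To make the graph idea more concrete, the sketch below stores patients, symptoms and drugs as nodes of a small in-memory graph and recommends drugs by walking from a patient over shared symptoms to similar patients and on to the drugs they responded to. It uses plain Python dictionaries as a stand-in for a graph database such as Neo4j, and all node names and relationship types are invented for illustration.

```python
from collections import Counter, defaultdict

# Edges are stored as graph[node][relationship] -> set of neighbor nodes.
graph = defaultdict(lambda: defaultdict(set))


def add_edge(source, relationship, target):
    """Add a relationship and its inverse so the graph can be walked both ways."""
    graph[source][relationship].add(target)
    graph[target]["inv_" + relationship].add(source)


# Hypothetical patient network.
add_edge("patient_1", "HAS_SYMPTOM", "symptom_x")
add_edge("patient_1", "HAS_SYMPTOM", "symptom_y")
add_edge("patient_2", "HAS_SYMPTOM", "symptom_x")
add_edge("patient_2", "HAS_SYMPTOM", "symptom_y")
add_edge("patient_3", "HAS_SYMPTOM", "symptom_y")
add_edge("patient_2", "RESPONDED_TO", "drug_a")
add_edge("patient_3", "RESPONDED_TO", "drug_b")


def recommend(patient):
    """Rank drugs by the number of symptom paths leading to them."""
    votes = Counter()
    for symptom in graph[patient]["HAS_SYMPTOM"]:
        for other in graph[symptom]["inv_HAS_SYMPTOM"]:
            if other == patient:
                continue
            for drug in graph[other]["RESPONDED_TO"]:
                votes[drug] += 1
    return [drug for drug, _ in votes.most_common()]


print(recommend("patient_1"))
```

In a graph database, the same traversal would be written as a single path query over the stored relationships rather than as nested loops.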

TABLE I. NUMBER OF PAPERS FOUND IN THE RESPECTIVE SEARCH ENGINES (INCL. GOOGLE SCHOLAR RESULTS AND FINAL INCLUSION RATE)

TABLE IV. RESULTS OF LITERATURE REVIEW IN TERMS OF ALGORITHMS AND FUTURE WORK PRESENTED IN EACH STUDY
As the table indicates, most of the approaches (9 out of 13) state some future work and research areas. Although some of them are rather specific to the technique presented in the paper, it is possible to derive some general fields where more research is required. The three main areas based on this literature review are:
• Verifying the results, e.g. by increasing testing, especially the sample of testing
• Finding solutions which are highly scalable