Covid-19 Ontology Engineering-Knowledge Modeling of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)

The COVID-19 pandemic has spread rapidly across the world since it emerged in Wuhan, China, in December 2019. It has disrupted public health to such an extent that its economic and social impact is enormous. Education, employment, income, and human well-being have all been severely affected by this coronavirus. Nations worldwide are struggling to battle this emergency, and researchers all over the world are carrying out intensive studies to control the pandemic. Medical science has advanced considerably through the application of computer-assisted solutions in health care. Ontology-based clinical decision support systems (CDSS) assist medical practitioners in the diagnosis and treatment of diseases. They are well known for data sharing, interoperability, knowledge reuse, and decision support. This research article presents the development of an ontology for SARS-CoV-2 (COVID-19) to be used in a CDSS proposed for the satellite clinics of the Royal Oman Police (ROP), Sultanate of Oman. The key concepts of COVID-19 and the relationships between them are represented using an ontology. Semantic Web Rule Language (SWRL) is used to model the rules related to the initial diagnosis of the patient, and Semantic Query Enhanced Web Rule Language (SQWRL) is used to retrieve the data stored in the ontology. The developed ontology successfully classified patients into one of the categories non-suspected, suspected, probable, and confirmed. The reasoning time and the query execution time are found to be optimal.
Keywords: COVID-19; ontology; SARS-CoV-2; ontology reasoning; SWRL; SQWRL


I. INTRODUCTION
Around the world, the coronavirus disease crisis is growing at an unprecedented rate. The disease was first reported in December 2019 from Wuhan, the capital city of Hubei Province in China, and the World Health Organization (WHO) declared it a pandemic on 11th March 2020 [1]. The virus that causes this disease is termed Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). As of November 17th, 2020, the number of active cases in the world is 15,526,034 and the number of deaths is 1,335,263 [2]. Due to the coronavirus, countries have issued travel bans, and many have implemented 14 to 21 days of complete lockdown. The sad reality is that people are locked down in their houses and are filled with anxiety, stress, and depression. People are afraid to go out. The relevant authorities of almost all countries provide the necessary support and guidance in all cases, but fear still grips the world.
Physicians and researchers from all over the world are conducting intensive studies to understand this novel virus and its pathogenesis, to find epidemiological factors, to explore different candidate vaccines, to investigate the effect of therapies, and to perform randomized trials, all with the aim of finding effective drugs to handle this global pandemic. Countries are competing to be the first to develop an effective drug to control the situation. Every day, news agencies report success stories from the testing of vaccines against this virus. To date, no approved therapies are known for this virus.
The Ministry of Health (MOH), Sultanate of Oman, reported the first cases of coronavirus on February 24, 2020 [16]. Nine months later, as of November 17th, 2020, Oman had 7922 active cases and 1350 deaths [3]. The country witnessed several lockdowns as part of the control measures to prevent the spread of the virus. While MOH is the main agency overseeing the health sector in Oman, the Royal Oman Police (ROP) hospital is also at the front line in serving the citizens of the country. ROP operates several satellite clinics in the interior regions of Oman. These centres may not have expert doctors to diagnose such diseases, and some of them do not have enough facilities to diagnose coronavirus disease. Because of the fear of community spread, people are unable to travel to hospitals in the capital city to seek expert care and support. To support the health care workers of the ROP satellite clinics in diagnosing COVID-19, we have therefore proposed a knowledge-based clinical decision support system (CDSS).
The objective of this research paper is to outline the design of the COVID-19 ontology that will be used to represent the knowledge base of the proposed decision support system. The ontology covers the symptoms, risk factors, epidemiological factors, initial diagnosis, lab tests, clinical diagnosis, recommendations, and treatment in the context of the Sultanate of Oman. The National Clinical Management Protocol for Hospitalized Patients with COVID-19 and the ICU protocol for management of COVID-19, published by the Ministry of Health, Sultanate of Oman, are used as the clinical guidelines for this research [13]. We created the knowledge base according to these guidelines, which are currently followed in Oman. The development details of the CDSS are outside the scope of this article.
www.ijacsa.thesai.org
The rest of the paper is organized as follows. Background is given in Section 2, followed by the Methodology, which covers the modelling of the ontology and the construction of rules and queries. Results are given in Section 4, followed by the Discussion in Section 5. The performance of the developed ontology is shown in Section 6, and the Conclusion and Future is presented in Section 7.
II. BACKGROUND
Semantic web applications interpret not just the content and structure of a presentation but also the meaning of that content. Machines conduct automatic reasoning to accomplish the task of interpreting textual meaning. Concepts must be structured and accompanied by a set of inference rules to perform logical reasoning [4]. Semantic web technologies include Extensible Markup Language (XML), Resource Description Framework (RDF), ontologies, and many more. These technologies connect the data itself, rather than connecting documents through links. Instead of URLs between documents, the semantic web implements URLs between facts. RDF expresses any sentence in the form of a triple (subject, predicate, and object) [4]. Moreover, a Uniform Resource Identifier (URI) is used to identify every subject and object [4]. XML does not support semantics [4]. Instead, it is used to represent the RDF triples [4]. RDF uses RDF Schema (RDFS) to describe the terms used in a sentence [4].
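The triple structure described above can be illustrated with a minimal Python sketch. It is not an RDF library: the namespace `http://example.org/covid#` and the individuals `Patient1`, `Cough`, and `Fatigue` are assumptions made for illustration, and only the property name `has_symptom` is borrowed from the ontology presented later in this paper.

```python
from typing import NamedTuple

# Assumed namespace, used only for this illustration.
BASE = "http://example.org/covid#"


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def uri(local_name: str) -> str:
    """Build a full identifier from a local name."""
    return BASE + local_name


# Hypothetical facts about two patients.
triples = [
    Triple(uri("Patient1"), uri("has_symptom"), uri("Cough")),
    Triple(uri("Patient1"), uri("has_symptom"), uri("Fatigue")),
    Triple(uri("Patient2"), uri("has_symptom"), uri("No_symptom")),
]


def objects(subject: str, predicate: str) -> list:
    """Return every object linked to the subject by the predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]
```

Because every subject, predicate, and object is an identifier rather than free text, a query such as `objects(uri("Patient1"), uri("has_symptom"))` follows links between facts instead of links between documents.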
In 1993, Tom Gruber proposed the definition of ontology [5] as "An ontology is a formal, explicit specification of a conceptualization". An ontology defines an abstract model of a particular domain. It is explicit because it precisely explains the terms in the domain and expresses the relationships between them, it is formal because it is machine understandable, and it is shared because it is accepted by a group. The primary aim of an ontology is to ease knowledge sharing and reuse [4]. The integration of RDF and the underlying ontology makes the meaning understandable to machines [4]. Web Ontology Language (OWL) extends RDFS and adds features such as union, intersection, cardinalities, and reasoning and inferring capabilities [4]. Description Logics (DL) are used to represent the semantics of OWL [6].
In the case of epidemic outbreaks, action research that integrates different disciplines such as medicine, computer science, sociology, and psychology is mandatory. Such research requires the integration of data from varied sources [7]. Ontologies are well known among biomedical researchers for data sharing, integration, and reuse. They are used to represent domain knowledge in a formal way. Biomedical researchers use ontologies to capture the concepts (entities) in the biomedical domain and to express their connections (relationships). The relationships are semantic in nature and accurately represent the domain terms and how they relate to each other. Ontologies play an important role in knowledge sharing and reuse, and they support automatic reasoning. They are used in clinical decision support systems to support health care workers.
Since their inception as a foundation of Artificial Intelligence (AI), ontologies have been used in different disciplines for knowledge sharing. Successful ontologies in the life sciences include Gene Ontology [8], Disease Ontology [9], SNOMED-CT [10], and ICD-10 [11].

III. METHODOLOGY
Fig. 1 illustrates the entity-relationship diagram of the proposed system. The Patient entity has a relationship with the Symptom, Risk Factor, Epidemiology, Diagnosis, Test, Clinical Diagnosis, Recommendation, and Treatment entities. Based on the Symptoms, Risk Factors, and Epidemiology, the patient is first categorized into one of the cases: non-suspected, suspected, probable, or confirmed. Then, based on the Clinical Diagnosis and Test reports, a suitable Recommendation and Treatment are provided to the Patient. The relationships between Patient and Diagnosis and between Patient and Clinical Diagnosis are 1:1, as the Patient is categorized into exactly one of the states mentioned above. The relationships of the Patient entity with all other entities are M:N.
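The cardinalities in the entity-relationship description can be sketched as a small Python record. This is only an illustration of the 1:1 versus M:N distinction, not the authors' data model: the class `PatientRecord`, its field names, and the lowercase category labels are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Initial diagnosis categories named in the text.
DIAGNOSES = {"non-suspected", "suspected", "probable", "confirmed"}


@dataclass
class PatientRecord:
    """Hypothetical record mirroring the entity-relationship description."""
    name: str
    # M:N relationships: a patient may have many symptoms and risk factors,
    # and each symptom or risk factor may apply to many patients.
    symptoms: set = field(default_factory=set)
    risk_factors: set = field(default_factory=set)
    # 1:1 relationship: a patient holds exactly one diagnosis at a time.
    diagnosis: Optional[str] = None

    def set_diagnosis(self, category: str) -> None:
        """Assign the single diagnosis, replacing any earlier one."""
        if category not in DIAGNOSES:
            raise ValueError(f"unknown diagnosis category: {category}")
        self.diagnosis = category
```

Storing the diagnosis as a single field rather than a set is one simple way to enforce that a patient falls into exactly one category.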

B. Modelling of Domain Knowledge
The process of modeling the domain using ontologies starts with defining the top concept of the domain, which is usually represented by owl:Thing. Then the concept classes in the domain are defined as $C = \{C_1, C_2, \ldots, C_n\}$ [17]. For each concept class $C_i$, sub-classes are also defined as $C_i = \{C_{i1}, C_{i2}, \ldots, C_{im}\}$ to form the concept hierarchy [12].
The key knowledge required to build our knowledge base concerns the symptoms, diagnosis, available treatments, and related aspects of SARS-CoV-2. The important concepts are represented as classes in the ontology. Patient, Diagnosis, Symptom, Background_history, LabTest, RiskFactor, ClinicalDiagnosis, Recommendation, and Treatment are the main concepts. All patient cases are represented as instances of the Patient class. The Symptom class is created to represent the different symptoms of SARS-CoV-2. All the symptoms reported in the Oman and WHO guidelines [13,14] are represented as subclasses of this class. The epidemiological link of a case is represented using the Background_history class through various sub-classes. The RiskFactor class includes subclasses that represent the important risk factors of a case (comorbidities, age greater than 60, consumption of immunosuppressive drugs, etc.).
The Diagnosis class is used to categorize a case into one of the categories non-suspected, suspected, probable, and confirmed. These categories form the subclasses of the Diagnosis class. A case is classified into one of them after reasoning over the ontology. The ClinicalDiagnosis class is used to assign cases to categories such as mild, moderate, severe, and critical, based on the clinical conditions. The LabTest class includes the different investigations (mandatory tests and additional tests) to be performed, depending on the case. The Recommendation class is used to provide recommendations to the users. The treatment suggestions appropriate to the situation of each case are represented in the Treatment class.
Object properties and data properties are shown in Fig. 3 and Fig. 4.
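The concept hierarchy described in this subsection can be sketched as a plain Python mapping from each class to its direct subclasses. The top-level class names follow the text; the subclass names under Diagnosis and ClinicalDiagnosis are assumptions inferred from the categories mentioned, and the helper functions are hypothetical.

```python
# Direct subclasses of each class; "Thing" stands in for owl:Thing.
SUBCLASSES = {
    "Thing": ["Patient", "Diagnosis", "Symptom", "Background_history",
              "LabTest", "RiskFactor", "ClinicalDiagnosis",
              "Recommendation", "Treatment"],
    # Assumed subclass names, based on the categories named in the text.
    "Diagnosis": ["Nonsuspected", "Suspected", "Probable", "Confirmed"],
    "Suspected": ["Asymptomatic", "Symptomatic"],
    "ClinicalDiagnosis": ["Mild", "Moderate", "Severe", "Critical"],
}


def descendants(cls: str) -> list:
    """Return all direct and indirect subclasses of cls, depth first."""
    found = []
    for child in SUBCLASSES.get(cls, []):
        found.append(child)
        found.extend(descendants(child))
    return found


def is_subclass_of(child: str, ancestor: str) -> bool:
    """True if child lies anywhere below ancestor in the hierarchy."""
    return child in descendants(ancestor)
```

For example, `is_subclass_of("Asymptomatic", "Diagnosis")` holds through the intermediate `Suspected` class, which is the kind of transitive subsumption a DL reasoner derives from the asserted hierarchy.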

C. Semantic Web Rule Language and Semantic Query Enhanced Web Rule Language
We use Semantic Web Rule Language (SWRL) to construct the rules. It is an expressive, OWL-based language that adds power to Description Logic (DL). SWRL is a combination of DL and First Order Logic (FOL). Expressing a rule as an antecedent and a consequent allows SWRL to link with relational databases. Rules written in SWRL provide more reasoning ability than OWL alone.
The body of a rule is referred to as the antecedent and the head as the consequent. A rule takes the form antecedent → consequent. Both parts are conjunctions of atoms, where each atom is expressed in the form $p(arg_1, arg_2, \ldots, arg_n)$, with $p$ a predicate symbol and $arg_1, \ldots, arg_n$ its arguments. OWL concepts such as classes, properties, data types, ranges, and built-in functions are used as SWRL predicates. Arguments include the values of properties or OWL instances.
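The antecedent and consequent structure of a rule can be sketched in Python as follows. This is a simplified illustration and not SWRL syntax or any SWRL API: atoms are modelled as tuples, the `Rule` class and the helper functions are hypothetical, and the two-atom rule body is a shortened stand-in for the non-suspected rule discussed later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A rule as two conjunctions of atoms.

    Each atom is a tuple (predicate, arg1, ..., argn); arguments that
    start with "?" are variables.
    """
    antecedent: tuple  # the body
    consequent: tuple  # the head


def substitute(atom: tuple, binding: dict) -> tuple:
    """Replace every bound variable in an atom with its value."""
    predicate, *args = atom
    return (predicate, *[binding.get(a, a) for a in args])


def body_holds(rule: Rule, binding: dict, facts: set) -> bool:
    """True if every antecedent atom, after substitution, is a known fact."""
    return all(substitute(atom, binding) in facts for atom in rule.antecedent)


# Shortened, illustrative version of a non-suspected rule.
rule = Rule(
    antecedent=(("Patient", "?p"), ("has_symptom", "?p", "No_symptom")),
    consequent=(("Nonsuspected_case", "?p"),),
)
facts = {("Patient", "Patient2"), ("has_symptom", "Patient2", "No_symptom")}
```

With the binding `{"?p": "Patient2"}` the body holds, so the head atom `("Nonsuspected_case", "Patient2")` may be asserted. A unary atom plays the role of class membership and a binary atom the role of a property assertion.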
Semantic Query Enhanced Web Rule Language (SQWRL) is used to extract information from the ontology [15]. This query language supports SQL-like operators. It is used along with SWRL to extract the information inferred by SWRL rules [15]. We have also used SQWRL collections to construct queries that involve disjunction, which is otherwise not possible using SWRL. The left side of an SQWRL query takes the same form as an SWRL antecedent, and the right side, in place of the consequent, retrieves the data. sqwrl:select is the primary operator used in SQWRL. The SWRL inference rules are executed first, followed by the execution of the queries.
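The idea of selecting data after the rules have run, including a disjunction over classes, can be sketched in Python. This is not SQWRL and does not use any SQWRL API: the fact tuples, the class names ending in `_case`, the data property `hasAge`, and all values are assumptions chosen for illustration.

```python
# Hypothetical knowledge base after rule execution:
# 2-tuples are class memberships, 3-tuples are data property values.
facts = {
    ("Confirmed_case", "Patient12"), ("hasAge", "Patient12", 64),
    ("Probable_case", "Patient3"), ("hasAge", "Patient3", 41),
    ("Nonsuspected_case", "Patient2"), ("hasAge", "Patient2", 30),
}


def members(class_name: str) -> set:
    """Individuals asserted or inferred to belong to the class."""
    return {f[1] for f in facts if len(f) == 2 and f[0] == class_name}


def value(prop: str, subject: str):
    """Value of a data property for a subject, or None if absent."""
    for f in facts:
        if len(f) == 3 and f[0] == prop and f[1] == subject:
            return f[2]
    return None


def select(classes: list, prop: str) -> list:
    """Select (individual, value) rows for members of ANY listed class.

    Taking the union of the member sets plays the role that collections
    play when a query needs disjunction.
    """
    patients = set().union(*(members(c) for c in classes))
    return sorted((p, value(prop, p)) for p in patients)
```

Calling `select(["Confirmed_case", "Probable_case"], "hasAge")` returns one row per patient that is confirmed or probable, which mirrors a select over the union of two result sets.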

D. Construction of Rules and Queries
The proposed CDSS will have a separate online user interface for patients and several other interfaces for health workers. Accordingly, we divide the working of this ontology into two parts. The first part will be used by Module 1 (patient interface) of the proposed CDSS. OWL classes such as Background_history, RiskFactor, Symptom, Diagnosis, and Recommendation are used in this part. Based on the initial diagnosis, Module 1 provides patients at home with the necessary recommendations regarding their present health condition related to COVID-19. Module 2 of the proposed CDSS will use the concepts related to Clinical Diagnosis, Lab Test, and Treatment. It provides the required suggestions and recommendations to medical practitioners based on the results of the clinical diagnosis and the lab test values. In this article, we focus on the initial diagnosis part of the ontology.

1) Rules:
Rules are constructed to assign each patient case an initial diagnosis from the categories of the Diagnosis class: Nonsuspected, Suspected-Asymptomatic, Suspected-Symptomatic, Probable, and Confirmed (Fig. 5). The SWRL rules related to the initial diagnosis are presented in Table I. Case#1 represents cases that do not yet have any symptoms or background history. The corresponding rule is used to diagnose such non-suspected cases. The antecedent part of Rule 1 consists of four atoms. Each atom is expressed in the form of a predicate with arguments. The first predicate, Patient(?p), retrieves all the patients from the ontology and stores each value in the variable 'p'. The next predicate, has_symptom(?p, No_symptom), checks whether the patient has no symptoms. No_symptom is an individual of the same class. The predicate is_recommended_with(?p, Nonsuspected_case_recommendation) fetches the recommendation suggested by the system to such non-suspected patients. Such patients are also automatically categorized as instances (individuals) of the Nonsuspected_case class. This is the function of the predicate Nonsuspected_case(?p). When the patient is categorized as a non-suspected case, a corresponding SWRL rule provides the recommendation through the data property hasRecommendation.
Cases with no symptoms but with a suspected history of travel to infected areas or contact with COVID-19 cases are classified as Suspected_Asymptomatic cases (Case#2). If the predicates has_symptom(?p, No_symptom), hasContactWithCovid19Patient(?p, true), and hasTravelledToCovid19Area(?p, false) return true, then such patients are automatically categorized as instances (individuals) of the Asymptomatic class. The corresponding diagnosis and recommendation are also provided.
Patients with upper respiratory tract viral infection report non-specific, uncomplicated symptoms such as chills, fatigue, cough, and muscle pain. The WHO and Oman guidelines suggest that such cases should not be referred to hospital immediately. The system advises such patients to quarantine at home and, if the symptoms worsen, suggests making an appointment at the hospital for a PCR swab test. For example, in Table I, Case#3 represents a suspected symptomatic case, and the given SWRL rule checks for patients who report mild symptoms. As SWRL does not support disjunction of atoms, we have written separate rules to check the occurrence of each symptom. We did not use the OWL class union operator, which supports disjunction. All the mild symptoms are included in the ontology so that all possible symptomatic cases are checked.
As per the MOH guidelines, a person suffering from high fever, shortness of breath, and chest discomfort should report to the hospital immediately for admission and further treatment. These cases are considered probable cases of COVID-19. An SWRL rule for the probable case is given in Table I (Case#4). Probable cases of COVID-19 are advised to take an RT-PCR test immediately. Case#5 shows the SWRL rule for the confirmed case. Table II presents the recommendations suggested by the system. For each case, the corresponding SWRL rules are added to the ontology.
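The five initial-diagnosis cases described above can be sketched as an ordinary Python function. This is not a transcription of the SWRL rules in Table I, which are not reproduced here. Several points are assumptions: the field names, the use of a positive PCR result as the criterion for a confirmed case, the reading that any one severe symptom is enough for a probable case, and the order in which the cases are checked.

```python
from dataclasses import dataclass, field

# Symptom groups named in the text; the identifiers are illustrative.
MILD = {"chills", "fatigue", "cough", "muscle_pain"}
SEVERE = {"high_fever", "shortness_of_breath", "chest_discomfort"}


@dataclass
class Patient:
    symptoms: set = field(default_factory=set)
    contact_with_case: bool = False
    travelled_to_affected_area: bool = False
    pcr_positive: bool = False  # assumed criterion for a confirmed case


def diagnose(p: Patient) -> str:
    """Return the initial diagnosis category for a patient."""
    if p.pcr_positive:
        return "Confirmed"
    if p.symptoms & SEVERE:  # assumption: any severe symptom suffices
        return "Probable"
    if p.symptoms & MILD:
        return "Suspected-Symptomatic"
    if p.contact_with_case or p.travelled_to_affected_area:
        return "Suspected-Asymptomatic"
    return "Nonsuspected"
```

The set intersections express the disjunction over symptoms in a single test, whereas the SWRL version needs one rule per symptom because a rule body is a pure conjunction of atoms.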
2) Querying the ontology:
After constructing the SWRL rules to diagnose different patient cases, SQWRL is used to construct queries that retrieve the relevant data. For example, executing the SWRL rule for Case#5 in Table I classifies all instances of confirmed cases under the confirmed-case class of the ontology. The data related to all such patients can be retrieved using the SQWRL query in Table III. The query reads four data properties of each patient, and the sqwrl:select operator is used to retrieve this information for patients confirmed with COVID-19. Similarly, we constructed several queries to retrieve information about patients with one or more risk factors, symptoms, etc.

Table II. Recommendations suggested by the system

Non-suspected case: Your inputs suggest that at present you are safe. In case any new symptoms develop, revisit our Symptom Checker to get recommendations.

Suspected Asymptomatic case: Recommended home quarantine for 14 days. In case any new symptoms develop, revisit our Symptom Checker to get recommendations.

Suspected Symptomatic case: Recommended home quarantine for 14 days. In between, if symptoms develop, take an appointment in the nearest hospital and do PCR swab test.

Probable case: Immediately proceed to the nearest hospital and do PCR swab test.

Confirmed case: Strict home quarantine is advised, if symptoms develop, immediately proceed to the emergency.

IV. IMPLEMENTATION AND RESULTS
This section describes the implementation of the knowledge base and the results of reasoning over its semantics. Protégé 5.5.0 is used to implement the ontology [16]. Reasoners play a critical role in interpreting the semantics of ontologies and instances. Explicit facts are asserted directly in the knowledge base through properties. The semantics of these facts are interpreted by the inference mechanism of the knowledge base. Reasoners perform the inference and extract additional information from the knowledge base. Here, rule-based reasoning is used, in which the reasoner interprets the logical rules together with the asserted facts in the knowledge base to extract new information.
The forward chaining inference method is used here to add all the implied facts to the knowledge base. The reasoner works in a forward fashion: it considers the facts and the rules in the knowledge base and infers the new facts they imply. Whenever a new fact is inferred, it may lead to the inference of other facts. The Pellet reasoner is used to reason over the knowledge base. Fig. 6 shows the inference results based on the SWRL rules given in Table I, and Fig. 7 displays the results of the query given in Table III.
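Forward chaining can be sketched in a few lines of Python. This is a naive fixed-point loop for illustration and not how Pellet works internally. The three rules are simplified stand-ins for the non-suspected case, written with a single variable `?p`, and the recommendation text is taken from Table II.

```python
def bind(atom: tuple, individual: str) -> tuple:
    """Replace the variable ?p in an atom with a concrete individual."""
    return tuple(individual if term == "?p" else term for term in atom)


def forward_chain(facts: set, rules: list) -> set:
    """Apply the rules repeatedly until no new fact can be inferred."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        individuals = {fact[1] for fact in known}
        for body, head in rules:
            for individual in individuals:
                if all(bind(atom, individual) in known for atom in body):
                    new_fact = bind(head, individual)
                    if new_fact not in known:
                        known.add(new_fact)
                        changed = True
    return known


SAFE = "Your inputs suggest that at present you are safe."

# Each rule is (body atoms, head atom). The rules are listed in reverse
# order of dependency to show that several passes are needed.
rules = [
    ([("is_recommended_with", "?p", "Nonsuspected_case_recommendation")],
     ("hasRecommendation", "?p", SAFE)),
    ([("Nonsuspected_case", "?p")],
     ("is_recommended_with", "?p", "Nonsuspected_case_recommendation")),
    ([("Patient", "?p"), ("has_symptom", "?p", "No_symptom")],
     ("Nonsuspected_case", "?p")),
]
asserted = {("Patient", "Patient2"), ("has_symptom", "Patient2", "No_symptom")}
inferred = forward_chain(asserted, rules)
```

Starting from two asserted facts, the loop infers the class membership first, then the recommendation link, then the recommendation text, which illustrates how one inferred fact triggers the next.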

V. DISCUSSION
This section discusses the results of the reasoning process that infers the recommendations for the different types of patient cases. The conditions necessary to infer the patient cases were written in SWRL, as explained in the previous subsections. As shown in Fig. 8, the Patient2 instance is created and the values of its object and data properties are asserted.
As per the SWRL rules given in Table I, the reasoner initially infers two facts for this patient instance. The first is the diagnosis of the case as a non-suspected one. As a result, the instance is automatically inferred under the Non_suspected_case class using the object property is_diagnosed_with [Fig. 6a]. The second is the corresponding recommendation for this type of case. The property is_recommended_with is used to assign the recommendation [Fig. 6b]. Adding this inferred fact to the knowledge base leads to the automatic inference of another fact, the final recommendation, given as the value of the hasRecommendation property [Fig. 6b].
Similarly, the inference results for suspected asymptomatic cases are shown in Fig. 6(c) and (d). The instance Patient5 is automatically inferred under the Asymptomatic subclass. The inference results for suspected symptomatic cases are shown in Fig. 6(e) and (f). The instance Patient1 is automatically inferred under the Symptomatic subclass. Fig. 6(g) and (h) show the inference results for probable cases. The instance Patient3 is automatically inferred under the Probable class. The confirmed case inference is shown in Fig. 6(i) and (j). The instance Patient12 is automatically inferred under the Confirmed class. The final recommendations are also shown for each of the above cases.

VI. PERFORMANCE METRICS
In this section, we report the performance of the ontology in terms of reasoning time and query execution time. The reasoning task of classifying the ontology was done using the Pellet reasoner. Six SQWRL queries corresponding to different categories of patients were executed on a machine with 8 GB of RAM and an i7 processor. Fig. 9 shows the execution time of the different queries in Protégé. The developed ontology was loaded with some input data. Pellet, an OWL 2 DL reasoner, was then used to reason over the ontology. The ontology was processed by Pellet in 77 ms. Then, from the SQWRLTab, each of the six queries was selected and run. The total execution time of the six queries was 6488 ms, and the average execution time was calculated as 1297.6 ms. This shows that the reasoning time and the query execution time are optimal.
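The averaging step can be checked with a short Python sketch. Only the total of 6488 ms comes from the text; the helper name is hypothetical. The reported average of 1297.6 ms matches the total divided by five, whereas dividing by the six queries mentioned would give about 1081.3 ms. The text alone does not settle which query count underlies the reported average.

```python
def mean_ms(total_ms: float, query_count: int) -> float:
    """Average execution time per query, in milliseconds."""
    return total_ms / query_count


TOTAL_MS = 6488  # total query execution time reported in the text

# Compare the two candidate query counts.
for count in (5, 6):
    print(count, "queries:", round(mean_ms(TOTAL_MS, count), 1), "ms")
```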

VII. CONCLUSION AND FUTURE
Biomedical ontologies play a significant role in the design and development of CDSS. In this paper, we have presented the development of an ontology (classes, properties, rules, etc.) to represent the domain concepts of SARS-CoV-2 (COVID-19). We have presented the initial part of the ontology, which categorizes a person into one of the categories non-suspected, suspected symptomatic, suspected asymptomatic, probable, and confirmed. SWRL rules are constructed to check the conditions necessary for classifying patients into these categories. We constructed several queries using SQWRL to retrieve the information stored in the ontology. The ontology is intended for use in a CDSS, currently under development, that will support the medical practitioners of the ROP satellite clinics in diagnosing COVID-19 in the Sultanate of Oman. Future work is twofold: (a) to present the full ontological concepts related to Clinical Diagnosis, Lab Test, and Treatment, and (b) to design and develop the above-mentioned CDSS.