Annotated Corpus with Negation and Speculation in Arabic Review Domain: NSAR

Abstract: Negation and speculation detection are critical for Natural Language Processing (NLP) tasks such as sentiment analysis, information retrieval, and machine translation. This paper presents the first Arabic corpus in the review domain annotated with negation and speculation. The Negation and Speculation Arabic Review (NSAR) corpus consists of 3K randomly selected review sentences from three well-known and benchmarked Arabic corpora. It contains reviews from different categories, including books, hotels, restaurants, and other products, written in various Arabic dialects. The negation and speculation keywords have been annotated along with their linguistic scope, based on annotation guidelines reviewed by an expert linguist. The inter-annotator agreement between two independent annotators, both native Arabic speakers, is measured using Cohen's Kappa coefficient, with values of 0.95 and 0.80 for negation and speculation, respectively. Furthermore, 29% of the corpus sentences include at least one negation instance, while only 4% contain speculative content; Arabic reviews therefore rely on negation structures far more than on speculation. This corpus will be available to the Arabic research community to handle these critical phenomena (footnote 1).


I. INTRODUCTION
Negation and speculation are commonly used linguistic phenomena that provide information on factuality and the polarity of facts [1]. Negation is a linguistic property shared by all human languages [2]; it denotes the absence of something and therefore affects the contextual polarity of words. Speculative language, in contrast, conveys uncertainty about an event or idea: the text does not contain enough evidence to establish whether the information is true. Consequently, sentences that include negation or speculation may cause opinionated phrases to be misclassified [3] or medical terms to be identified inaccurately [4], [5]. To identify instances of these phenomena efficiently, it is necessary to find the words that express negation and speculation (the cues) and then their scope, that is, the tokens within the sentence that are affected by these cues [6]. Since negation and speculation are language-dependent, they must be addressed in all natural languages [7]. Therefore, many studies have addressed them in various languages to enhance the performance of Natural Language Processing (NLP) tasks and applications such as Sentiment Analysis (SA) [8], Machine Translation (MT) [9], and Information Extraction (IE) [5]. These studies addressed negation and speculation scope detection using rule-based [10] and sophisticated supervised learning methods [11], [12].
1 https://github.com/amahany/NSAR
Arabic Natural Language Processing (ANLP) has gained unprecedented interest in the age of big data and social media platforms, making it one of the most important research topics, especially in North Africa and the Gulf area [13]. Classical Arabic (CA), Modern Standard Arabic (MSA), and Dialectal Arabic (DA) are the three primary forms of Arabic [14]. The Qur'an and ancient literature are written in CA. MSA is mainly used in education, official written material such as newspapers, and formal TV programs. In contrast, DA covers all current forms of spoken Arabic, which are also written on social media platforms and on review applications and websites, and it varies nationally and internationally depending on location [15]. Since DA has no standardized syntactic rules and allows multiple forms of the same word, ANLP tasks are challenging.
Negation occurs frequently in Arabic and is one of the dominant linguistic means of changing text polarity, so negation detection receives considerable attention in Arabic Sentiment Analysis (ASA) [3]. However, the presence of negation words in a sentence does not imply that all sentiment words are inverted, and there are odd cases where the presence of negation terms may even confirm the polarity of the following lexeme [16]. In the implicit form of negation, a sentence can be negated without using any negation words. The level of speculative content likewise increases or decreases the certainty of polarity classification [17]. A few Arabic studies have addressed the impact of negation and speculation using simple rules. Hamouda and El-Taher used the frequency of negation terms as a classification feature in the ASA task, but the effect of this feature on sentiment classification was not clearly reported [18]. In 2015, Duwairi and Alshboul defined six handcrafted rules to handle negation in MSA texts in the review domain to enhance ASA performance [19]. Even though they addressed MSA, which follows well-defined rules, this simplistic approach has proven inadequate for a syntactically and morphologically rich language like Arabic. El-Naggar et al. considered several valences to build a negation-aware classifier for ASA in MSA and the Egyptian dialect [20]. Later, Assiri et al. formulated four rules to handle negation in the Saudi dialect [21]. In addition, Kaddoura et al. proposed a system that inverts the polarity of a sentence's clause if a negation term precedes a positive or negative pattern [3]. Despite the performance improvements reported in these systems' experimental results [3], [20], [21], none of them handled the implicit form of negation that is frequently used in Arabic.
Simple rule-based algorithms cannot handle all the negation and speculation cases across the various Arabic language forms and dialects [14]. According to the findings of our earlier work, treating negation scope detection with supervised learning is promising [12]. To the best of our knowledge, there are no available Arabic corpora annotated with negation or speculation in any domain, including the review, newswire, and medical domains. Furthermore, speculation detection in ASA has not been studied in any research work.
In the last decade, there has been a growing interest in detecting negation and speculation. Nevertheless, the available open-access corpora for low-resource languages, such as Arabic [22], are limited compared to those for English and Spanish [7]. Speculation corpora are even scarcer than negation corpora, and the majority focus on the biomedical domain. Since negation and speculation are language-dependent phenomena, negation- and speculation-aware models built for other languages, such as English, cannot be applied to Arabic text, because the syntactic structure of negation in Arabic differs from that in English. Therefore, an annotated corpus with negation and speculation for the Arabic review domain is required. This matters because negation- and speculation-aware components improve overall system performance [9], [11].
The rest of the paper is organized as follows: Section II describes the different sources for our corpus. Section III details the annotation guidelines we built for negation and speculation in the Arabic review domain. The annotation process and its results, including the inter-annotator agreement analysis and the discussion, are presented in Sections IV and V. Finally, Section VI concludes the paper and suggests future work.

II. CORPUS COLLECTION
This section describes the overall characteristics of the Negation and Speculation Arabic Review (NSAR) corpus and briefly describes the texts that comprise it. Furthermore, general statistics are presented regarding each source's size and polarity distribution. The NSAR corpus is composed of texts extracted from three well-established and benchmarked Arabic review corpora: Large Scale Arabic Book Review (LABR) [23], Large Arabic Multi-domain Resources (LAMR) [24], and Multi-domain Arabic Sentiment Corpus (MASC) [25]. Table I shows the distribution of randomly selected positive and negative sentences from each source, with 2,312 positive reviews accounting for approximately 77% of our corpus. Each topic has a different number of sentences, but the average number of words per sentence is nearly the same.
The LABR corpus contains 63K book reviews, with ratings ranging from 1 to 5 stars [23]. Aly and Atiya treated reviews with 4 or 5 stars as positive and those with 1 or 2 stars as negative. The authors collected these reviews from the best Arabic books listed in a social network for book readers; hence, most of the randomly selected reviews are positive, as per Table I.
The LAMR corpus is the second source for the NSAR corpus. It consists of 33K reviews scraped with the Scrapy framework from various reviewing websites (Souq, TripAdvisor, Elcinema, and Qaym) and covers various items and services [24]. Each entry includes the review text and a normalized rating that can be positive, negative, or mixed.
The third source, MASC [25], includes 8,860 reviews on different topics, such as shopping, restaurants, and software applications, written in multiple Arabic dialects. These reviews were obtained primarily from Jeera, Qaym, Google Play, Twitter, and Facebook.
The majority of the reviews from LAMR and MASC were written in the Egyptian and Gulf dialects. In contrast, most of the LABR samples were written in MSA. The review texts in the NSAR corpus are collected from various sources to ensure that the corpus captures the diversity of dialectal language usage in the review domain.

III. ANNOTATION GUIDELINES
Negation and speculation phenomena are interrelated and have similar characteristics: both have a scope, so they affect the part of the text delimited by the presence of negating or speculative keywords. Furthermore, both come in two types: explicit and implicit. In the explicit type, the phenomenon cue is written in the sentence, whereas in the implicit type the phenomenon is understood without any cue. A sentence that includes a negation cue is not necessarily annotated for negation; it may instead have speculative content. Therefore, the annotators should read sentences containing negation cues carefully. In most cases, the scope of a keyword runs from the keyword leftward (Arabic is written from right to left) to the end of the clause or the sentence.
The following subsections list the general principles and the negation and speculation annotation guidelines. Furthermore, the special or complex cases for both phenomena are demonstrated. In the examples that illustrate the annotation guidelines, the negating cues are surrounded by a negation symbol (¬), the speculative cues are surrounded by an uncertainty symbol (∓), and their scope boundaries are surrounded by parentheses.

A. General
When annotating negation and speculation, several general rules must be followed. They are adapted from the BioScope annotation guidelines [6] and then modified for the Arabic language and the review domain. Only sentences with at least one instance of negative or speculative language are considered. In addition, the min-max strategy should be followed during the annotation. The minimal unit (a single word) that expresses the negation or speculation is marked as a cue. Nevertheless, in some cases a cue may include more than a single word, which is called a complex cue. The maximum number of words affected by a cue is marked as the scope of the negation or speculation. The scope usually starts after the keyword and ends at the end of the phrase, clause, or sentence. However, the scope may include a word or a statement preceding the cue. The list below summarizes the general rules for both negation and speculation:
• A sentence may contain more than one cue; in this situation, each cue should be annotated separately.
• Structures of negation and speculation can be annotated in a single sentence.
• The cue is not included in the scope, but it may be included in complex cases and in the scope that includes words preceding and following a cue.
• If a sentence contains a cue that appears at the end of the sentence, the phenomenon's scope is limited to the cue.
• Due to the improper use of spaces in the informal Arabic text, a cue+verb/noun may be concatenated without a space; in this case, the verb/noun will be included in the negation/speculation cue.
• The coordinating conjunction و (and) extends the scope.
• If an annotator is unsure about the scope, the annotator annotates only the cue and leaves the scope for the linguist expert.
• An annotation element called 'undecided' is used when the annotator is unsure which type should be assigned to a keyword.
Additionally, each type of negation or speculation structure is illustrated with an example; the transliterations and English translations of these examples are listed in Appendix I.
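The default scope rules above (the scope usually starts after the cue and runs to the end of the clause or sentence, and a sentence-final cue has a scope limited to the cue itself) can be sketched as a small baseline heuristic. This is only an illustration, not the annotation procedure used for NSAR, which was manual. The function name `default_scope`, the `BOUNDARY_TOKENS` set, and the English placeholder tokens are assumptions introduced here for readability.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical set of tokens that close a clause or sentence.
BOUNDARY_TOKENS: FrozenSet[str] = frozenset({".", "!", "?", "؟"})


def default_scope(
    tokens: List[str],
    cue_index: int,
    boundaries: FrozenSet[str] = BOUNDARY_TOKENS,
) -> Tuple[int, int]:
    """Return the default scope as (start, end) token indices, end exclusive.

    The scope starts after the cue and extends to the next boundary token
    or to the end of the sentence. If nothing follows the cue, the scope
    is limited to the cue itself.
    """
    if not 0 <= cue_index < len(tokens):
        raise IndexError("cue_index is outside the sentence")

    # Walk forward from the token after the cue until a boundary is found.
    end = cue_index + 1
    while end < len(tokens) and tokens[end] not in boundaries:
        end += 1

    # No token between the cue and the boundary: scope is the cue only.
    if end == cue_index + 1:
        return (cue_index, cue_index + 1)
    return (cue_index + 1, end)


# English placeholder sentence; "not" plays the role of the negation cue.
sentence = ["the", "room", "is", "not", "clean", "and", "quiet", "."]
start, end = default_scope(sentence, 3)
scope_tokens = sentence[start:end]  # tokens after the cue, up to the period
```

Because the heuristic does not stop at the conjunction, the scope continues across "and", which mirrors the rule that the coordinating conjunction extends the scope. Cases where the scope includes words preceding the cue are not covered by this sketch.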

B. Negation Structures
• ﻻ (no) is the most used negating Arabic word; it denies the occurrence of a verb in the past and present tenses and also negates a nominal sentence. Therefore, the scope begins with the negative cue and ends at the end of the sentence.
• The cue ﻻت (no) is used only in the classical Arabic form.
• إن (Inn) affects nouns and past or present verbs. إن counts as a negating cue when it can be replaced with another negation cue and reverses the polarity. There is a distinction between إن and أن (Ann), where أن is not a negating cue. Furthermore, both إن and أن may be written ان without ھﻤﺰة (Hamza) in dialectal Arabic. Therefore, it is necessary to read the sentences carefully to determine the correct form in accordance with the context of the sentence.
• ‫ﻟﻦ‬ (will not) is used with a verb in the present tense to deny something in the future.
ﯾﺸﻚ In addition, the noun forms of some of these verbs, like -ﻟﻮ, indicate speculative content.

13) ﻓﻨﺪق ﻣﺮﯾﺢ ﺻﺮاﺣﺔ اﺳﻌﺎرة ﺟﯿﺪة
• Conjunction keywords such as أو (or) take as their scope the elements on both the right and the left side of the conjunction. However, when the conjunction is composed of two or more words, like إﻣﺎ أو (Or) and ﺳﻮاء (Whether), the scope does not change.
• If the speculation cue is present at the start of the sentence, then the scope extends to include the whole sentence.

D. Negation Complex Cases
The presence of a negation keyword does not automatically negate a sentence, as in the following case:
• For example, إن may be used to assure something.
16)
In some other cases, the negation is implied in the sentence without any negating cue and is understood from the context of the text.
• The sentence implies denial without any negative cues such as

E. Speculation Complex Cases
Certain speculation cases are marked using few keywords.

29)
• In some cases, speculation cues may be used to imply an affirmation.

IV. NSAR ANNOTATION
This section describes the procedure followed in annotating the NSAR corpus. Initially, the guidelines were created based on the negation rules of formal Arabic, together with the slang negating cues commonly used in the Egyptian and Gulf dialects. Then, a list of Arabic keywords that indicate speculative content was built. Subsequently, these rules were applied to annotate a sample of the corpus, and any additional cases found in the sample were used to refine the rules for the full annotation process.
An annotation tool is needed to build and develop the NSAR corpus, and many tools are available for this purpose. Based on the evaluation of well-known annotation tools in [26], WebAnno (footnote 9), which achieved the highest score [27], is selected. WebAnno is an open-source web-based annotation tool that provides full functionality for both semantic and syntactic annotations. Furthermore, it supports adding user-defined annotation layers, as we did for negation and speculation. User-defined layers are only supported in the TSV3 format, for which an open-source Python library (footnote 10) is available to extract the annotations. As described in Section II, the NSAR corpus is collected from three different Arabic corpora from the review domain, labeled as positive or negative and stored in CSV format. Therefore, we transformed the input files from CSV to TSV format.
Five user-defined labels are created in the WebAnno project: sentiment, negation, speculation, bad, and undecided. The sentiment label has one feature called 'polarity' with the values 'negative' or 'positive', which is filled in during the transformation from CSV to TSV. The negation and speculation labels each have a tag set with two values, 'cue' and 'scope', which are linked to each other using two user-defined relations, 'NegRel' and 'SpecRel'. The other two labels are used to highlight inappropriate or hateful content in the text ('bad') and to mark sentences on which the annotator cannot take a decision ('undecided').
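To make the layer design concrete, the five labels and the two cue-to-scope relations can be modeled as plain data structures. The sketch below is an assumption-laden illustration: the class names (`Span`, `Relation`, `AnnotatedSentence`), the token-index representation, and the `add_relation` helper are hypothetical and are not part of WebAnno or of the NSAR release; only the label, tag, and relation names come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    start: int  # index of the first token in the span
    end: int    # index one past the last token (exclusive)
    tag: str    # "cue" or "scope", as in the tag set described above


@dataclass(frozen=True)
class Relation:
    name: str   # "NegRel" or "SpecRel"
    cue: Span
    scope: Span


@dataclass
class AnnotatedSentence:
    tokens: List[str]
    polarity: str                      # "positive" or "negative"
    negations: List[Relation] = field(default_factory=list)
    speculations: List[Relation] = field(default_factory=list)
    bad: bool = False                  # inappropriate or hateful content
    undecided: bool = False            # annotator could not decide

    def add_relation(
        self, layer: str, cue: Tuple[int, int], scope: Tuple[int, int]
    ) -> Relation:
        """Link one cue span to one scope span on the given layer."""
        if layer not in ("negation", "speculation"):
            raise ValueError("layer must be 'negation' or 'speculation'")
        name = "NegRel" if layer == "negation" else "SpecRel"
        relation = Relation(name, Span(*cue, tag="cue"), Span(*scope, tag="scope"))
        target = self.negations if layer == "negation" else self.speculations
        target.append(relation)
        return relation


# English placeholder sentence with one negation structure.
example = AnnotatedSentence(
    tokens=["the", "room", "is", "not", "clean"], polarity="negative"
)
rel = example.add_relation("negation", cue=(3, 4), scope=(4, 5))
```

Keeping each cue paired with its scope in an explicit relation object reflects the reason the relation layer exists: a single sentence may carry several negation or speculation structures, and each cue must stay attached to its own scope.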
The annotation process was carried out in three phases. In the first phase, the annotation guidelines were explained and the annotators were trained to use WebAnno. In the second phase, the annotators performed the annotation so that the inter-annotator agreement (IAA) could be measured. In the final phase, a linguist expert resolved the disagreements between them. Two independent native Arabic speakers carried out this process: one is an experienced annotator with a solid background, and the second is a well-trained person. Each file was annotated by both annotators.

V. RESULTS AND DISCUSSION
In this section, we explore the results of the annotation process. Cohen's Kappa coefficient [28] is used to measure the quality of the annotation process. We obtained a Cohen's Kappa of 0.95 for negation and 0.80 for speculation. These values indicate that speculation is more difficult to annotate than negation in Arabic. Table II summarizes the NSAR corpus, which includes 862 negated sentences out of 3,011 and only 121 sentences containing at least one speculative structure.
9 https://webanno.github.io/webanno/
10 https://github.com/neuged/webanno_tsv
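For reference, Cohen's Kappa compares the observed agreement p_o between two annotators with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from two parallel label sequences. The unit of comparison (for example, one label per sentence or per token) is not specified in the text, so the toy labels here are purely illustrative, and the function name `cohen_kappa` is an assumption.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's Kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the annotators' marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Both annotators used one identical label throughout: perfect agreement.
    if p_e == 1.0:
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy labels, not taken from the corpus.
annotator_1 = ["neg", "neg", "none", "none"]
annotator_2 = ["neg", "none", "none", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on three of four items (p_o = 0.75) and the chance agreement is p_e = 0.5, so the coefficient is 0.5, which shows how Kappa discounts agreement that would be expected from the label distributions alone.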
The disagreements between the two annotators were revised by a linguist expert [6]. The majority of disagreement cases in negation are caused by common human errors, such as one of the annotators forgetting to relate the negation cue to its scope using the relation layer. Since a single sentence may contain multiple negation structures [29], this layer was added and should be specified for each annotation. The speculation cases, in contrast, are ambiguous and may lead the annotator to label the same structure as either negation or speculation [7]. Therefore, speculation had a higher level of disagreement than negation. These cases typically involve an issue within the scope of the speculation, such as the non-inclusion of a word. The disagreements, together with the cases marked with the undecided label, were curated by the first author and the linguist expert. Table II shows that 29% and 4% of the total sentences have at least one negation structure and at least one speculation structure, respectively; however, these percentages vary from topic to topic. For instance, the MASC sub-corpus includes high rates of negating and speculative content.
The subject type in an Arabic sentence changes the form of most Arabic words, such as the verbs ذھﺐ (He went) and ذھﺒﺖ (She went). There are also various forms of negation in Arabic that have the same meaning in English. For example, negation differs between MSA and the Egyptian dialect: ﻣﻠﻜﺸﻰ in the Egyptian dialect is derived from ﻻ ﻟﻚ ﺷﻰء or ﻟﯿﺲ ﻟﻚ ﺷﻰء in MSA, and all of them mean (you do not own anything). As another example, ﻣﻜﻨﺘﺶ in the Egyptian dialect, which is derived from ﻟﻢ ﺗﻜﻦ or ﻟﻢ أﻛﻦ in MSA, means (I do not + verb) or (She does not + verb) according to the context. However, removing a single character from this word, as in ﻣﻜﻨﺶ, changes the meaning to (He does not + verb). These examples demonstrate the complexity of negation in Arabic, especially in dialectal Arabic.
Furthermore, spelling rules are not followed in dialectal Arabic, resulting in tokenization issues such as in اﻟﻜﺘﺎﺑﺔﻻﺗﻈﮭﺮ (The written text does not appear) [3], where the three words are written without the spaces that formal spelling requires. Dialectal Arabic also has different written forms for the same word with the same meaning, as in ﻣﺎﻓﯿﺶ and ﻣﻔﯿﺶ (non-existence). Therefore, we normalized the commonly used negation and speculation cues, as depicted in Table III and Table IV. The negator ﻻ and the speculative cue ﻟﻮ each account for approximately 45% of the negation and speculation cues, respectively.
www.ijacsa.thesai.org
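The cue normalization mentioned above can be pictured as a lookup from spelling variants to one canonical form. The sketch below is a minimal illustration under stated assumptions: the mapping contains only the single variant pair named in the text, the choice of which spelling is canonical is hypothetical (Tables III and IV are the authoritative source), and the names `CUE_VARIANTS`, `normalize_cue`, and `normalize_tokens` are introduced here.

```python
from typing import Dict, List

# Hypothetical variant -> canonical mapping; the canonical choice is an
# assumption made for illustration, not taken from Table III or Table IV.
CUE_VARIANTS: Dict[str, str] = {
    "مفيش": "مافيش",
}


def normalize_cue(token: str, variants: Dict[str, str] = CUE_VARIANTS) -> str:
    """Return the canonical form of a cue, or the token unchanged."""
    return variants.get(token, token)


def normalize_tokens(tokens: List[str]) -> List[str]:
    """Normalize every token of a sentence with the default mapping."""
    return [normalize_cue(token) for token in tokens]


# Look up the one pair defined above and normalize a two-token sentence.
variant, canonical = next(iter(CUE_VARIANTS.items()))
normalized = normalize_tokens([variant, "placeholder"])
```

A lookup of this kind only handles whole-token variants. Cues concatenated with a following verb or noun, as in the tokenization example above, would need a separate segmentation step that this sketch does not attempt.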
Table V displays the average, minimum, and maximum scope lengths for both negation and speculation for each topic. For the negation scope, the minimum and average scope lengths are nearly identical across topics, but there is a notable variation in the maximum scope length. In particular, negation in the books and software topics tends to cover the longest part of the sentence. Table V also shows that speculation scopes are longer than negation scopes, because speculation structures usually affect the whole sentence, as described in the annotation guidelines.
Table VI presents the distribution of negated and speculated sentences based on the overall polarity of the sentence. On average, negation structures occur in as many positive sentences as negative ones. Nonetheless, in the software topic there are more negation cases with negative polarity than with positive polarity. In addition, speculative content in positive sentences accounts for 66% of the speculation cases in the corpus, as it forms the majority in the books and software topics. According to our observation, the books topic includes most of the negation and speculation cases, which are typically used to cancel something negative about the books. Furthermore, most of the software advantages or features are negated or speculated.
Fig. 1 and Fig. 2 show the number of negation cases in each sentence within the three sub-corpora. The number of negated sentences that include more than two negation scopes in one sentence is 173, accounting for 20% of the negation cases in the NSAR corpus. However, there are only three sentences with two speculation scopes. This finding further supports the observation that speculative content in the review domain covers the entire sentence as well as its polarity.
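Per-topic scope statistics of the kind reported in Table V (average, minimum, and maximum scope length) can be computed along the following lines. This is a sketch only: the function name `scope_length_stats`, the (topic, length) input format, and the numbers in the usage example are assumptions for illustration and are not values from the corpus.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def scope_length_stats(
    scopes: Iterable[Tuple[str, int]]
) -> Dict[str, Dict[str, float]]:
    """Group scope lengths (in tokens) by topic and summarize each group."""
    by_topic: Dict[str, List[int]] = {}
    for topic, length in scopes:
        by_topic.setdefault(topic, []).append(length)

    return {
        topic: {"avg": mean(lengths), "min": min(lengths), "max": max(lengths)}
        for topic, lengths in by_topic.items()
    }


# Made-up scope lengths, one (topic, number of tokens in scope) pair per scope.
toy_scopes = [("books", 3), ("books", 9), ("software", 4)]
stats = scope_length_stats(toy_scopes)
```

Running the same function separately on negation scopes and on speculation scopes would give the two halves of such a table and make the difference in typical scope length directly comparable.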

VI. CONCLUSION AND FUTURE WORK
DA texts are used in people's day-to-day conversations on social media platforms and review websites. Many research groups have worked on the sentiment analysis task, and some of them considered negation as a linguistic feature and highlighted its significance using simple rules. However, researchers still face challenges in addressing the various structures of negation as well as speculation. This paper presented the first Arabic corpus in the review domain annotated with negation and speculation (NSAR) to tackle these challenges using supervised learning techniques. The corpus was annotated by two native Arabic speakers who adhered to strict annotation guidelines that were reviewed by a linguist expert. Cohen's Kappa coefficient was used to measure annotator agreement, yielding 0.95 and 0.80 for negation and speculation, respectively. These results suggest that the annotation guidelines were written clearly. NSAR will be made available and will contribute to the detection of negation and speculation, as well as to the sentiment analysis task. Future work includes extending the corpus by annotating the event element as well as the negation focus. In addition, we plan to apply recent deep learning techniques to this corpus to study the impact of negation and speculation on various ANLP tasks.