Feature-Based Sentiment Analysis for Arabic Language

With the spread of e-commerce and e-marketing, and the huge number of reviews and texts that people write to share their views on products, it has become necessary to extract these opinions automatically and to analyze the sentiments of the reviewers. The goal is to produce reports that evaluate products at a glance and help improve services. Sentiment analysis is a relatively recent field of study that deals with processing natural-language texts published on web sites and social networks. However, processing texts written in Arabic is a challenge for specialists because people do not rely on Standard Arabic; they write in spoken/colloquial language and use various dialects. This paper presents feature-based sentiment analysis for the Arabic language, a text analysis technique that breaks text down into aspects (attributes or components of a product or service) and then assigns each one a sentiment level (positive, negative or neutral).

Keywords: Sentiment analysis; feature-based; colloquial Arabic; opinion mining; natural language processing


I. INTRODUCTION
Sentiment analysis has been an active research area since 2003 [1]. It refers to the process of mining texts in order to identify the tone of a passage written by reviewers [2]. Decision makers focus on these tones to assess customer satisfaction with their products, and the tones have been categorized into different polarities. In many studies, such as [3], [4] and others, the most significant polarities were usually three: positive, negative, and neutral. Sentiment analysis, also called opinion mining, is the computational study of people's opinions, sentiments, and attitudes about topics, entities, people and events, as expressed in texts [5].
Recently, the number of internet users has increased significantly in the Middle East, and people are becoming more and more interested in buying online. According to recent statistics [7] from the Arabic Network for Human Rights Information, the number of internet users in the Arab countries has reached 157 million. Internet buyers in the Middle East are distributed across several countries: 10.6 million in Saudi Arabia, 6.8 million in the UAE, 2.4 million in Kuwait and 15.2 million in Egypt, with other Arab countries at different rates. The mobile phone is also the best-selling product online in the Arab world, according to the director of Souq (source: payfort) [7].
The Arabic language is one of the fastest growing languages on the web [6]. The main challenge in this study is that sentiment analysis is performed for Arabic, a language for which little work has been done in this area, in addition to the peculiarities of Arabic, both in its Standard form and in the diversity of its dialects. Arabic is a Semitic language with an alphabet of 28 letters. It is a cursive language, in which words are formed by connecting letters to each other. In contrast to English, Arabic is written from right to left and has no capitalization [6].
Humans can easily read texts and recognize a reviewer's sentiment by understanding the context, but for computers this is not a straightforward process. Therefore, the main task in this study is to make computers recognize the reviewer's sentiment, which is achieved through Natural Language Processing (NLP). NLP is a framework that supports interaction between computers and human languages [8].
In this paper, given the market need in the Arab world and the lack of Arabic studies in this field despite the wide spread of Arabic texts on the web written in various nonstandard dialects, it was necessary to fill the gap and present an approach in this field. Since mobile phones are the best-selling products, they are the focus of this study. The study proposes a method for recognizing Arabic sentiment phrases about mobile phones, taking into account each feature of the phone, such as camera, battery, memory, etc. Opinion phrases are identified by building a grammatical analyzer that defines several forms for these phrases. The grammatical analyzer needs a lexical analyzer as input to define opinion tokens. Opinion tokens can be mobile features, entity names and opinion words; the opinion words can be positive nouns, negative nouns, positive verbs, negative verbs, positive adjectives, negative adjectives, modifiers and negation words. This process is a form of Part-of-Speech (POS) tagging, which is presented in this study. POS tagging has long been used in text classification and NLP. It differentiates the syntactic role of words in a sentence by using specific tags, such as tags for noun, pronoun, verb, adjective, adverb, conjunction and others [8].
Also, after identifying opinion phrases, the study classifies each opinion into one of five polarities in the range [-2, 2]: {strong positive, positive, neutral, negative, strong negative}. Finally, summarization is necessary in order for decision makers to gain knowledge.
www.ijacsa.thesai.org
The rest of this paper is organized as follows. Section 2 overviews related work. Section 3 describes the methodology, with examples showing how the study achieves its goal. Section 4 presents the results of the experimental analysis and evaluation. Finally, Section 5 discusses conclusions and possible future work.

II. RELATED WORKS
This section reviews a number of related studies, since this paper adopts some of their approaches and addresses points that others leave uncovered for Arabic.
Bing Liu [9] is the author of one of the best-known studies, cited by most research in this field. He used rules for recognizing opinions and expressed them in Backus-Naur Form (BNF)¹. BNF is a metasyntax notation for context-free grammars, often used to describe the syntax of languages used in computing. This study adopts his approach for the Arabic language. Mohammad N. et al. [1] recognized opinions using a lexicon; they were concerned with Modern Standard Arabic (MSA) and colloquial varieties, for example "Khaliji". In addition, Chetashri B. et al. [10] discussed the lexical and machine learning approaches. Mongkol Seansuk et al. [11] presented a traditional methodology and evaluated opinions logically for each sentence; when comparing sentences, the result is positive only if both of them are positive, and negative otherwise. Asad Ullah R. K. et al. [12] retrieved comments from YouTube to analyze sentiment about Android and iOS; they used the General Architecture for Text Engineering (GATE)² component and built a plugin. GATE is an essential component for NLP, and this paper uses it for several tasks. Weishu Hu et al. [13] presented how to mine product features in opinion sentences, using a SentiWordNet-based algorithm to find the opinion of a sentence. Samir A. et al. [14] presented a novel solution to the Arabic Named Entity Recognition (ANER) problem, which aimed to boost the identification of extracted named entities. They utilized a machine learning technique using pattern recognition to classify named entities (NE).
Sana A. et al. [6] proposed a Twitter sentiment analysis model based on supervised machine learning and semantic analysis. They divided their approach into two phases, training and testing: in the training phase, the classifier learned from a set of labeled tweets, and in the testing phase it was used to classify unlabeled tweets. Mohamad H. et al. [16] also studied sentiment analysis for Arabic text collected from Twitter, Facebook and YouTube. Taysir H. et al. [15] focused on mining social networks for sentiment analysis of colloquial Arabic comments. Their approach was concerned with Egyptian terminology, as it provided a structure to define the standard meaning of a word and the informal terms associated with it. Alaa El-Dine A. H. et al. [17] also used classification methods to analyze users' comments and detect the comments that agree, disagree or are neutral with respect to a post; the data were collected from Facebook. Sawsan C. et al. [18] adopted an ontology in their approach for detecting Arabic emotion. They detected the language or dialect that a text belonged to with the help of GATE, and arranged the emotional vocabulary into intensities in the integer domain [-10, +10]. Other studies addressed a specific dialect of Arabic, such as Abdullah D. et al. [19], who used Levantine Arabic tweets as the corpus and implemented different methods to automatically classify individuals' text messages in order to infer their emotional states.
Abdul-Mageed et al. [20] presented a subjectivity and sentiment analysis system (SAMAR) based on a Support Vector Machine (SVM) classifier for different Arabic social media applications: Web forums, chat, Wikipedia Talk Pages, and Twitter. They studied different features including word n-grams, POS tagging, and word stems. Many stylistic features related to social media applications were also investigated. The results showed that classifier performance depended on the type of dataset and the features used.
¹ https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form
² https://gate.ac.uk/

III. METHODOLOGY
This section presents the method for feature-based sentiment analysis for the Arabic language. The mobile phone is the target product; therefore, the study analyzes people's sentiment toward mobile phones for each feature. It also presents entity recognition for mobile names. The overall process is shown in Fig. 1.

A. Dictionaries
This approach includes three dictionaries: features, sentiment words and entities. Their data were collected as a sample of training data. The dictionaries can later be fed dynamically and flexibly, and no prior experience is required. The source of the dictionary data is detailed in the pre-processing section.

1) Features dictionary:
Features are the domain-dependent part of the sentiment analyzer, and in this study the domain is mobile phones. This dictionary comprises 84 words as sample data and is scalable. Examples of mobile features: camera "كاميرا", memory "ذاكرة", battery "البطارية", etc.
2) Sentiment words dictionary: Sentiment words contribute to the quality of the sentiment classifier. Unlike the features, they are domain-independent, but they are tied to the vocabulary of the Arabic language in all its dialects. They were collected empirically from people's reviews, and the dictionary was also designed to be scalable.
Sentiment words are classified into several categories: five positive categories, five negative categories, a negations category, and a strong words (or modifiers) category.
The structure of sentiment words is shown in Table I.
3) Entities dictionary: Entities are the mobile names. The same name may have several spellings in Arabic, and the stemmer is unable to stem these names because they are not Arabic words and have no meaning in Arabic.
Therefore, this approach defines a specific hierarchical structure for mobile names. The dictionary includes three mobile categories (brands); each category has several kinds, and each kind has several versions. Each level has a list of keywords that covers the different spellings of the same name.
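The hierarchy described above can be illustrated with a minimal Python sketch. It is not the dictionary format used in this study: the field names, the product ids and the Arabic keyword spellings are hypothetical, chosen only to show how a three-level structure (category, kind, version) can be flattened into a single keyword list.

```python
# Hypothetical three-level entity hierarchy: category -> kind -> version.
# All ids and keyword spellings below are illustrative assumptions.
HIERARCHY = {
    "name": "Sony", "id": "P1", "keywords": ["sony", "سوني"],
    "kinds": [
        {
            "name": "Xperia", "id": "P1.1", "keywords": ["xperia", "اكسبيريا"],
            "versions": [
                {"name": "Z5", "id": "P1.1.1", "keywords": ["z5"]},
                {"name": "Z3", "id": "P1.1.2", "keywords": ["z3"]},
            ],
        },
    ],
}


def flatten(category):
    """Return rows of (keyword, full mobile name, product id, level name)."""
    rows = []

    def add(node, full_name, level):
        # One row per spelling, all pointing at the same full name and id.
        for keyword in node["keywords"]:
            rows.append((keyword, full_name, node["id"], level))

    add(category, category["name"], "category")
    for kind in category["kinds"]:
        kind_name = f'{category["name"]} {kind["name"]}'
        add(kind, kind_name, "kind")
        for version in kind["versions"]:
            add(version, f'{kind_name} {version["name"]}', "version")
    return rows


rows = flatten(HIERARCHY)
```

Each row carries the keyword, the full mobile name, a product id and the level name, so that different spellings of one name resolve to the same entity.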

B. Pre-processing
1) Data collection: Sentiment analysis depends on labelled training data. A dataset of 85 posts about mobile phones was collected; the posts include 1024 comments, of which 570 are replies, obtained from the souq.com and mobihall.com pages on Facebook. Most of the posts are advertisements for mobiles; therefore, the comments and replies are the target reviews.
2) Reviews structure and format: Since the reviews were collected from different sources, a standard structure became necessary. The format chosen to represent reviews is the eXtensible Markup Language (XML). Each review includes its rating (likes or stars), creation date and time, and review id. Fig. 3 shows a sample of the review data. The dictionary data were collected from the same sites; they are drawn not only from the reviews in the dataset but from all reviews on the sites' pages.
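As an illustration of such a review format, the following Python sketch parses a small XML sample with the standard library. The element and attribute names (`reviews`, `review`, `id`, `created`, `likes`) and the two sample reviews are assumptions made for this example; the actual schema is the one shown in Fig. 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element/attribute names and values are assumed.
SAMPLE = """<reviews>
  <review id="r1" created="2017-01-15T10:30:00" likes="3">الكاميرا ممتازة</review>
  <review id="r2" created="2017-01-15T11:05:00" likes="0">البطارية سيئة</review>
</reviews>"""


def load_reviews(xml_text):
    """Parse review elements into plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": review.get("id"),
            "created": review.get("created"),
            "likes": int(review.get("likes", "0")),
            "text": (review.text or "").strip(),
        }
        for review in root.iter("review")
    ]


reviews = load_reviews(SAMPLE)
```

Keeping the rating, timestamp and id as attributes leaves the element text free for the review itself, which is what the later analysis steps consume.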

3) Arabic stemmers: This approach needs two stemmers for reviews and dictionaries lists:
a) A light stemmer, built in this study, for noise elimination or normalization:
• Standardize Hamza "أ".
• Eliminate Tashkeel: ً, etc.
b) An advanced stemmer to extract root words. "Khoja" and "Arnlp" are the best-known stemmers for the Arabic language. This approach uses "Arnlp" because, in addition to finding the root word, it also finds the stem word. The stem may be more meaningful and reduces the confusion that arises when words with opposite meanings share one root.
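The two light-stemmer steps can be sketched in Python as follows. This is only an illustration, not the stemmer built in this study: it assumes that "standardizing Hamza" means mapping the Alef-with-Hamza forms (أ, إ, آ) to a bare Alef, and that Tashkeel covers the Unicode range U+064B to U+0652. The function name `light_normalize` is hypothetical.

```python
import re

# Tashkeel (short-vowel and related diacritics), assumed to be U+064B..U+0652.
TASHKEEL = re.compile("[\u064B-\u0652]")

# Assumption: Hamza-carrying Alef forms are mapped to bare Alef (U+0627).
HAMZA_FORMS = str.maketrans({
    "\u0623": "\u0627",  # أ -> ا
    "\u0625": "\u0627",  # إ -> ا
    "\u0622": "\u0627",  # آ -> ا
})


def light_normalize(text):
    """Remove Tashkeel, then standardize Hamza-carrying Alef forms."""
    return TASHKEEL.sub("", text).translate(HAMZA_FORMS)


# "أَحمد" (with a Fatha on the first letter) becomes "احمد".
normalized = light_normalize("\u0623\u064E\u062D\u0645\u062F")
```

Applying the same normalization to both the reviews and the dictionary lists means that a word and its dictionary entry compare equal regardless of how the writer typed the Hamza or diacritics.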

C. Sentiment Analyzer
The approach performs natural language processing with GATE components. The sentiment analysis process consists of the following steps:

1) Opinion tokens detection:
The detection of opinion tokens serves as the lexical analysis stage in this study. Opinion tokens come from the dictionary lists: features, sentiment words and entities. Opinion token detection is implemented using the GATE Gazetteer, which matches words against lists and can annotate each matched word. These annotations are very useful data for the next steps. The Gazetteer therefore includes the following lists:
a) Features list, which includes: feature, feature identifier.
b) Sentiment words list, which includes: sentiment word, sentiment category, polarity, sentiment id.
c) Entities list, holding all keywords of mobile names in one list, which includes: keyword of mobile name, full mobile name, product id, level name.
Fig. 4 shows an example of opinion token detection. The lexical analyzer detects opinion tokens and classifies them based on their semantic meaning, as a kind of POS tagging.
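The list-matching idea behind this step can be illustrated with a small Python sketch. It does not use GATE or its Gazetteer API; it is a plain dictionary lookup, and the entries, identifiers and polarity values are hypothetical examples that only mirror the fields listed above.

```python
# Hypothetical gazetteer entries; ids, categories and polarities are assumed.
GAZETTEER = {
    "كاميرا": {"type": "Feature", "feature_id": "F1"},
    "ذاكرة": {"type": "Feature", "feature_id": "F2"},
    "ممتازة": {"type": "Sentiment", "category": "positive adjective",
               "polarity": 1, "sentiment_id": "S1"},
    "سيئة": {"type": "Sentiment", "category": "negative adjective",
             "polarity": -1, "sentiment_id": "S2"},
}


def detect_tokens(sentence):
    """Annotate every whitespace-separated word found in the gazetteer."""
    annotations = []
    for position, word in enumerate(sentence.split()):
        entry = GAZETTEER.get(word)
        if entry is not None:
            annotations.append({"position": position, "word": word, **entry})
    return annotations


# "camera excellent really": the third word is not in the lists.
annotations = detect_tokens("كاميرا ممتازة فعلا")
```

An exact-match lookup like this one would miss inflected or prefixed forms (for example a word carrying the definite article), which is why the approach normalizes and stems both the reviews and the lists before matching.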

2) Opinion phrases detection:
In this stage, the grammatical analyzer identifies the syntax of opinion phrases. Opinion phrase detection is performed using the GATE JAPE transducer, which defines rules for the forms of opinion phrases. It takes the opinion token annotations as input and outputs opinion phrase annotations. Before the JAPE rules are applied, the reviews are split into sentences by the GATE Sentence Splitter so that phrases are detected for each sentence separately. The opinion phrase rules are described in BNF¹, the metasyntax notation used here to describe the syntax of phrases. Fig. 5 shows the syntax of opinion phrases for Arabic suggested by this approach. The BNF defines six rules (or cases) for the forms of opinion phrases, and two rules for comparing two products.

Notes about BNF:
a) Product is the entity with the full mobile name.
b) Category is the first part of the mobile name, which often represents the company name.
c) Kind is the second part of the mobile name.
d) Version is the third part of the mobile name.
Fig. 6 shows an example of opinion phrase detection. The grammatical analyzer detects opinion phrases by testing the eight rules. The bottom table in the figure shows the necessary information: opinion words, polarity, mobile features, mobile name and rule identifier.
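To make the idea of rule-based phrase detection concrete, the sketch below matches one hypothetical pattern over a sentence's opinion tokens: a feature, an optional negation word, a sentiment word, and an optional modifier. This pattern is an assumption for illustration and is not claimed to be one of the eight rules of Fig. 5; it is written in plain Python rather than JAPE. The handling of polarity (negation flips the sign, a modifier doubles the magnitude within [-2, 2]) is likewise assumed.

```python
def detect_phrases(tokens):
    """Match the hypothetical pattern: Feature [Negation] Sentiment [Modifier].

    tokens is a list of (kind, value) pairs for one sentence, where value is
    the feature name for "Feature" and the polarity (+1/-1) for "Sentiment".
    """
    phrases = []
    i = 0
    while i < len(tokens):
        kind, value = tokens[i]
        if kind != "Feature":
            i += 1
            continue
        j = i + 1
        negated = False
        if j < len(tokens) and tokens[j][0] == "Negation":
            negated = True
            j += 1
        if j < len(tokens) and tokens[j][0] == "Sentiment":
            polarity = tokens[j][1]
            j += 1
            if j < len(tokens) and tokens[j][0] == "Modifier":
                polarity *= 2  # assumed: a modifier strengthens the opinion
                j += 1
            if negated:
                polarity = -polarity  # assumed: negation flips the sign
            polarity = max(-2, min(2, polarity))
            phrases.append({"feature": value, "polarity": polarity})
            i = j
        else:
            i += 1
    return phrases


strong = detect_phrases([("Feature", "camera"), ("Sentiment", 1), ("Modifier", None)])
negated = detect_phrases([("Feature", "battery"), ("Negation", None), ("Sentiment", 1)])
```

The pattern places the feature before the sentiment word, following the Arabic noun-adjective order, and works on token annotations rather than raw text, in the same way that the JAPE transducer consumes the Gazetteer output.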
The weight represents either the number of opinions with a specific polarity or the number of likes. A like is counted as if another person had repeated the same opinion with the same polarity. The polarity value is multiplied by weight + 1, where the +1 represents the opinion itself and avoids division by zero.
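One reading of this weighting is a weighted mean in which each opinion counts weight + 1 times. The Python sketch below follows that reading, which is an assumption, since the formula itself is not reproduced in the text. The three input opinions are hypothetical and were chosen so that the score equals 2/15, approximately 0.1333. Only the neutral interval ]-0.5, 0.5] is taken from the text; the other class boundaries (at ±1.5) are assumed for illustration.

```python
def weighted_polarity(opinions):
    """Weighted mean polarity; each (polarity, weight) counts weight + 1 times."""
    total = sum(weight + 1 for _, weight in opinions)
    if total == 0:  # only reachable with an empty list, since weight >= 0
        raise ValueError("no opinions to summarize")
    return sum(polarity * (weight + 1) for polarity, weight in opinions) / total


def classify(score):
    """Map a score in [-2, 2] to a class; only ]-0.5, 0.5] is from the text."""
    if score > 1.5:
        return "strong positive"
    if score > 0.5:
        return "positive"
    if score > -0.5:
        return "neutral"
    if score > -1.5:
        return "negative"
    return "strong negative"


# Hypothetical opinions as (polarity, weight): (4 - 2 + 0) / (4 + 2 + 9) = 2/15.
score = weighted_polarity([(1, 3), (-1, 1), (0, 8)])
label = classify(score)
```

Because every opinion contributes at least 1 to the denominator, an opinion with no likes still counts once and the division is always defined for a non-empty list.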
The final result is based on the ranges shown at the beginning of this section, where 0.1333 ∈ ]-0.5, 0.5]; it is therefore Neutral.
Table II shows an example for each rule defined in the BNF. It must be pointed out that in Arabic grammar the noun comes before the adjective, in contrast to English grammar, where the adjective precedes the noun being described, in addition to some other differences. The translation of the examples is therefore for illustration only and is not necessarily correct English. For example, "Red Flower" is written in Arabic as "Flower Red" ("الزهرة حمراء"). The illustration thus respects the BNF rules and word order.
Opinion token detection is evaluated with respect to the dictionary size and the stemmer used for matching words. Table III shows a test with 20 tokens extracted from several reviews consisting of 100 words.
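The Recall for opinion tokens is defined by the formula:
$$R = \frac{\text{number of correctly detected opinion tokens}}{\text{total number of opinion tokens in the test}}$$
From the results of (3) and (4), the F-measure is defined by the formula:
$$F = \frac{2 \cdot P \cdot R}{P + R} = \frac{2 \times 0.86 \times 0.9}{0.86 + 0.9} \approx 0.88$$
Opinion phrase detection is likewise evaluated by measuring the quality of the opinion rules defined in BNF in the METHODOLOGY section. Table IV shows a test with 35 phrases extracted from several reviews consisting of 60 phrases.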
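The relation between the reported measures can be checked with a short Python sketch. It uses the standard harmonic-mean definition of the F-measure and assumes that, of the values reported above, 0.9 is the recall and 0.86 is the precision; under that assumption the result agrees with the reported F-measure of 0.88.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumed assignment of the reported values: P = 0.86, R = 0.9.
f_tokens = f_measure(0.86, 0.9)
```

Since the harmonic mean is symmetric, swapping the two values would give the same F-measure, so the check does not depend on which of the two is the precision.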
The following figures show some models of opinion summarization that build knowledge of benefit to decision makers. Fig. 7 shows a bar chart of feature-based statistics comparing two mobiles, the Sony Xperia Z5 and the Sony Xperia Z3. It shows the polarity of each feature in the range [-2, 2].

V. CONCLUSION
This paper proposes feature-based sentiment analysis for the Arabic language, with the mobile phone as the target product. The results of this mining are expressed as degrees of strong positive, positive, neutral, negative and strong negative. This result is useful for both consumers and companies. The study presents an approach in an active area for the Arabic language. The F-measure obtained in the experiments is 88%. The study presents an effective method for identifying opinion phrases by building an Arabic grammatical analyzer that gives good results and is expandable. Future work will focus on adding new categories of products and services, supporting the grammatical analyzer with new rules, expanding the dictionaries, and including other social media platforms. Analyzing the sentiment of emoji is also left for future work.