Context-Sensitive Opinion Mining using Polarity Patterns

The growth of Web 2.0 has made a huge amount of information available, and analyzing this information can be very useful in various fields. In this regard, opinion mining and sentiment analysis are among the most interesting tasks, and many researchers have paid attention to them over the last two decades. However, the task involves several challenges, an important one being that words take different polarities in different domains and contexts. Word polarity is an important feature in determining review polarity through sentiment analysis. Existing studies have proposed the n-gram technique as a solution, which allows the selected words to be matched against the lexicon. However, identifying word polarity with the standard n-gram method is limited, because it ignores word placement and its effect according to the contextual domain. Therefore, this study proposes a linguistic-based model that extracts word adjacency patterns to determine review polarity. The results reflect the superiority of the proposed model compared to other benchmarking approaches.

Keywords: Opinion mining; Polarity patterns; Pattern matching; Context-sensitive; Politics domain


I. INTRODUCTION
In the past, people tried very hard to acquire data and knowledge. With the appearance of the Web, and especially Web 2.0, users now generate a huge amount of information. However, this is not always desirable, because too much information leads to confusion. Therefore, analysis, summarization, and other related tasks yield very useful and applicable results for users and researchers. In this regard, opinion mining and sentiment analysis are among the most important and helpful tasks defined over the last two decades.
Opinion mining is a field whose results can be used in research across various fields. Researchers have been attracted to it because its applications range from sociological and psychological analysis to the extraction of users' opinions in business, such as opinions about products and services, and in political discussions.
Wiebe first presented the most widely used definition of subjectivity and opinion mining in 1994, based on a linguist's idea. She defined subjectivity as the linguistic expression of opinions, sentiments, emotions, evaluations, beliefs, and speculations [1]. Hence, the goal of subjectivity analysis is to determine whether sentences are subjective or objective [2]; Liu treats subjective and objective sentences as opposites. In this respect, an objective sentence expresses factual information about the world, whereas a subjective sentence expresses a person's emotions or beliefs [3].
Next, and at a lower level, Zhang and Liu define opinion mining or sentiment analysis as "the computational study of opinions, appraisals, attitudes, and emotions about entities such as products, services, organizations, persons, events, and their different aspects" [4]. They also focus on recognizing the orientation and strength of polarity at different levels using various methods and settings.
The variety of definitions proposed over the years has blurred the difference between opinion mining and sentiment analysis, and the two terms are used interchangeably. It is notable, however, that some authors consider opinion and sentiment to be at different levels.
Accordingly, researchers have investigated various aspects of opinion mining. Several issues have been defined, such as how polarity is expressed, at which level polarity is measured, and whether the polarity of words is fixed. Changes in polarity lead to context-sensitive opinion mining, for which different solutions, such as ontologies, have been proposed. Nevertheless, the context-sensitive issue limits existing solutions. For example, the methods often consider the polarity of words as unigrams, yet when words occur together, they affect each other's polarity.
In contrast, the n-gram is not a suitable method, because its statistical calculation does not account for replacing a word with another word of similar meaning or polarity. As a result, n-grams suffer from sparseness, since the same adjacent words are rarely repeated, and they lack generalizability. For instance, "supporting of terrorism" and "supporting of spies" are two different n-grams whose language-model values are very low, whereas their generalization in the pattern "supporting of (Neg.Exp.)" occurs more often.
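The sparseness argument can be illustrated with a minimal sketch. It assumes a toy set of negative expressions (`NEGATIVE_EXPRESSIONS`, a hypothetical name, not the lexicon used in this paper) and simply counts raw trigrams against their generalized form.

```python
from collections import Counter

# Hypothetical toy lexicon of negative expressions (an assumption for illustration).
NEGATIVE_EXPRESSIONS = {"terrorism", "spies"}

def generalize(ngram):
    """Replace every negative expression in the n-gram with a polarity placeholder."""
    return tuple("(Neg.Exp.)" if word in NEGATIVE_EXPRESSIONS else word for word in ngram)

# Two distinct trigrams taken from the example in the text.
trigrams = [
    ("supporting", "of", "terrorism"),
    ("supporting", "of", "spies"),
]

ngram_counts = Counter(trigrams)                           # each raw trigram is seen once
pattern_counts = Counter(generalize(t) for t in trigrams)  # both collapse into one pattern
```

In this toy setting each raw trigram has a count of one, while the single generalized pattern accumulates both occurrences, which is the effect the polarity patterns are meant to exploit.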
Thus, this paper proposes performing opinion mining by extracting polarity patterns based on a language model, which makes it possible to replace words that are synonyms or have similar polarity.
Polarity patterns categorize expressions that share the same polarity. For example, "supporting of terrorism" is negative; however, "supporting" is positive and "terrorism" is negative, so the bag-of-words method often recognizes the phrase as neutral, whereas the polarity pattern "supporting of (Neg.Exp.)" leads to a negative result. Consequently, the results of context-sensitive opinion mining using polarity pattern matching show a significant improvement in accuracy over other methods.
In the following, Section II presents a review of context-sensitive opinion mining. Then, context-sensitive opinion mining using polarity pattern matching and its evaluation are discussed, respectively. Finally, the conclusion is presented.
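The contrast between the bag-of-words result and the pattern result can be sketched as follows. The prior polarities in `PRIOR` and the hard-coded pattern are assumptions made only for this illustration; they are not the paper's lexicon or pattern set.

```python
# Hypothetical prior polarities: +1 positive, -1 negative, 0 neutral.
PRIOR = {"supporting": 1, "of": 0, "terrorism": -1}

def bag_of_words_polarity(tokens):
    """Sum the prior polarities of the tokens, ignoring their order."""
    return sum(PRIOR.get(token, 0) for token in tokens)

def pattern_polarity(tokens):
    """Apply the pattern 'supporting of (Neg.Exp.)'; otherwise fall back to bag of words."""
    tokens = list(tokens)
    if (len(tokens) == 3 and tokens[:2] == ["supporting", "of"]
            and PRIOR.get(tokens[2], 0) < 0):
        return -1
    return bag_of_words_polarity(tokens)

phrase = ["supporting", "of", "terrorism"]
```

Here `bag_of_words_polarity(phrase)` cancels out to zero (neutral), whereas `pattern_polarity(phrase)` yields a negative value.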

II. A REVIEW ON CONTEXT-SENSITIVE OPINION MINING
Although many researchers have paid attention to opinion mining and sentiment analysis, the field still has considerable room for improvement in solving issues in data mining, web mining, and information retrieval. Different challenges are therefore observed in the field, such as how polarity is calculated for word groups. So, the polarity is calculated using [...] In addition, lexicon adaptation is another strategy, which finds context-sensitive polarity using contextual semantics and updates orientation and strength using a primary polarity lexicon and rule-based adaptation [27]. However, the effect of adjacent words is a significant criterion that has not been explored by existing methods. Although a language model can find adjacent words, it is typically rigid: even co-occurring and synonymous words are not considered in its statistical calculations. As a result, this paper proposes a method based on n-grams that categorizes expressions with similar polarity. Then, polarity patterns that represent a polarity language model are extracted. Finally, the polarity patterns are used for context-sensitive opinion mining, and sentiment is calculated based on them.

III. CONTEXT-SENSITIVE OPINION MINING USING POLARITY PATTERN MATCHING
Different studies have used patterns for determining polarity. Pattern recognition is a helpful step in opinion mining tasks that detects relations in text. In this regard, Riloff and Wiebe [28] extracted patterns to detect subjective and objective sentences. The patterns were extracted based on templates and by investigating a corpus. Furthermore, word co-occurrence patterns and language models are two other effective methods. Wiebe [...] As a result, this paper proposes a method for opinion mining using polarity patterns, which replace n-grams and solve their problems.

A. Polarity Pattern Extraction
The proposed polarity pattern extraction is based on the n-gram method and creates a polarity language model for opinion mining. According to the investigation of the corpus, if X is the noun phrase of affect, three types of phrases, which can also appear as sub-sentences, affect polarity:
• X + A: for example, "دشمنی تروریسم", meaning "hostility of terrorism", is negative when a negative affect is exerted by a negative affecting entity.
• X + Conjunction + B: for example, "دشمنی با دین", meaning "hostility with religion", is negative when a negative affect is directed at a positive affected entity.
• X + A + Conjunction + B: for example, "دشمنی تروریسم با دین", meaning "hostility of terrorism with religion", is negative when a negative affect is exerted by a negative affecting entity on a positive affected entity.
X, A, and B are sub-sections that can be positive or negative and lead to various phrases with different polarities. These three types of phrases are templates for finding polarity patterns. For this purpose, a rule-based method is used, and polarity patterns are extracted based on the pre-defined templates mentioned above. The steps of the proposed method are shown in Fig. 1.
First, natural language processing is performed on the corpus, and the tokens and their PoS tags are determined. Then, the corpus is parsed and its dependency trees are extracted. Next, the dependency trees are matched against the templates: for each phrase group, if $y_p$ and $Y_q$ are the $p$-th phrase group and the $q$-th desired phrase group, respectively, then $Y_q$ is obtained by (1),
where $DT_p$ is the dependency tree of the $p$-th phrase group and $T_i$, with $i = 1, 2, 3$, denotes the templates. Thus, a phrase group is selected as a desired phrase group if its dependency tree matches one of the templates $T_i$. The type of the matched template, $i$, is also determined.
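The selection step can be sketched under a strong simplification: each phrase group is assumed to arrive from the parser already flattened into `(token, role)` pairs, where the role labels `X`, `A`, `CONJ`, and `B` are hypothetical names chosen for this illustration. A real dependency tree would need a structural match rather than a comparison of role sequences.

```python
# The three templates T_1..T_3, written as ordered role sequences (assumed encoding).
TEMPLATES = {
    1: ("X", "A"),
    2: ("X", "CONJ", "B"),
    3: ("X", "A", "CONJ", "B"),
}

def match_template(phrase_group):
    """Return the index i of the matching template T_i, or None if nothing matches."""
    roles = tuple(role for _, role in phrase_group)
    for i, template in TEMPLATES.items():
        if roles == template:
            return i
    return None

def select_desired(phrase_groups):
    """Keep the phrase groups that match a template, paired with the template index."""
    desired = []
    for group in phrase_groups:
        i = match_template(group)
        if i is not None:
            desired.append((group, i))
    return desired

phrase_groups = [
    [("hostility", "X"), ("terrorism", "A")],
    [("hostility", "X"), ("with", "CONJ"), ("religion", "B")],
    [("the", "DET"), ("news", "X")],  # matches no template and is dropped
]
desired = select_desired(phrase_groups)
```

The returned index plays the role of $i$ in the text: it records which template was matched for each desired phrase group.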
In the next step, the polarity of the sub-sections is determined. The sub-section X is a noun that expresses an affect, whereas A and B can range from a noun to a noun phrase and are the affecting entity and the affected entity, respectively. For every $Y_q$, an adapted template $AT_q$ is built by matching the desired phrase group, the polarity of the sub-sections, and the template $T_{iq}$, as in (2):

$$
AT_q = AT\left(Y_q, a_q, b_q, T_{iq}\right), \qquad
\begin{aligned}
a_q &= \text{the polarity of sub-section } A && \text{if } A \in T_{iq},\\
b_q &= \text{the polarity of sub-section } B && \text{if } B \in T_{iq},
\end{aligned}
\tag{2}
$$

where the index $q$ indicates the $q$-th phrase group. After that, similar adapted templates are grouped by the polarity of their A or B sub-sections to construct the primary polarity patterns. Each primary polarity pattern is thus an index for a set of adapted templates whose "A" and "B" have been replaced by their polarities and whose X is a set containing the noun phrases of affect.
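The grouping of adapted templates into primary polarity patterns can be sketched as below. The dictionary layout and the polarity labels `"pos"` and `"neg"` are assumptions for illustration; only the grouping idea (same template, same polarity of A and B, a set of X values) comes from the text.

```python
from collections import defaultdict

def adapt(x, template_index, a_polarity=None, b_polarity=None):
    """Build an adapted template: A and B are replaced by their polarities."""
    return {"x": x, "template": template_index, "a": a_polarity, "b": b_polarity}

def primary_patterns(adapted_templates):
    """Group adapted templates by (template, polarity of A, polarity of B).

    Each group keeps the set of X values that occurred with that combination.
    """
    groups = defaultdict(set)
    for at in adapted_templates:
        key = (at["template"], at["a"], at["b"])
        groups[key].add(at["x"])
    return dict(groups)

adapted = [
    adapt("hostility", 1, a_polarity="neg"),
    adapt("supporting", 1, a_polarity="neg"),
    adapt("hostility", 2, b_polarity="pos"),
]
primary = primary_patterns(adapted)
```

In this toy input, the two adapted templates of type 1 with a negative A share one primary polarity pattern whose X set is {"hostility", "supporting"}.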
In (3), $PP_k$ is the $k$-th primary polarity pattern, and $\forall h, l$ expresses that only one adapted template is taken as the $PP$ that replaces all adapted templates whose sub-section polarities, except for X, are the same.

www.ijacsa.thesai.org

In the next step, the set of X of every primary polarity pattern is classified into positive and negative classes. The classification is done according to the polarity of the "X"s. Then, the classified primary polarity patterns and the final polarity of the phrases are examined in order to extract the polarity patterns. For this purpose, the final polarity patterns are built from the different states of "A" and "B" in similar primary polarity patterns whose "X" polarity is the same. If $P_m$ is the $m$-th extracted polarity pattern, it is given by (4), which is defined over the set of "X"s and the classified "X"s. According to the phrase groups detected in the corpus, the corresponding polarity of every state is determined by experts, and, as a result, the polarity patterns are prepared for opinion mining.
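A minimal sketch of this step is given below, assuming that primary polarity patterns are stored as a mapping from `(template, polarity of A, polarity of B)` to a set of X values, and that the expert judgments are available as a lookup table (`expert_label`, a hypothetical name). The data layout is an assumption, not the representation used in this paper.

```python
def classify_x(x_set, x_polarity):
    """Split a set of X values into positive and negative classes."""
    classes = {"pos": set(), "neg": set()}
    for x in x_set:
        classes[x_polarity[x]].add(x)
    return classes

def final_patterns(primary, x_polarity, expert_label):
    """Build final polarity patterns from the classified primary polarity patterns.

    The polarity of every state (template, class of X, polarity of A, polarity of B)
    is looked up in the expert-provided table.
    """
    patterns = {}
    for (template, a, b), x_set in primary.items():
        for x_class, members in classify_x(x_set, x_polarity).items():
            if members:
                state = (template, x_class, a, b)
                patterns[state] = {"x": members, "polarity": expert_label[state]}
    return patterns

primary = {(1, "neg", None): {"hostility", "supporting"}}
x_polarity = {"hostility": "neg", "supporting": "pos"}
# Hypothetical expert labels for the two states that occur in this toy input.
expert_label = {
    (1, "neg", "neg", None): "neg",  # e.g. "hostility of terrorism"
    (1, "pos", "neg", None): "neg",  # e.g. "supporting of terrorism"
}
patterns = final_patterns(primary, x_polarity, expert_label)
```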
To generalize the polarity patterns, the sets of "X" are extended using semantic similarity strategies, because similar words can replace each other. For this purpose, the synonyms, hypernyms, hyponyms, co-occurrences, and collocations of all members of the "X" sets are extracted. Then, every such word is added to its set of "X" if it is not already present. Finally, the extracted polarity patterns and the prepared sets of "X" are used for context-sensitive opinion mining.
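The extension of an "X" set can be sketched as a simple set union. The `related_words` table stands in for whatever resource supplies synonyms, hypernyms, hyponyms, co-occurrences, and collocations; its content here is invented for illustration.

```python
def extend_x(x_set, related_words):
    """Return the X set extended with the related words of each of its members.

    Words already in the set are not duplicated, because a set is used.
    """
    extended = set(x_set)
    for x in x_set:
        extended.update(related_words.get(x, ()))
    return extended

# Hypothetical related-word table (an assumption, not a real lexical resource).
related_words = {"hostility": ["enmity", "animosity", "hostility"]}
extended = extend_x({"hostility"}, related_words)
```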

B. Opinion Mining Using Polarity Pattern Matching
The goal of the proposed method is to analyze the opinion of texts using polarity pattern matching. The main stages of context-sensitive opinion mining are shown in Fig. 2.
The first stage is pre-processing, which involves normalization, tokenization, sentence detection, and part-of-speech (POS) tagging. The dependency parser [32] is used to identify syntactic relations, in addition to the words and their POS tags. The syntactic relations are then used for polarity pattern matching.
In the next stage, which is the main stage of the proposed method, the polarity of the sections that match the polarity patterns is identified. First, the words of the sentence are compared with the sets of "X". If a word matches, its related phrases are extracted as sub-sections based on the corresponding polarity pattern and the dependency tree. Then, the polarity of the sub-sections is measured either directly or recursively using the polarity patterns; the direct polarity of words is calculated using SentiFarsNet. Thus, the polarity of the phrase is taken to be the result of the polarity pattern.
Finally, the polarity of every sentence is measured. After the polarity of all sections of the sentence has been calculated, the polarity of the sentence is determined from the polarities of its sections.
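The direct and recursive cases can be sketched as follows. A phrase is assumed to be a triple `(X, A, B)` in which A and B are either missing, a single word, or a nested phrase; `LEXICON` is a toy stand-in for SentiFarsNet, and the entries of `PATTERNS` are illustrative rather than the extracted pattern set.

```python
# Toy stand-in for the prior-polarity lexicon (an assumption for illustration).
LEXICON = {"terrorism": "neg", "religion": "pos"}

# Illustrative polarity patterns: a set of X values plus the polarities of A and B.
PATTERNS = [
    {"x": {"hostility", "supporting"}, "a": "neg", "b": None, "polarity": "neg"},
    {"x": {"hostility"}, "a": None, "b": "pos", "polarity": "neg"},
    {"x": {"hostility"}, "a": "neg", "b": "pos", "polarity": "neg"},
]

def polarity(unit):
    """Return the polarity of a word (direct) or of a phrase (recursive)."""
    if isinstance(unit, str):
        return LEXICON.get(unit, "neu")  # direct lookup for a single word
    x, a, b = unit
    # Sub-sections may themselves be phrases, so their polarity is computed first.
    a_pol = polarity(a) if a is not None else None
    b_pol = polarity(b) if b is not None else None
    for pattern in PATTERNS:
        if x in pattern["x"] and pattern["a"] == a_pol and pattern["b"] == b_pol:
            return pattern["polarity"]
    return LEXICON.get(x, "neu")  # no pattern matched: fall back to the word itself

simple = ("hostility", "terrorism", None)
nested = ("supporting", ("hostility", "terrorism", None), None)
```

For the nested phrase, the inner phrase is resolved first and its result is then used as the polarity of A in the outer pattern.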
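The text does not state how the section polarities are combined, so the following sketch assumes a simple signed sum as one possible aggregation rule; the actual rule used in this paper may differ.

```python
# Signed values for the three polarity labels (assumed encoding).
SIGN = {"pos": 1, "neg": -1, "neu": 0}

def sentence_polarity(section_polarities):
    """Combine section polarities by a signed sum (an assumed rule, not the paper's)."""
    score = sum(SIGN[p] for p in section_polarities)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"
```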

IV. EVALUATION
This paper implements context-sensitive opinion mining using polarity pattern matching. For this purpose, the polarity patterns were extracted for the politics domain. In addition, SentiFarsNet, which is a translation of SentiWordNet based on FarsNet, is applied to find prior opinion.

A. Dataset
For this paper, we gathered data from political news using a crawler. The news items were collected from Mar. 2013 to Dec. 2015 from three news sites, and the crawled data contain 208,000 political news items. We then selected our corpus from the gathered data. The corpus includes about 14,000 news items between 2 KB and 10 KB in size. Pre-processing was then applied to the corpus, and dependency trees were obtained for opinion mining.
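The size-based selection can be sketched as a simple filter. It assumes that size is measured as the UTF-8 encoded length of the text and that 1 KB equals 1024 bytes; neither detail is stated in the text, and any further selection criteria are not modeled.

```python
def select_corpus(documents, min_bytes=2 * 1024, max_bytes=10 * 1024):
    """Keep the documents whose UTF-8 size lies between min_bytes and max_bytes."""
    return [
        doc for doc in documents
        if min_bytes <= len(doc.encode("utf-8")) <= max_bytes
    ]

# Toy documents of 100, 3000, and 20000 bytes.
documents = ["a" * 100, "a" * 3000, "a" * 20000]
corpus = select_corpus(documents)
```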

B. Results
Polarity patterns were extracted according to the proposed method; some samples are shown in TABLE II. The investigation of the corpus indicates that 38% of noun groups belong to the polarity patterns. The first evaluation measures the exactness and completeness of the polarity patterns. For this purpose, precision is defined as the fraction of obtained polarity patterns that are relevant to the content. On the other hand, recall is the fraction of the polarity patterns relevant to the content that are successfully retrieved from the documents. Finally, the F-measure is an average of precision and recall. Fig. 3 presents the quality of the extracted polarity patterns. Although the coverage of the polarity patterns required for opinion mining is the most important measure, the precision, which is more than 90%, demonstrates the performance of the proposed method.
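The three measures can be computed as in the sketch below. It assumes that the F-measure is the balanced harmonic mean of precision and recall (the usual F1), which the text does not state explicitly; the pattern identifiers are invented for illustration.

```python
def precision_recall_f(extracted, relevant):
    """Compute precision, recall, and the balanced F-measure for two sets of patterns."""
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Toy example: 4 extracted patterns, 5 relevant patterns, 3 in common.
precision, recall, f_measure = precision_recall_f(
    {"p1", "p2", "p3", "p4"},
    {"p1", "p2", "p3", "p5", "p6"},
)
```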
Of course, the precision of the n-gram method is higher than that of the proposed method; in contrast, the recall of the proposed method is better. As a result, the polarity pattern method needs fewer calculations and has a higher F-measure than the n-gram method. The accuracy of the proposed method is also compared with a baseline method that uses a bag of words and their polarity. The results are shown in Fig. 4. As the results show, determining polarity based on the polarity patterns increases the accuracy of opinion mining.

V. CONCLUSION
The growth of Web 2.0 has led to increasing interest in opinion mining in the research community. Since the aim of opinion mining is to recognize the positive or negative opinion of documents and opinionated sentences [33], the polarity of words and expressions is the most important issue. Meanwhile, research has shown that polarity is not fixed and changes even within a domain. Ding also stated that polarity is fixed only with respect to a context [11]. Therefore, context-sensitive opinion mining has been introduced, which indicates that the primary polarity does not determine the correct polarity.
Most methods consider the polarity of words as unigrams, but how words are placed adjacent to each other is very useful for opinion mining. Moreover, n-grams are used only as features for machine learning algorithms.
When words occur together, they affect each other's polarity. On the other hand, the n-gram is not a suitable method, because its statistical calculation does not account for replacing a word with another; as a result, n-grams lead to sparseness and a lack of generalizability. Thus, this paper proposes performing opinion mining by extracting polarity patterns based on a language model, which makes it possible to replace words that are synonyms or have similar polarity. Polarity patterns categorize expressions that share the same polarity. For example, "supporting of terrorism" is negative because "terrorism" is negative and the polarity pattern "supporting of (Neg.Exp.)" leads to a negative result. Consequently, the results of context-sensitive opinion mining using polarity pattern matching show that determining polarity based on the polarity patterns significantly improves accuracy over other methods.

Fig. 1. The steps of the proposed method for polarity pattern extraction

In (2), $a_q$, $b_q$, and $T_{iq}$ are the polarities of sub-sections A and B and the $i$-th template, respectively, that correspond to $Y_q$.

Fig. 3. The quality of the extracted polarity patterns and n-grams

Fig. 4. The comparison of polarity patterns and bag of words usage
TABLE I. Two examples of polarity of phrases