Sentiment Analysis using SVM : A Systematic Literature Review

The world has revolutionized and phased into a new era, an era which upholds the true essence of technology and digitalization. As the market has evolved at a staggering scale, it is must to exploit and inherit the advantages and opportunities, it provides. With the advent of web 2.0, considering the scalability and unbounded reach that it provides, it is detrimental for an organization to not to adopt the new techniques in the competitive stakes that this emerging virtual world has set along with its advantages. The transformed and highly intelligent data mining approaches now allow organizations to collect, categorize, and analyze users’ reviews and comments from micro-blogging sites regarding their services and products. This type of analysis makes those organizations capable to assess, what the consumers want, what they disapprove of, and what measures can be taken to sustain and improve the performance of products and services. This study focuses on critical analysis of the literature from year 2012 to 2017 on sentiment analysis by using SVM (support vector machine). SVM is one of the widely used supervised machine learning techniques for text classification. This systematic review will serve the scholars and researchers to analyze the latest work of sentiment analysis with SVM as well as provide them a baseline for future trends and comparisons. Keywords—Sentiment analysis; polarity detection; machine learning; support vector machine (SVM); support vector machine; SLR; systematic literature review


INTRODUCTION
With the rapid development of mobile internet in the recent years, the usage of micro-blogging websites have seen a considerable increment.On the other hand, trend of sharing the views and experience regarding products and services is increasing day by day.Moreover, users rely on the feedback of the previous customers before targeting the new product or service to purchase.In the same way the companies can get the reviews about their products or services from their clients via micro blogging sites (Facebook, twitter, etc.) to explore and analyze the response and ultimately can improve those targeted products or services.However, it is not practically possible to read all the reviews in tweets.Several researchers have been working to develop automated techniques and algorithms for sentiment analysis and text classification.The term sentiment analysis is basically aims to classify the given text into positive, negative and neutral category.Three basic approaches are available in the literature today for sentiment analysis: Lexicon driven, Machine learning based, and Hybrid (integration of lexicon and machine learning).The authors in [1] explored different Lexicon driven sentiment analysis tools and techniques.In [2], different machine learning techniques have been discussed in detail which are used for sentiment analysis.Moreover, in order to take the results up a notch, researchers combined lexicon based techniques and machine learning techniques to formulate a hybrid framework to dig-up even better results as explained in [3].SVM belongs to the supervised category of machine learning algorithms.Supervised machine learning algorithm is one which has to be trained first with the pre identified output class (training data) and then it makes itself capable to classify the real input data (test data).Several annotated datasets regarding different domains are available which are used by machine learning algorithms for classification and sentiment analysis.Some of these annotated datasets include: the customer review dataset [4], [5], Pros and Cons dataset [6], Amazon product review dataset [7] and gender classification dataset [8].In this study, published papers regarding sentiment analysis with SVM technique from year 2012 to 2017 are analyzed.Two online libraries are used for this research: ACM and IEEE.Nine hundred and one articles were selected based on the particular query strings.After following the complete systematic framework, 8 papers were finally selected for in-depth and detailed review.
Further organization of this paper is as follows.Section II describes the related work in this domain.Section III defines research methodology used for this SLR.Section IV presents in-depth review of selected papers.Section V discusses the findings of this detailed review.Section VI finally concludes the paper.

II. RELATED WORK
Development and refining the automated techniques of sentiment extraction and analysis is one of the hot research topics today.Many researchers have worked on sentiment analysis techniques via different approaches (Lexical, Machine Learning and Hybrid) however, in-depth analysis and review of latest literature on sentiment analysis with SVM was still required.Some of the related studies on sentiment analysis are as follows.
Authors in [9] conducted a systematic literature review regarding opinion mining from the reviews of mobile app store users.The researchers focused on the importance of mobile applications in now days and further highlighted the increasing demand of user reviews about those apps.Obviously these reviews are crucial for the new users, who are going to buy these apps and also for those who develop or sell these apps.www.ijacsa.thesai.org The authors highlighted the proposed solutions of mining problems, and also identified the remaining unsolved issues and new challenges.In [10], a systematic literature review is conducted to analyze the current state of Arabic text mining.For this review, more than one hundred papers are selected from different reliable sources and then were classified according to their specific domains.A quantitative analysis of selected articles is also conducted with respect to publication type, year, category and contribution.The researchers in [11] conducted a literature review on sentiment analysis and opinion mining of social issues.The selected papers have taken the data from social web sites.According to authors, different types of classification techniques, if combined, can provide the better results.In [12], a literature survey is conducted about opinion and spam mining.For this purpose, most cited research articles from these domains are considered.Authors found the proposed architecture and methods imperfect in those selected researches.They highlighted that the important thing in spam detection is not only the spam identification but also, not to filter the real ones.In [13], a systematic literature review is conducted for the classification of burn care parameters with machine learning techniques.A total of 1503 topic relevant research articles were primarily selected, after screening and extracting the most relevant literature, 15 studies were selected for the analysis.All the selected studies demonstrated the benefits of machine learning techniques in burn care however different research articles reflected different accuracies.The authors in this SLR focused on the benefits of using machinelearning techniques in burn care as well as highlighted the importance of common metrics and goals for effective evaluation and validation of these techniques.In [14], the authors have performed sentiment classification of Arabic tweets by using Naïve Bayes, Decision Tree and Support Vector Machine.In this study, a framework for Arabic tweets classification is followed which consisted of several subtasks such as: Term Frequency Inverse Document Frequency (TF-IDF) and Arabic stemming etc. Moreover three information retrieval metrics are used for performance evaluation: precision, recall, and f-measure.In [15], a literature review is conducted covering the domain of data mining applications in customer relationship management.The study considered the research literature from year 2000-2006, covering 24 journals.For this study, 900 articles were shortlisted and then 87 most relevant papers were selected to classify in four CRM dimensions i.e. customer identification, customer attraction, customer retention and customer development.In [16], the authors have predicted the rainfall in Malaysia by using machine-learning techniques.They have used following classification algorithms: Naïve Bayes, Decision Tree, Support Vector Machine, Neural Network and Random Forest.A comparative analysis was performed to identify the particular technique which can bring good results with little amount of training data.The comparative analysis showed that Decision Tree and Random Forest both can get well trained by using lower amount of training data and can get high F-measure score.However, Support Vector Machine and Naive Bayes both showed lower F-measure score, when trained with lower amount of training data.Neural Network required large amount of training data to predict very little amount of test data.In [17], the authors have focused on the effects of preprocessing feature in sentiment classification process.They have classified the 1000 Arabic tweets and compared their implemented stemmer with light stemmer.They have used two approaches for comparative analysis, Machine Learning and Semantic Orientation.According to authors, the used stemmer achieved 1% of improvement with Machine Learning approach.However, with semantic orientation approach, the improvement was 0.5%.In Machine learning approach, SVM used twice, once before applying the preprocessing phase and then again used after each stage of preprocessing to analyze the system's performance.They claimed the improvement of 4.5 percent in all measures.Same steps were adopted for semantic orientation approach and achieved 2-7% improvement in different measures.In [18], the authors have analyzed the performance of Support Vector Machine for polarity detection of textual data.A sentiment analysis framework is proposed and performance of SVM was evaluated on three datasets.Two datasets were taken from twitter and one from IMDB review.Performance of SVM was compared for each dataset by keeping in view three different ratios of training data and test data: 70:30, 50:50 and 30:70.Precision, recall and f-measure scores were used for performance evaluation.In [19], student's academic performance was predicted by using three data mining techniques: Decision tree (C4.5),Multilayer Perceptron and Naïve Bayes.These techniques were applied on student's data, which was collected from 2 undergraduate courses in two semesters.According to results, Naïve Bayes showed overall accuracy of 86% and outperformed MLP and Decision tree.

III. RESEARCH PROTOCOL
The purpose of this research is to extract the valuable information from most relevant research articles on sentiment analysis/opinion mining, published in last five years.
A Systematic literature review analyzes the gap between different researches, spanning within a particular time period as explained by [20].Research Protocol defines the structure in which different steps are specified which have to be followed in a particular sequence.For the selection of most relevant research articles with high quality measures, a detailed procedure is adopted in this study along with some specific structure and boundary lines as explained by [21] and [22].Guidelines for this Systematic Literature Review are also taken from latest review papers in software engineering domain such as [23], [24], [25].
Research protocol/methodology of this study consists of following steps (Fig. 1

B. Query String and Search Space
Query String is the combination of selected keywords used to extract the research articles from concerned libraries.
Keywords extracted from research questions are given below: Sentiment, Polarity, Opinion, Analysis, Extraction, Detection, Mining, Support Vector Machine, SVM The following search query is finalized with the above key words.
(("Sentiment" OR "Polarity" OR "Opinion") AND ("Analysis" OR "Extraction" OR "Detection" OR "Mining") AND ("Support Vector Machine" OR "SVM")) Two well-known search libraries are selected for the extraction of literature: ACM and IEEE.Both of these libraries have different characteristics and options to search the material.Therefore, slight adjustments are made in query string to obtain more relevant and appropriate literature.The Query had to be searched for multiple times with different combinations of selected keywords.Results of search query along with some significant parameters can be seen in Table I.

C. Selection Criteria
In this section, most relevant literature is selected with the particular selection criteria.The selection criteria further consists of IC (inclusion criteria) and EC (exclusion criteria).

1) Inclusion Criteria (IC)
Inclusion criteria is formed with the following rules:

2) Exclusion Criteria (EC)
Exclusion criteria is formed with the following rules: EC1: Papers which are not in English.

D. Quality Assessment
Quality assessment parameters must be followed in order to provide effective results.Following parameters are considered for this SLR to maintain the quality.
 Top rated scientific libraries are selected to find the relevant research material.
 Most recent research articles were selected to ascertain the best quality  Selection process is un-biased.
 All the steps of SLR (as discussed above) are followed in the true sense.

E. Data Extraction and Syntesis
After applying the search process (Fig. 2), 8 most relevant research articles were short listed as provided in Table II where CP stands for Conference Paper.

A. A Feature Based Approach for Sentiment Analysis by Using Support Vector Machine [Result Needed]
The authors in [26] developed a process for feature based sentiment analysis using Support Vector Machine (SVM).In the proposed model the dataset has to pass through five different phases before the conclusion of final result.First of all, the Sentence level classification is performed.Only reviews which have sentimental meaning are to be stored such as positive, negative or neutral.Questions or comments which aren't actually reviews will be filtered out using POS tagging as keeping them would led to an unnecessary extension of the vocabulary dictionary and unwanted scoring.After sentence level sentiment classification, extraction of aspects is performed which is most important and challenging task.POS tagging is used to extract words with tags like NNS (noun plural), NN (Noun), NNP (Proper noun singular), etc.In the next phase, the opinion words for aspects are extracted using the Stanford parser [34].After that the dataset are labeled using SentiWordNet [35].And finally for the opinion regarding whole product, SVM classifier was applied on the labeled dataset.SVM plots vectors in a 3D virtual space and distinctively allocates testing data's points to particular group which it belongs to, e.g.positive, negative, neutral or whatever the predefined groups are.The dataset considered for this research was taken from user reviews about laptops which were from a variety of companies like HP, Apple, Dell, Lenovo, etc.

B. Modeling Sentiment Terminologies: Target Based Polarity Phenomena
In [27], the researchers presented a subject sensitive sentiment analysis approach, which includes the context of tweets.According to authors the text cleansing techniques for input data before classification process can improve the results.Text cleansing includes normalization and vector representation of input data.They have pointed out that the subject aware classification brings the better results as compare to subject un-aware classification.The results can be further improved, if uni-gram approach is used instead of bi-gram or n-gram approach.A twitter dataset about word "Obama" was selected first.Features from tweets of selected dataset were extracted through Alchemy API, Tweet NLP and NTLK.From dataset, 30% of the data was used for training purpose and the rest of 70% as the test data.The collected tweets were scanned for feature extraction and then the features were stored in a separate dictionary -Keyword_Bundle -in conjunction with their specific topics to retain the target and context of the tweets.This technique further helped for the development of input matrix for SVM to classify the tweet with improved accuracy.Then two more datasets were selected "Movie Review", and "Apple" to have a comparative analysis.85.00 %, 84.00%, and 88.00% accuracies were achieved of "Obama", "Movie Review" and "Apple" datasets respectively, making a cumulative accuracy of 85.60%.

C. Multi-Aspect and Multi-Class Based Document Sentiment Analysis of Educational Data Catering Accreditation Process
In [28], authors presented an approach that classified the documents into multiple categories by keeping in view the multiple aspects.The existing problems of document level sentiment analysis such as entity identification, subjectivity detection and negation were also taken into consideration in this study.The proposed framework was used for educational data mining.The faculty performance was evaluated using the www.ijacsa.thesai.orgcomments provided by the students as feedback.The dataset contained 5000 comments about the faculty.The objective reviews which had no polarity inclination were filtered out, such as social comments, replies and questions.Java string tokenizer was used to divide the reviews into two token groups.After this, stopwords removal algorithms were used to remove special characters and some pronouns which would hold no significant value in the actual classification.They used TF-IDF to represent the acquired data in a numerical form, which is further used by the classifiers.Two machine learning classifiers, i.e.Naïve Bayes and Support Vector Machine were applied on the pre-processed dataset.81.00% and 72.80% accuracy were achieved by the SVM and Naïve Bayes, respectively for aspect based document level sentiment analysis.

D. Tweeple's Microblogs on Illegal Immigration in USA
Authors in [29] presented a process for opinion mining of tweeps (People who use Twitter).The topic that was specifically chosen in comparison with some other political topics was "Illegal Immigration" as it has been under discussion for decades in the US.The dataset used in this research was collected after the US Republican Presidential election debate on Oct 28, 2015.Three major categories of the topic were selected i.e. "reform/give citizenship to illegal immigrants", "deport all immigrants" or "deport only the criminal illegal immigrants".Binary classification of first two and multinomial classification of all three categories was done using the Random Forest, Multinomial Naïve Bayes, Linear SVM and Logistic Regression classifiers.The results obtained for all the four classifiers were promising with 82% of overall average.Linear SVM and ensemble based approach using Random Forest classifiers depicted optimal results and accuracy with the mean score of 90% and 84%, respectively for binomial and multinomial classification, for individual classes with lower error rate.

E. A Proposal of a Method to Automatically Estimate Evaluations of Various topic of Traverler's Reviews
Authors of [30] conducted a study to evaluate the performance of SVM.For performance evaluation, results of SVM were compared with the dependency search tree results.For the SVM based estimation, one-against-one method was used and this parameter was selected by Scikitlearn3, which is a programming package of python.The ease with which SVM can be extended to multiclass classification by one-against-one method played a key role in the performance evaluation.In addition, SVM's RBF kernel was used.The Dependency Tree Search was used for the comparison because the researchers expected that linguistic dependency relationship would prove to be useful for obtaining evaluation data from texts as evaluation words often appear after the evaluated object.In order to obtain evaluation data from scriptures, evaluationattribute dictionary was used.Three Polarities were defined: positive, negative and neutral.1000 Reviews from TripAdvisor of 2014 were used as the dataset for the experimental evaluation.Three different experiments were performed A, B, and C with different number of valued scores, 5, 3, 3 respectively.These three targeted different procedures.A was used to determine the basic results of machine learning.B was basically used to assess the estimation in laxer score.C was an estimation of individual topics using the dependency tree search so the results of machine learning and methods used by the dependency tree could be compared.This architecture is incapable of designing completely foolproof feature vectors.The researchers have suggested future work to focus on automatic estimation by machine learning.

F. Sentiment Analysis of Textual Reviews
In [31], the researchers presented an experimental study for performance evaluation of different approaches for documentlevel sentiment classification of movie reviews.The approaches included two supervised machine learning based classifiers: Support Vector Machine and Naïve Bayes, one unsupervised technique: Semantic Orientation Approach (SO-PMI-IR Algorithm) and one lexical driven approach: SentiWordNet.For Naïve Bayes, the multinomial version of NB was implement using Java with Eclipse IDE and the labeled dataset was fed as k-folds where k was chosen to 3, 5 and 10.For SVM algorithm the dataset was converted to vector space representation using TF-IDF, afterwards same k-fold scheme was used.The Unsupervised SO-PMI-IR algorithm was implemented using Java in accordance with a POS tagger.Firstly, POS tagging was applied on the data and then feature extraction was performed for each review.The SentiWordNet approach was implemented after performing POS tagging and feature extraction.In this approach, the researchers not only used the SentiWord's lexical dictionary but rather used an enhanced procedure to increase the result of classification to a greater degree of accuracy.This was accomplished by scheming out an adjective and adverb correlation in essence with SentiWord's predefined dictionary.In this method SentiWord's scoring and Adjective Priority Scoring (APS) were assigned different weighting and the combined score of the composites was used to compute final results.35% weightage was given to APS and the rest 65% to SentiWord's scoring.Two existent datasets of "movie reviews" were used along with one created individually for sentiment classification with different amendments in the procedures.Accuracy didn't fall out of the range of 65%-68% for SentiWordNet but SO-PMI-IR method went up to an accuracy of 89.00% but the only drawback is that a lot of PMI values have to be computed.On the other hand Naïve Bayes performed better than SVM.

G. Utilizing Hashtags for Sentiment Analysis of Tweets in The Political Domain
Authors in [32] presented a novel target-oriented hybrid sentiment analysis system.It consisted of three major modules: preprocessing module, lexicon-based sentiment feature generator module and finally Machine learning module.The pre-processing module performed the optimization process and normalized the data.Sentiment Feature Generation Module started with replacing slangs with English words holding the same meaning using a slang dictionary and then tagging all the words in the dataset either by score or type.A total of 14 feature types were selected by the researchers using this module.After the feature selection phase, the data was forwarded to the machine learning classifier, which was a linear SVM.The dataset used in the evaluation was based on the occurrences of the word "iPhone".It consisted of 940 tweets which were labeled by a group of 22 human annotators.470 tweets had a positive polarity whereas 470 tweets had negative polarity.The www.ijacsa.thesai.orgproposed hybrid model achieved an overall accuracy of 89.13% outperforming the SVM's Baseline accuracy of 86.70%.The researchers concluded that use of sentiment features instead of conventional text processing features can bring the better results.

H. A Boosted SVM based Sentiment Analysis Approach for Online Opinionated Text
Authors in [33] proposed a hybrid sentiment classification model.For evaluation purpose two different datasets were used.A "movie reviews" dataset which was acquired from imdb.com in 2004 and a "hotel review" dataset which was acquired from tripadvisor.comand yatra.com.The authors came up with hybrid architectures like Adaptive Boosting (AdaBoost)) or bagging combined with SVM.This research proposed to use bagging technique to construct the SVM ensemble.In bagging, several SVMs are trained independently via a bootstrap method and then they are aggregated to formulate a strong classifier via an appropriate combination technique.The vector space model (VSM) was utilized in order to generate the bag of words representation for each document.The text documents were pre-processed with basic natural language processing techniques like word tokenization, stop word removal and stemming.The residual tokens were arranged as per their frequencies or occurrences in whole dataset.The average accuracy achieved from both the datasets went up to 93.00%.The study goes on to conclude that SVMs usually suffer from biased decision boundaries (in case of the hyper plane), and their prediction performance drops significantly when the data is highly skewed.The authors concluded that the obtained results are considerably better when multiple technologies are used in correlation instead of using SVM alone.

V. RESULTS AND DISCUSSIONS
Finally, 08 research papers are selected by using systematic framework explained in Section II.These papers have been discussed in detail in Section IV of this research.Following answers are obtained against the identified Research Questions (RQs) while having an in-depth exploration and analysis of the selected papers.

RQ1: Which are the latest research trends in the domain of sentiment analysis?
As per systematic research process, 8 most relevant papers have shown the latest trends in the domain of sentiment analysis.The latest trends included the proposal of new techniques for polarity detection and sentiment analysis, customization of already proposed techniques and introducing the novel ideas to use the hybrid techniques more effectively.Moreover, one of the most important latest trends covered by our shortlisted papers is to target the new domain or area from where significant knowledge can be extracted by using classification techniques.RQ2: Which machine learning/lexicon/hybrid technique is considered for comparison with SVM?All selected papers [26]- [33] have used one or more techniques in comparison with SVM.The purpose of comparative analysis is to identify the difference between accuracy of that technique and the accuracy of SVM.The algorithms or techniques which are used in comparison with SVM include supervised machine learning, unsupervised machine learning, lexicon, and the hybrid of supervised and lexicon.
RQ3: Which areas of sentiment analysis are considered for investigation by the researchers?
The selected papers discussed sentence level sentiment analysis as well as document level sentiment analysis.For this purpose, different techniques are used including machine learning, lexicon based and hybrid.However, the focal point of investigation was the performance evaluation and comparative analysis to identify the best technique for sentiment analysis.

RQ4: Which factors affect the classification results?
All the selected papers have investigated the performance of their proposed techniques in terms of accuracy.To check the performance of any classification technique the output result has to be compared with pre classified or pre labeled dataset.It has been seen by analyzing the selected papers that accuracy of results may depend upon the following: the steps and techniques of preprocessing phase, the selection of input dataset along with its subject and ratio of training data & test data (in case of supervised classifier).Moreover, some of researches have claimed that the use of multiple techniques can bring more accurate results instead of using single technique.
RQ5: Which type of data sets are used for performance evaluation?
The selected papers have used the following as input dataset: tweets on different topics, user reviews about product or services and student comments about faculty.It also has been noted from the selected papers that the performance of sentiment classification techniques depends upon the selected dataset as well as the preprocessing techniques.

Limitations of Research:
Following are the limitations of this research: 1) Although all the published literature was obtained through a rigorous and thorough research process that depicts the completeness of this study however there may be still possibilities of missing some important relevant work.
2) The enhanced and optimized algorithms were mostly evaluated by the researchers themselves; therefore, the actual results might not be as accurate as claimed.This may affect the interpretation of this research.

VI. CONCLUSION AND FUTURE WORK
Sentiment Analysis is considered as one of the hot research topics in the domain of knowledge discovery.Large amount of online data is being added on daily basis ranging from social media posts and comments to movie and software reviews.By using sentiment analysis techniques, these data sources can be used to fetch the useful information such as: prediction of election results, getting user's feedback about any software, analyzing the market reputation of particular brand and obtaining public opinion before launching a new product etc.Multiple approaches are available for sentiment analysis such www.ijacsa.thesai.orgas lexicon based, machine learning based and the hybrid of both.SVM is one of the widely used machine learning techniques for the detection of polarity from text.Now days, along with conventional machine learning classification techniques, many customized and integrated models have been proposed by researchers for sentiment analysis and polarity detection.This study has provided a compact and comprehensive review of latest research by focusing on SVM technique of sentiment analysis.This study has followed a systematic framework for review and provided the answers of identified research questions after critical review of selected papers.For future work it is suggested to perform a comparative analysis of the customized techniques with same dataset.


): Identification of research questions  Selection of keywords for query string  Identification of search space  Outlining the selection criteria  Extraction of literature with selection criteria  Quality assessment of extracted literature  Data extraction and synthesis  Presentation of results www.ijacsa.thesai.org

Fig. 1 .
Fig. 1.Steps of SLR. A. Research Questions Research questions reflect the objectives of SLR and during the critical review of most relevant extracted articles; those questions have to be answered.Research questions of this SLR are given below.RQ1: Which are the latest research trends in the domain of sentiment analysis?RQ2: Which machine learning/lexicon/hybrid technique is considered for comparison with SVM? RQ3: Which areas of sentiment analysis are considered for investigation by the researchers?RQ4: Which factors affect the classification results?RQ5: Which type of dataset is used for performance evaluation?

IC1:
Papers published from year 2012 till 2017.IC2: Papers that used Support Vector Machine for Sentiment Analysis.IC3: Papers that used Hybrid Model for sentiment analysis, which includes Support Vector Machine.IC5: Papers that used other machine learning algorithms in comparison with Support Vector Machine.IC5: Papers that used other lexical/Hybrid techniques in comparison with Support Vector Machine.

TABLE I .
SEARCH SPACE Papers published before 2012 or after 2017.Papers that do not contain any results.Papers that used Hybrid Model, which does not include Support Vector Machine.Only those papers are shortlisted which are more relevant to the research questions.After applying IC and EC, 92 most relevant studies are found.All the remaining studies were excluded as defined in EC.
EC3: Papers which did not use Support Vector Machine.EC4: Papers that do not target sentiment/opinion/polarity analysis of textual data.EC6:EC7: