Survey of Tools and Techniques for Sentiment Analysis of Social Networking Data

Social media has expanded rapidly over time and generated a huge repository of content. Sentiment analysis of this data has vast scope in decision support and has attracted many researchers to explore possibilities for technique enhancement and accuracy improvement. Twitter is one of the social media platforms most widely explored in the area of sentiment analysis. This paper presents a systematic survey of sentiment analysis for social networking sites and focuses mainly on Twitter sentiment analysis. The paper explores and identifies the techniques and tools used, in a well-structured approach, to find the research gaps and identify future scope in this area of research. The techniques have evolved over time to improve the efficiency of classification. A total of 55 research papers are included in this survey. The results show that Twitter is the most explored social networking site for opinion mining. Naïve Bayes and SVM are the machine learning algorithms implemented in the largest number of studies. Among the latest advancements, stack-based ensemble, fuzzy-based and neural-network-based classifiers have also been implemented to enhance the efficiency of classification. WEKA, RStudio and Python are the tools most used by research scholars for implementation. Over time, the research has changed in terms of technologies, tools, social media platforms and the data corpora targeted.


I. INTRODUCTION
Information spreads faster on social networking media such as Facebook, Twitter, Instagram, Reddit and news forums than on traditional media platforms. Social media have become a rich resource of information for companies and research scholars, which can be analyzed to obtain valuable information by using NLP (Natural Language Processing) and artificial intelligence techniques. The huge repository of information on social media platforms is raw and unprocessed, and over time technologies have evolved to process the data and extract valuable information from it. This information can be analyzed to support decision making and effective policy making in areas related to business, politics, entertainment, medicine and social uplift. Sentiment analysis of social media posts deals with finding the opinion, sentiment or feelings expressed in these posts. Sentiment can be expressed at different levels and is mostly categorized as positive or negative. Several sentiment analysis and classification techniques, such as dictionary-based, machine learning, ensemble-based, neural-network-based, fuzzy-based and hybrid techniques, have evolved since research in the area began. The size of the targeted data has also increased, and new tools have emerged for easy and effective evaluation of sentiment. Research scholars have been working in this area for more than a decade, and the research has gone through multiple phases as technology and the efficiency of outcomes have improved.
In the present survey we conducted a systematic literature review and studied 55 research papers in the area, published from 2009 to 2021. These 55 papers were selected after careful examination against the inclusion and exclusion criteria. We focus on Twitter sentiment analysis and present the existing techniques and the scope for enhancement. An abundance of research literature exists in the field; we aim to find the relevant literature with respect to the novelty of the research, its application domain and its effectiveness.
Section I of the present survey paper introduces sentiment analysis for social networking sites. The research strategy used in the survey is described in Section II. The research questions on which the survey is designed are given in Section III. Section IV gives the details of the related literature included in the survey. The survey outcomes for all 55 research papers are presented in Section V. The survey is concluded in Section VI.

II. RESEARCH STRATEGY DESIGN
The survey related to 'Social networking sites sentiment analysis' was undertaken systematically by following the steps shown in Fig. 1. In the first step, research questions are designed to give a proper direction to the survey. We continue by retrieving the related literature and then selecting from it the pertinent research papers that fulfill the requirements of the research questions. Finally, the findings and results as stated by the authors are analyzed and reported along with the tools and technologies used in the research. Table I gives the summarization and key findings of the survey. The results, technologies, findings and tools used in the research are also elaborated below.
Matthew et al. [1] implemented Bagging, Boosting and the Random Subspace method using KNN, C4.5, SVM, MLP, RBF and LR as base classifiers. The WEKA tool is used to implement the different classifiers. Enhanced performance is obtained, with a maximum accuracy of 90% over other approaches. Ensemble-based classifiers performed better in all cases, particularly for noisy data, enhancing the overall accuracy of classification.
Kumar et al. [2] presented an article about the evolution of online social networks. The article investigates the dynamics of social cognitive theory and social networking. The research used a photo sharing application (Flickr) and the Yahoo 360 social network for analysis and implementation. Three different segments of networks are identified, viz. singletons, isolated communities and giant component networks, and the evolution and structure of the three segments are described in detail. The economic behavior of online social media networks is analyzed and the impact of user activity on incentives is examined.
A. Agarwal et al. [3] proposed Twitter sentiment analysis using POS-specific polarity features and explored tree kernels to avoid the need for tedious feature engineering. A set of 11,875 manually labeled tweets, publicly available from commercial resources, was used in the implementation. The combination of the new features with previously proposed features, and the tree kernels, outperform the baseline classifiers.
Bae Y. et al. [4]  Lima et al. [5] in their research implemented Naïve Bayes algorithm for tweets sentiment classification as positive and negative on real time tweets downloaded by using Twitter4J Library. Tweets are classified on the basis of emoticons, sentiment based words or a hybrid of both. The results show an enhanced accuracy in case of hybrid approach.
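The Naïve Bayes classification described above can be illustrated with a minimal sketch. It is not the implementation from [5]: the toy tweets, the whitespace tokenizer (chosen so that emoticons such as ":)" stay single tokens) and the Laplace smoothing are assumptions made only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace split keeps emoticons such as ":)" as single tokens.
    return text.lower().split()


def train_naive_bayes(labeled_tweets):
    """Collect class counts, per-class token counts and the vocabulary."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for token in tokenize(text):
            token_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, token_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior for the text."""
    class_counts, token_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # tokens never seen in training are ignored
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (token_counts[label][token] + 1) / (
                total_tokens + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data mixing sentiment words and emoticons.
TOY_TWEETS = [
    ("love this phone :)", "positive"),
    ("great service today :)", "positive"),
    ("hate the delay :(", "negative"),
    ("terrible support :(", "negative"),
]
model = train_naive_bayes(TOY_TWEETS)
```

Because words and emoticons share one vocabulary, this sketch corresponds loosely to the hybrid setting; restricting the tokens to emoticons only or to words only would give the other two variants.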
F. Neri et al. [6] implemented sentiment analysis on Facebook posts about the Rai1 and La7 news programs. The Facebook posts are analyzed using 'iSyn Semantic Center'. A Bayesian method and the K-Means algorithm are used as the supervised and unsupervised classification techniques, respectively. The research shows the importance of Facebook for online marketing.
H. Kang et al. [7] proposed an enhanced Naïve Bayes classification algorithm for sentiment classification of review documents of restaurants. The 70000 review documents are obtained from restaurant sites including star information. The proposed model shows the enhancement of accuracy and precision.
M. Ghiassi et al. [8] developed a new lexicon specifically for Twitter opinion mining using n-gram feature vector and supervised learning method. The 3440 tweets are manually collected and labeled on 'Justin Bieber' twitter account and the model proposed in research is tested using these tweets. The results show the improvement in accuracy for the proposed model over SVM with an accuracy of 95.1%.
Hassan et al. [9] implemented the Bootstrap ensemble framework (BPEF). It works in two stages: expansion and contraction. In the expansion stage, a large number of models are generated based on the dataset, features and classifier parameters. In the contraction stage, a subset of these models is selected by discarding redundant and less useful models. The experimental results show that BPEF gives a higher recall than other methods. The SIMS module of BPEF extracted a model with higher performance.
E. Haddia et al. [10] show the role of preprocessing in sentiment analysis. Improvement is observed in the accuracies of the TF-IDF matrix from 78.33 to 81.5, of the FF matrix from 76.33 to 83 and of the FP matrix from 82.33 to 83.
Patil et al. [11] in their research implemented SVM with and without feature extraction and show that SVM eliminated the need for feature selection due to the ability to generalize high dimension feature space.
Inoshika et al. [12] studied feature ranking and selection techniques for Twitter opinion mining and suggested removing unrelated words from the feature space to reduce dimensionality, which in turn reduces the sparseness of the feature set. The research also proposed a new feature selection technique based on information theory, named the Ratio Method.
Bac Le et al. [13] proposed a model based on NB and SVM. Information Gain, Bigram, Object-oriented extraction methods are used for feature ranking and selection to select more appropriate features. As per reported results, the proposed model is highly efficient with high accuracy for predicting feelings.
B. S. Dattu [14] implemented Twitter sentiment analysis using SVM and Naïve Bayes on real-time tweets downloaded between 12 September 2010 and 24 January 2011 with the keywords 'NFL teams'. The research points out that SVM proved to be better than the Naïve Bayes algorithm for text classification and categorization. For unbalanced data, Naïve Bayes is more appropriate, as its results vary less.
O. Kolchyna et al. [15] implemented two techniques, viz. lexicon-based and machine learning, for sentiment classification of Twitter messages. The research uses the sentiment score produced by the lexicon classifier as an additional feature in the feature vector, and the results show an improvement in accuracy for imbalanced data sets. The research shows that extending sentiment lexicons with abbreviations, emoticons and social media slang enhances the efficiency of the lexicon-based classifier. Feature generation and selection also play a vital role in enhancing classification accuracy. The standard Twitter data set of the SemEval-2013 competition, task 2-B, is used, and the outcome of the research shows that the SVM and NB machine learning methods perform better. A combination of the lexicon and machine learning methods further enhances the accuracy by 7 percent.
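As a rough illustration of using a lexicon score as an additional feature, the sketch below appends a summed lexicon score to a bag-of-words vector. The lexicon entries, their weights and the function names are hypothetical and are not taken from [15].

```python
# Hypothetical mini-lexicon mixing words, emoticons and slang.
LEXICON = {
    "good": 1, "great": 2, "bad": -1, "awful": -2,
    ":)": 1, ":(": -1, "lol": 1,
}


def lexicon_score(tokens):
    """Sum the lexicon polarity of every token; unknown tokens count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokens)


def feature_vector(tweet, vocabulary):
    """Bag-of-words counts over `vocabulary`, plus the lexicon score as the last feature."""
    tokens = tweet.lower().split()
    counts = [tokens.count(word) for word in vocabulary]
    return counts + [lexicon_score(tokens)]


# Two bag-of-words features followed by the lexicon score (great=2, :)=1).
example = feature_vector("great flight :)", ["flight", "delay"])
```

The resulting vector can be passed to any machine learning classifier, so the lexicon signal remains available even when the individual sentiment words are rare in the training data.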
Prusa et al. [16] worked on bagging-based and boosting-based ensemble classifiers, the two most widely used ensemble techniques in machine learning. In the research, both techniques are tested with seven diverse base learners. All the ensemble classifiers built are compared with the seven base learners to observe the performance enhancement. In total, 21 learning algorithms are trained and tested on two different datasets: one large, automatically labeled dataset of lower quality and one small, manually labeled dataset of higher quality. The ensemble classifiers enhanced the accuracy regardless of the quality of the data set used.
K. L. Devi et al. [17] compared ensemble classifiers viz. Boosting and Bagging with the machine learning classifiers like NB, SVM and maximum entropy classifier. Feature selection is performed by using MI and Chi-square methods and is proved to be better than previously used methods. SemEval 2013, Task 9 data sets are used for implementation.
Y. Wan et al. [18] implemented a majority-voting-based ensemble classification model over several classification techniques, including SVM, Random Forest, Naïve Bayes, Bayesian Network and C4.5 Decision Tree, using 10-fold cross-validation on a Twitter data set of 12864 tweets related to airline services.
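Majority voting itself is simple to sketch. In the example below the three base "classifiers" are hypothetical keyword rules standing in for trained models such as SVM or Naïve Bayes; only the voting step reflects the technique named above.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties go to the label encountered first."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical stand-ins for trained base classifiers.
def rule_positive_words(tweet):
    tokens = set(tweet.lower().split())
    return "positive" if tokens & {"good", "great", "love"} else "negative"


def rule_negative_words(tweet):
    tokens = set(tweet.lower().split())
    return "negative" if tokens & {"bad", "late", "lost"} else "positive"


def rule_emoticon(tweet):
    return "positive" if ":)" in tweet.split() else "negative"


BASE_CLASSIFIERS = [rule_positive_words, rule_negative_words, rule_emoticon]


def ensemble_predict(tweet, classifiers=BASE_CLASSIFIERS):
    """Collect one prediction per base classifier and return the majority label."""
    return majority_vote([classify(tweet) for classify in classifiers])
```

With an odd number of base classifiers and two classes, ties cannot occur; with an even number, the tie rule in `majority_vote` decides and should be chosen deliberately.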
Prusa J. et al. [19] studied 10 different feature selection methods and four classifiers, viz. 5-NN, C4.5, LR and MLP. All classifiers are implemented with all feature ranking and selection techniques on 10 feature subsets of different sizes, up to a maximum of 200 features. The results show that the filter-based feature selection method Chi-Squared (CS) improved classification performance for small feature sets. After comparing all the combinations of classifiers and feature rankers, it was observed that LR performed best with 150 features selected by the KS ranker. The models performed better with a larger number of features, and the best models have 75 or more features. Only the feature rankers MI, ROC, PRC, CS and KS show enhanced efficiency compared with no feature selection.
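A filter-based ranker such as Chi-Squared scores each feature independently of any classifier. The sketch below computes the standard 2x2 chi-squared statistic for term presence against one class and keeps the top k terms; the toy documents and function names are assumptions for illustration and do not reproduce the experimental setup of [19].

```python
def chi_squared(term, docs, target_label):
    """Chi-squared statistic of term presence versus membership in target_label."""
    a = b = c = d = 0
    for tokens, label in docs:
        present = term in tokens
        in_class = label == target_label
        if present and in_class:
            a += 1  # term present, document in class
        elif present:
            b += 1  # term present, document not in class
        elif in_class:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document not in class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # term or class is constant, so it carries no information
    return n * (a * d - b * c) ** 2 / denominator


def rank_features(docs, target_label, k):
    """Return the k terms with the highest score (ties broken alphabetically)."""
    vocabulary = sorted({term for tokens, _ in docs for term in tokens})
    scored = [(chi_squared(term, docs, target_label), term) for term in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy corpus: each document is a set of tokens plus a label.
DOCS = [
    ({"good", "flight"}, "positive"),
    ({"good", "crew"}, "positive"),
    ({"bad", "flight"}, "negative"),
    ({"bad", "crew"}, "negative"),
]
```

In this toy corpus "good" and "bad" separate the classes perfectly, while "flight" and "crew" appear equally in both classes and score zero.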
R. Mansour et al. [20] in their research used multiple sets of features for sentiment classification by using an ensemble classifier. The classification complexity comes out linear with the increase in feature set size. The ensemble is implemented on two features sets; one optimal set with 20000 features and other NRC data set with 4 million features. The feature set with selected 20000 features has shown relative 9.9% and 11.9% performance gain over 4 million feature set.
O. Abdelwahab et al. [21] demonstrated the effect of training set size on the accuracy of SVM and NB classifiers. The Python NLTK library is used to implement the classifiers. The results show only a small increase in accuracy when the training data increases from 20 to 90 percent, so a moderately sized data set can be used for training to get acceptable results.
S. Akter et al. [22] predicted the sentiment for the Facebook posts using a lexicon-based sentiment analysis technique. The data set used in the implementation is FOODBANK the Facebook group in Bangladesh. In the research a console is developed using C# and Graph API is used to collect data.
M. Bouazizi et al. [23] proposed a new model for detecting the sarcasm using sentiment analysis of twitter data as micro blogging social networking sites are very useful in detecting sarcastic statements. A pattern-based sarcasm detection approach is used for twitter. A feature set with four relevant features for identifying different kinds of sarcasm is used and tweets are classified as non-sarcastic and sarcastic two classes. The model achieved an accuracy of 83.1% with 91.1% precision. SVM classifier is used and WEKA tool is used in implementation.
Grandin and Adan et al. [24] proposed a model, Piegas, for the sentiment of Portugal tweets. The Naïve Bayes classifier is implemented, and JavaScript and Ruby on Rails are used for the development of the system. The main requirement of the model is a system with good usability and high precision.
Nádia et al. [25] proposed a new semi-supervised approach to address the cost of obtaining supervised data for machine learning. Unsupervised information retrieved from the similarity matrix created from unlabeled data is used with various classifiers in place of classified data. The similarity matrix can be used as a powerful knowledge extraction tool to get information from non-labeled data. The results of the proposed framework show improved accuracy for Twitter sentiment classification by using unlabeled data.
224 | Page www.ijacsa.thesai.org
A. Tripathy [26] implemented four machine learning classification algorithms, SVM, NB, Maximum Entropy (ME) and Stochastic Gradient Descent (SGD), on the IMDb data set for sentiment classification. These classification models are implemented on unigram, bigram and n-gram features, and it is observed that when n is increased beyond 2, accuracy decreases rather than increases. Accuracy is good for unigrams and bigrams but decreases for trigrams, four-grams and five-grams. Also, the use of the count vectorizer technique and TF-IDF for converting the text into a matrix of weights enhances the accuracy of classification.
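To make the n-gram and TF-IDF representation concrete, the following sketch builds word n-grams and weights them with one common TF-IDF variant (relative term frequency multiplied by log(N/df)). Library implementations use slightly different formulas, and the function names here are hypothetical; this is an illustration under those assumptions rather than the setup used in [26].

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of word n-grams, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, n=1):
    """Return one {ngram: weight} dictionary per document."""
    grams_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in documents]
    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(grams.keys())
    total_docs = len(documents)
    weights = []
    for grams in grams_per_doc:
        length = sum(grams.values())
        weights.append({
            gram: (count / length) * math.log(total_docs / doc_freq[gram])
            for gram, count in grams.items()
        })
    return weights
```

An n-gram that occurs in every document receives weight zero under this variant. As n grows, most n-grams occur in only one document, so the feature space becomes larger and sparser, which is consistent with the accuracy drop reported for n greater than 2.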
A. Krouska et al. [27] in their research show the effect of preprocessing on the classification accuracy. The research also shows a major enhancement in result when IG is used for attribute selection. The research uses Unigram, Bigram and 1-3 gram feature vector. Preprocessing and feature selection enhance the accuracy. Unigram and 1-3 gram performed best among all.
K. Ali et al. [28] proposed the SAaaS (Sentiment Analysis as a Service) framework to abstract sentiments from various social media information services. Social-media-based public health surveillance is performed by using the spatial attributes of social media users to find the location of a disease outbreak. A new quality model is introduced to remove noise from the social media content. Real-world datasets from Twitter, Instagram, Reddit and news forums are used in the research, together with the Sentistrength and Alchemy API tools. Sentistrength is used for the analysis of short and informal text, and Alchemy API for long and formal text.
A. U. Hassan et al. [29] presented a method to detect the depression level of a person by extracting emotions from social media text, applying NLP and machine learning techniques to a Twitter dataset and the 20 newsgroups dataset.
M. R. Huq et al. [30] used two techniques for sentiment analysis: First technique is a sentiment classification algorithm (SCA) based on KNN and the second is based on SVM. The research shows the comparative analysis of both the methods on the basis of recall, precision, accuracy, F-Score, TPR and FPR for 1000 tweets.
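The evaluation measures named above all derive from the four cells of a binary confusion matrix. The sketch below computes them from true and predicted labels; the function name and the convention of returning 0.0 for an undefined ratio are assumptions.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall (TPR), FPR and F-score for one class."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive and truth == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Convention assumed here: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # recall is the true positive rate (TPR)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "tpr": recall,
        "fpr": ratio(fp, fp + tn),
        "f_score": ratio(2 * precision * recall, precision + recall),
    }
```

For example, with 2 true positives, 2 false negatives, 1 false positive and 3 true negatives, precision is 2/3, recall is 1/2, FPR is 1/4 and the F-score is 4/7.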
M. Ahmad [31] implemented the Support Vector Machine (SVM) on two pre-classified Twitter data sets for textual polarity detection. Recall, Precision and F-Measure are used for comparative analysis. The results show that the performance of SVM depends on the dataset itself. Which classification algorithm suits which kind of data set, and why, therefore remains an open area of research.
J. Brandon et al. [32] implemented twitter sentiment analysis to find the opinion of people for candidates in 2016 US Presidential elections. Lexicon based classifier and NB Machine learning classifiers are used on two data sets. One data set is a manually labeled Twitter data set and the other is an automatically labeled data set based on Hashtags and topic. A high correlation of 94 percent was found with polling data by using a moving average smoothing technique.
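The comparison of a smoothed sentiment series with polling data can be sketched as follows. The trailing window, the handling of the first few points and the use of the Pearson coefficient are assumptions for illustration; the source does not specify which moving average or correlation measure was used in [32].

```python
import math


def moving_average(values, window):
    """Trailing moving average; early points average over the values available so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)
```

A typical use would be `pearson(moving_average(daily_sentiment, 7), poll_share)`, where `daily_sentiment` and `poll_share` are hypothetical equal-length lists of daily values.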
R. Wijayanti et al. [33] proposed an ensemble classifier based on a voting-based technique and used SVM, NB, LR (Logistic Regression) and Decision Tree classification algorithms for the implementation of proposed ensemble classifier. They used various feature representation techniques such as TF-IDF, sentiment lexicon score and term presence in their research. Ensemble classification results are proved to be better than individual machine learning classifiers, but ensemble accuracy highly depends on the selection of single classifiers used for creating the ensemble classifier.
Z. Jianqiang et al. [34] examined the effect of six preprocessing techniques using four classification algorithms (NB, SVM, RF, LR) and two feature selection methods. The results show that accuracy improves after the preprocessing techniques are applied to the dataset. However, removal of URLs, numbers and stop words hardly affects the accuracy, so they can be removed. Random deletion of words reduces the accuracy, as a deleted word might be important for sentiment detection. NB and RF are more sensitive to the choice of preprocessing techniques.
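Three of the preprocessing steps mentioned above (removal of URLs, numbers and stop words) can be sketched with the standard library. The regular expressions and the very small stop word list are assumptions for illustration, not the resources used in [34].

```python
import re

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "in", "on", "for"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NUMBER_PATTERN = re.compile(r"\b\d+\b")


def preprocess(tweet, remove_urls=True, remove_numbers=True, remove_stop_words=True):
    """Lowercase a tweet and optionally strip URLs, standalone numbers and stop words."""
    text = tweet.lower()
    if remove_urls:
        text = URL_PATTERN.sub(" ", text)
    if remove_numbers:
        text = NUMBER_PATTERN.sub(" ", text)
    tokens = text.split()
    if remove_stop_words:
        tokens = [token for token in tokens if token not in STOP_WORDS]
    return tokens
```

Each step has its own flag so that the effect of a single technique on downstream accuracy can be measured by switching it on and off while keeping the classifier fixed.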
Rahman et al. [35] analyzed reliable decision making for accepting friend requests in Online Social Networks. A quantitative study of friend requests was carried out, and the research explains the information exposed on social media websites and how the information of users and their friends can be misused when friend requests are accepted without establishing trust. The research proposes a method for reliable friend request acceptance in Online Social Networks based on finding out more details about the person who sent the friend request.
Jianqiang et al. [36] proposed a method for opinion mining using deep convolution neural networks. Unsupervised learning is used for obtaining word embeddings by using a large set of Twitter data. The n-grams features combined with the word embeddings and polarity score extracted from sentiment lexicon are used for Twitter sentiment analysis. Sentiment classification labels were predicted after training the feature set with deep convolution network. GloVe-DCNN on the STSTd dataset performed best with accuracy 87.62%.
K. Tago et al. [37] analyzed Twitter data in terms of user relationships and emotional behaviors. Two dictionaries of emotional words are analyzed using machine learning classifiers, and keyword matching is used to calculate emotion scores. Three experiments with different settings were designed, in which the users' average emotion scores were calculated. The average emotion score is calculated using all the emotional tweets, after users with few emotional tweets are excluded. The Brunner-Munzel test was used to relate emotional behaviors to user relationships. According to the results, positive users participate more than negative users in building relationships under some particular conditions.
In Ikoro, Victoria, et al. [38], sentiment analysis of UK energy consumers is done by using messages posted on
Martin-Domingo et al. [46] used machine learning sentiment analysis for airport service quality (ASQ) analysis on the dataset of London Heathrow airport's Twitter account. They used the Theysay and Twinword tools for implementation. Theysay performed better than Twinword, with 78.7% accuracy compared to 69.6% for Twinword. The purpose of the research is to generate a list of service attributes that reflect ASQ, and the results show that additional attributes do not yield more accurate ASQ prediction.
M. Naz et al. [47] implemented an ensemble classification model using two classifiers, K-Nearest Neighbor and Naïve Bayes. They used two feature selection techniques: the Forest Optimization Algorithm (FOA) and minimum redundancy and maximum relevance (mRMR). FOA is used for feature selection and mRMR for the removal of irrelevant features. According to the results, the ensemble classifier combined with feature selection performed better than the individual machine learning algorithms. Results are further improved by using an ensemble of KNN, NB and SVM. It is also evident from the research that the hybrid of FOA-KNN and FOA-NB outperformed the single KNN and NB classifiers. Accuracy increases when the FOA and mRMR feature selection techniques are applied. Blitzer's dataset of electronic product reviews, retrieved from the UCI repository, is used for the implementation of the various classifiers in the research.
numbers and hashtags and lemmatization on Naïve Bayes machine learning classifier. MapReduce of Hadoop is used for the implementation on the Stanford Twitter Sentiment data set. The proposed technique reflects an increase of 5% in accuracy, yielding 73% for the NB classifier.
R. Ahujaa et al. [50] implemented six classifiers, viz. Decision Tree, SVM, KNN, RF, LR and NB, using two feature extraction techniques, N-gram and TF-IDF, on the 'SS-Tweets' data set. The results show that TF-IDF features give a 3-4% increase in performance compared with N-gram features.
M. Bibi et al. [51] proposed a new feature selection technique, CAARIA ("class association and attribute relevancy based imputation algorithm"), that is shown to be better than IG and PC, with an AUC (F-measure) value of 0.79. The research is performed on three Twitter data sets, HCR, SS-Tweet and FleTweetsPak, with two machine learning classifiers, SVM and NB, using the WEKA tool. The newly proposed technique reduces the feature dimension space by selecting tweets that have the same class and carry useful information.
M. Bibi et al. [52] used hierarchical clustering techniques named SL (single linkage), AL (average linkage) and CL (complete linkage) for the sentiment mining of Twitter data. A combined framework architecture is built from these three clustering techniques to select the best possible cluster by majority voting. The hierarchical clustering techniques proposed in the research are compared with k-means, SVM and NB classifiers. The outcome of the research indicates that majority-voting-based cooperative clustering is better in terms of cluster quality but poorer in terms of time efficiency.
Z. Kermani et al. [53] used IDF, term frequency, sentiment scoring with the SentiWordNet lexicon dictionary and semantic similarity to represent the weight of each feature of a tweet in the feature vector. The percentage contribution of each method to the weight is optimized by a genetic algorithm.
The authors of [54] aim to classify Facebook accounts as fake or genuine on the basis of the content generated and the correlation between user-generated content. The credibility of an account is decided at two levels. First, binary classification is applied to classify the account as fake or genuine. After that, the credibility score of the genuine class is calculated using the Analytical Hierarchical Process, and account credibility is decided on the basis of that score. The research used machine learning and deep learning techniques to identify Facebook profile credibility: Scikit-learn is used to implement the machine learning techniques and Keras with TensorFlow to implement deep learning.
George S.R. et al. [55] proposed a framework for opinion prediction for a product or brand name in Facebook during social distancing by using machine learning algorithms and netnography. The study actually proposes a conceptual framework and suggested various tools used by different researchers for opinion mining of Facebook viz. netnography, Google analytics, tweetstats, brandwatch, Facebook insights, sematrica's lexalytics, Google alerts and people browser.
Most of the work aims to find the sentiment related to a service or product by using dictionary-based, machine-learning-based or hybrid classifiers. The overall purpose is to enhance the accuracy and efficiency of classification models. Different studies are performed on different data corpora, and different tools are used by researchers to observe the variations in the outcome. Table I summarizes the key findings of the survey.
In the present survey paper we have conducted a systematic survey of the literature related to sentiment analysis, or opinion mining, of social networking sites. From 2009 to 2021 many studies have been conducted, and the technology has evolved from simple dictionary-based sentiment prediction to ensemble, fuzzy, deep learning and neural-network-based sentiment analysis. Detailed outcomes of the studies are given in Table I along with the tools and technologies used. Various advancements have occurred in attribute selection, preprocessing and the classifiers used. Data has become big, and technology has changed to meet its needs. From the detailed survey of the included literature, it has been observed that Naïve Bayes and SVM are the most explored machine learning classifiers. A lot of work has been conducted on ensemble classification techniques. Most researchers are attracted to Twitter opinion mining, and Facebook is the second most explored social media platform. Sentiment140 is a frequently used data corpus. WEKA, RStudio, Python and NLTK are used in several research implementations. Facebook and Twitter content is relatively less unstructured, and the sentiment analysis surveyed does not include image, audio or video. As media content can also take any of these forms, sentiment extraction from these resources can be interesting and important, but also challenging. A lot of work has thus been done on text sentiment analysis, and most of it targets improved classification efficiency.

VI. CONCLUSION
The manuscript presents a survey of 55 research papers related to sentiment analysis of social networking sites. The survey reflects the evolution and enhancement of tools and technologies for sentiment analysis from 2009 to 2021. Twitter is the most explored social networking site in the area of sentiment analysis. WEKA, RStudio and NLTK are the most popular tools used by researchers. The area of text sentiment classification has been widely explored with advanced classification techniques, big data technology and better simulation tools, and most of this work targets improved efficiency of sentiment classification. A new direction can be sentiment analysis of image, audio and video content, as this area is comparatively untouched and a huge repository of audio and video content is available on social media.