Hybrid Approach Used to Analyze the Sentiments of Romanized Text (Sindhi)

Abstract: Sentiment analysis is an important part of natural language processing (NLP). This study evaluated the sentiment of Romanized Sindhi Text (RST) using a hybrid approach and ground truth values. The methodology involves three major steps: input data, processing on a tool, and analysis of data with evaluation of results. One hundred RST sentences were used, each of which can be positive, neutral, or negative. The sentences in the corpus are simple to understand and are used in everyday life. An online Python tool was used to process the text and return a sentiment for each sentence. According to the tool, 86% of the sentences have neutral sentiment, 9% have negative sentiment, and only 5% have positive sentiment. The accuracy of the tool's output against the ground truth values, computed from a confusion matrix on an online calculator, was 87.02%, which corresponds to an error ratio of 12.98%.


I. INTRODUCTION
Sentiment analysis is one of the most important tasks in NLP. It analyses a community's opinions about subjects such as social media apps, academic activities, and technology [1,2]. Sentiment analysis is the analysis of users' opinions [3,4]. The principal role of artificial intelligence (AI) in NLP is to process text and determine its meaning [5]. The text used in NLP can be structured or unstructured [6]. Text analysis is the process of converting unstructured text into structured text that carries meaning. Text analysis tools process texts and produce statistical measures using AI algorithms [7]. Text analysis is also used to assess customer opinions, and product reviews with feedback are used to respond better in the future. Text is used to identify patterns, and the main idea of an analysis can draw on various sources of information. Sentiment analysis is mostly used to analyse comments about a product or a social activity [8,9]. This research focuses on Romanized Sindhi, using a hybrid model to evaluate sentiments. After the tool has produced its results, the authors evaluate them against ground truth values. The significance of this study lies in comparing the tool's output with the actual intent of each sentence in the selected language.
In this study, sentiment analysis was applied to 100 RST sentences. The analysis was done on the online Python tool, which returns a positive, neutral, or negative sentiment. The tool's sentiments were then compared with the ground truth values, and the accuracy was measured as 87.02%.

II. STRUCTURE OF ROMANIZED SINDHI TEXT
The sentence structure of RST is the same as that of English [10]. It depends on three main grammatical elements in the order Subject → Verb → Object, as in an English sentence [11]. This structure is easy for the tool to process: it recognizes a sentence through these three elements, which may give better output [12]. Fig. 1 shows the methodology of sentiment analysis of RST, which consists of three major steps: input data, processing on the tool, and analysis of data with evaluation of results. In the first step, the input data is collected in the form of RST sentences. In the second step, the data is processed on the tool, either as a single sentence or as multiple sentences. In the third step, the tool analyses the data and outputs a sentiment for each sentence: positive, neutral, or negative.
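The three steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not name the online tool or its interface, so `run_pipeline` and the stand-in classifier `always_neutral` are hypothetical names, and the input strings are placeholders rather than sentences from the corpus.

```python
from typing import Callable, Iterable, List, Tuple

LABELS = ("positive", "neutral", "negative")


def run_pipeline(
    sentences: Iterable[str], classify: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Step 1: read input sentences. Step 2: process each on the tool.
    Step 3: collect one sentiment label per sentence."""
    results = []
    for sentence in sentences:
        text = sentence.strip()
        if not text:
            continue  # skip empty input lines
        label = classify(text)  # the "tool" is passed in as a function
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        results.append((text, label))
    return results


def always_neutral(sentence: str) -> str:
    # Hypothetical stand-in for the online tool, used only to make the
    # sketch runnable; it does not reproduce the tool's behaviour.
    return "neutral"


# Placeholder input, not sentences from the paper's data set.
output = run_pipeline(["sentence one", "   ", "sentence two"], always_neutral)
```

Passing the classifier in as a function keeps the pipeline independent of whichever tool is actually used in the second step.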

III. ANALYSIS OF ROMANIZED SINDHI TEXT
One hundred RST sentences were used in this study's sentiment analysis. A text's sentiment can be positive, neutral, or negative [13][14][15]. The sentences in the corpus are simple to understand and are used in everyday life. The data set is a fundamental element of the research: it is the input to the online Python tool (as shown in Fig. 2), which processes the text and returns a sentiment for each sentence. The data set of this research, with its ground truth values, is shown in Table I. Table I depends on one hundred sentences of RST, which were obtained from different sources.
Sentiment analysis of RST was done on the online Python tool. The results are shown in Fig. 3. According to the Python tool, 573 sentences had neutral, 12 sentences had negative, and 8 sentences had positive sentiment. According to the ground truth values, 323 sentences had neutral meaning, 64 sentences had negative meaning, and 208 sentences had positive meaning. According to the results of RST on Python, 86% of the sentences have neutral sentiment, 9% have negative sentiment, and only 5% have positive sentiment, as shown in Fig. 4.
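Percentage shares of this kind follow directly from the per-sentence labels. The sketch below shows one way to compute them; the function name `sentiment_shares` and the example labels are hypothetical and are not the paper's data.

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the percentage share of each sentiment label,
    rounded to two decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Hypothetical labels for ten sentences, not taken from the paper.
example_labels = ["neutral"] * 8 + ["negative"] + ["positive"]
shares = sentiment_shares(example_labels)
# shares == {"neutral": 80.0, "negative": 10.0, "positive": 10.0}
```

Running the same function once on the tool's labels and once on the ground truth labels gives the two distributions that are compared in the text.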

VI. ACCURACY OF SENTIMENT ANALYSIS OF ROMANIZED SINDHI TEXT
Sentiment analysis of RST was done using the online Python tool for 100 sentences, measuring positive, negative, and neutral sentiments. The tool's output for each sentence was compared with the ground truth value of that sentence, and the accuracy of the output was measured on that basis [16,17].
For the accuracy evaluation, a confusion matrix was created on the basis of the ground truth values, as shown in Table III. Its parameters are True Negative (TNeg), True Positive (TP), True Neutral (TNeu), False Neutral (FNeu), False Positive (FP), and False Negative (FNeg) [16]. From these parameters, the accuracy was measured on an online calculator as 87.02% against the ground truth values, as shown in Fig. 6. The error ratio, calculated from that accuracy, was 12.98%.
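A three-class confusion matrix and the accuracy derived from it can be sketched as below. The sketch assumes that accuracy is the share of sentences on the matrix diagonal and that the error ratio is its complement; the online calculator used in the study may define accuracy differently. The function names and the five example labels are hypothetical.

```python
LABELS = ["positive", "neutral", "negative"]


def confusion_matrix(truth, predicted, labels=LABELS):
    """Rows are ground truth labels, columns are tool outputs."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of sentences whose tool output matches the ground truth."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical labels for five sentences, not taken from the paper.
truth = ["positive", "neutral", "neutral", "negative", "neutral"]
predicted = ["neutral", "neutral", "neutral", "negative", "neutral"]

matrix = confusion_matrix(truth, predicted)
acc = accuracy(matrix)   # 4 of 5 sentences match
error_ratio = 1 - acc    # complement of the accuracy
```

Under this definition the accuracy and the error ratio always sum to 100%, which matches the reported pair of 87.02% and 12.98%.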

VII. ISSUES OF SENTIMENT ANALYSIS OF ROMANIZED SINDHI TEXT
Sentiment analysis of RST was done on the online Python tool for 100 sentences. Several issues were faced before, during, and after the task [18,19]. During the analysis, the tool (Python) did not identify positive sentences; results were obtained only after the characters of the Romanized text were changed. Other issues are discussed below:
1) Even when a single word (positive or negative) was given to the tool, the output was a neutral sentiment.
2) When the input sentences were interrogative, the results were neutral.
3) When punctuation was used in the input sentences, the results were mostly neutral.
4) Negative input sentences produced neutral results. However, when the English word "bad" was used in a negative Romanized Sindhi sentence, the result was negative.
5) Positive input sentences produced neutral results. However, when positive English words were used in Romanized Sindhi sentences, the result was positive.
6) When a country name appeared in a sentence, the output was neutral.
7) When the English word "ignore" appeared in a Romanized Sindhi sentence with subject + verb + object structure, the result was neutral.
8) Sentences without a subject produced a neutral output.

[Figure: pie chart of sentiment shares: Neutral 76%, Negative 17%, Positive 7%]
www.ijacsa.thesai.org
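One possible explanation for several of these issues, offered here only as an assumption and not confirmed by the study, is that the tool scores sentences against an English word list. The sketch below uses a hypothetical two-word lexicon and placeholder tokens (`word_a`, `word_b`, `word_c`) in place of Romanized Sindhi words to show how such a scorer would return neutral for any sentence without a known English word.

```python
# Hypothetical English-only lexicon; the vocabulary of the real tool
# is not documented in the paper.
LEXICON = {"good": 1, "bad": -1}


def lexicon_sentiment(sentence, lexicon=LEXICON):
    """Sum the scores of known words; unknown words contribute 0."""
    tokens = [tok.strip(".,!?").lower() for tok in sentence.split()]
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Placeholder tokens stand in for Romanized Sindhi words.
no_known_word = lexicon_sentiment("word_a word_b word_c")
with_english_bad = lexicon_sentiment("word_a bad word_b")
question = lexicon_sentiment("word_a word_b?")
```

Here the first and third calls return neutral and the second returns negative, which is the same pattern as issues 2, 4, and 5; whether the actual tool works this way would need to be checked against its documentation.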

VIII. CONCLUSION
In this research, sentiment analysis was done on Romanized Sindhi text using a machine learning tool (a hybrid approach) and ground truth values. The machine learning tool is a freely available online Python tool that performs different NLP tasks on input text. The task used a data set of RST. Sentiment analysis of RST was done on 593 sentences, with positive, negative, and neutral sentiments. According to the Python tool, 86% of the sentences have neutral sentiment, whereas according to the ground truth values 76% of the sentences have neutral sentiment. The overall accuracy of the sentiment analysis, measured from the confusion matrix, is 87.02%. For the one hundred sentences, the tool's output was compared with the ground truth value of each sentence, and the accuracy of the output was measured on that basis [16,17].