An Automated Framework to Detect Emotions from Contextual Corpus

Abstract: Emotion extraction, or opinion mining, is one of the key tasks for any text processing framework. In recent times, opinion mining has gained considerable traction because of its use in personalized customer relations and other customized applications. However, sentiment analysis is highly challenging because its accuracy depends on the input text corpus. The input corpus can vary widely due to the inclusion of emojis, the influence of local language usage, and the wide variety of regional languages. A good number of parallel research outcomes have aimed to solve these challenges in recent times. However, most of them leave three challenges unsolved. Firstly, the emojis in the text corpus are mostly removed rather than translated into sentiment scores. Secondly, texts from various regional languages are mostly translated literally rather than contextually. Finally, the use of dictionaries in the translation task takes a long time to process, and this time must be reduced. Hence, in order to solve these challenges, this work proposes a framework that automates weighted emoji-based sentiment analysis, applies a Unicode-based translation process to reduce the time complexity, and finally uses the collaborative sentiment analysis scores to build the final sentiment models. This work achieves nearly 97% accuracy and a nearly 50% reduction in time complexity.


I. INTRODUCTION
Weizenbaum created ELIZA between 1964 and 1966. ELIZA gave people the impression that it could grasp human language by imitating their speech patterns. Its "DOCTOR" script asked open-ended questions in the manner of a "patient-centered" therapist. The readiness of people to attribute emotions to a machine shocked Weizenbaum, since ELIZA itself understood nothing of what was said [1].
Regular users of social networking sites are more likely to rely on text messaging than any other form of communication. The emotional state of both parties affects the quality of their relationships. Natural language processing has certain issues when dealing with tweets. Despite NLP's development, data was still necessary for earlier work on low-resource languages. The six native Arab taggers at ExaAEC tagged 20,000 tweets. Plutchik's emotional "paradigm" includes these 10 categories.
LSTM and ELMo are used to categorize feelings. On the F1 scale, the agreement between the data and the model is 0.65. The elimination of emotions changes [2]. Another line of work focuses not on classifying or extracting emotions, but on understanding what sets them in action. Because no publicly accessible dataset could be found, the authors generated one using the Emotion Markup Language from the W3C. They detail how a tree-based event representation framework may be used to trace the roots of an emotional state. Undersampling-based bagging may be used to level out the unequal distribution of events in the training data. The proposed approach can extract SVM features effectively with little training.
Natural language processing experts can decipher the underlying meaning of a written message by analyzing the writer's tone [3].
As it is, the state of the art treats the problem as a multi-label classification problem. These algorithms do not consider how a text makes the reader feel. Moreover, because diverse text fragments may predict different emotion labels, the model must gather effective features for each emotion. For this purpose, the work in [4] uses a topic-enhanced capsule network in conjunction with a variational autoencoder to classify a broad variety of emotions. The variational autoencoder is trained to capture the latent topic of a sentence, and a capsule module is employed for the emotion classification. The proposed model is reported to outperform the state of the art [4].
Having set the context of this research, the rest of the work is organized as follows. Sections II, III, and IV discuss the foundational methods, and Section V discusses the recent improvements over them. Sections VI and VII present the identified problems and the proposed solutions. Section VIII discusses the algorithms and the framework built on the proposed solutions. Section IX discusses the obtained results and compares them with existing research outcomes. Finally, Section X presents the research conclusion.

II. FOUNDATIONAL METHOD FOR EMOJI DETECTION
In this section of the work, the foundational method for emoji detection is furnished. The detection of emojis is one of the prime factors in any text processing task. The accuracy of any machine-learning-driven algorithm relies on the correctness of the input data, and datasets that contain emojis can degrade the accuracy of machine learning methods to a great extent.
www.ijacsa.thesai.org
Assume that the complete text corpus is T[], where each data item consists of a sequence number, n, and the actual text, t. This can be formulated as,

$$T[] = \{ \langle n, t \rangle \} \quad (1)$$

Further, the baseline method indicates that the set of Unicode characters, UC[], must be separated from the actual text, leaving the remaining text, t*[], and must be handled separately, as,

$$t = t^{*}[] \cup UC[] \quad (2)$$

Furthermore, the extracted Unicode characters must be validated against the specified range of Unicode code points, and only the characters that fall within that range are assigned to the emoji collection (Eq. (3)). Henceforth, the Unicode-removed text corpus, T1[], shall be presented as,

$$T1[] = \{ \langle n, t^{*}[] \rangle \} \quad (4)$$

Further, in the next section of this work, the foundational method for the text translation is furnished.
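The separation of emojis from the text corpus described in this section can be illustrated with a minimal Python sketch. The code-point ranges in `EMOJI_RANGES` and the helper names `is_emoji`, `split_emojis`, and `strip_corpus` are assumptions chosen for illustration; they are not taken from the baseline method, and a complete system would need the full Unicode emoji data rather than this short list of blocks.

```python
# Hypothetical subset of Unicode blocks that commonly hold emojis.
# This list is an assumption for illustration, not the paper's range.
EMOJI_RANGES = [
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F600, 0x1F64F),  # Emoticons
    (0x1F680, 0x1F6FF),  # Transport and Map Symbols
    (0x1F900, 0x1F9FF),  # Supplemental Symbols and Pictographs
    (0x2600, 0x26FF),    # Miscellaneous Symbols
    (0x2700, 0x27BF),    # Dingbats
]


def is_emoji(ch):
    """Validate one character against the assumed emoji ranges."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in EMOJI_RANGES)


def split_emojis(text):
    """Return (text without emojis, list of emojis found in the text)."""
    emojis = [ch for ch in text if is_emoji(ch)]
    plain = "".join(ch for ch in text if not is_emoji(ch))
    return plain, emojis


def strip_corpus(corpus):
    """corpus is a list of (n, t) pairs, mirroring T[].

    Returns the emoji-free corpus (mirroring T1[]) and, per item,
    the emojis that were separated from the text.
    """
    t1, collected = [], []
    for n, t in corpus:
        plain, emojis = split_emojis(t)
        t1.append((n, plain))
        collected.append((n, emojis))
    return t1, collected
```

Keeping the separated emojis alongside the sequence number, instead of discarding them, is what later allows them to be scored rather than simply removed.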

III. FOUNDATIONAL METHOD FOR TEXT TRANSLATION
In this section of the work, the baseline method for text translation is presented. Text translation plays a very important role in text processing projects. The majority of text analyzers rely on an English dictionary, so the correctness of the text analysis depends on the translation. Hence, translating text from other native languages is highly valuable for any text mining application.
Continuing from Eq. (4), the text corpus, which is free from emojis, must be further filtered to remove the numeric values. Henceforth, the final translated text, T2[], can be taken to the next phase of the process.
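The filtering and dictionary-based translation steps of this baseline can be sketched as follows. This is a minimal illustration under stated assumptions: `TOY_DICTIONARY` is a hypothetical two-entry word list, and `remove_numbers` and `translate` are illustrative helper names, not functions defined by the baseline method. The sketch performs a literal word-by-word lookup, which is exactly the kind of non-contextual translation the baseline is limited to.

```python
import re

# Hypothetical toy dictionary: source-language word -> English word.
TOY_DICTIONARY = {"bonjour": "hello", "monde": "world"}


def remove_numbers(text):
    """Drop runs of digits and collapse the leftover whitespace."""
    without_digits = re.sub(r"\d+", " ", text)
    return " ".join(without_digits.split())


def translate(text, dictionary):
    """Literal word-by-word lookup; unknown words are kept unchanged."""
    return " ".join(dictionary.get(word.lower(), word) for word in text.split())


def to_t2(t1, dictionary):
    """Map an emoji-free corpus of (n, t) pairs to its translated form."""
    return [(n, translate(remove_numbers(t), dictionary)) for n, t in t1]
```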
Further, in the next section of this work, the emotion detection foundational method is discussed.

IV. FOUNDATIONAL METHOD FOR EMOTION DETECTION
In this section of the work, the baseline method for the emotion detection or opinion detection is analyzed.
Continuing from Eq. (7), the text corpus must be further separated, as T_X, based on the stop words, the connectors, and finally the phrases.
After the detailed analysis of the baseline methods in the previous sections of this work, the next section discusses the recent improvements over the baseline methods.
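The separation of the translated text by stop words and connectors can be sketched as follows. The word lists `STOP_WORDS` and `CONNECTORS` and the function `split_phrases` are assumptions made for illustration; the baseline method does not specify them, and a practical system would use a full stop-word list for the target language.

```python
# Hypothetical, deliberately tiny word lists for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "to"}
CONNECTORS = {"and", "but", "or", "because", "however"}


def split_phrases(text):
    """Split text into phrases at connectors and drop stop words.

    Returns a list of phrases, each phrase being a list of content words.
    """
    phrases, current = [], []
    for word in text.lower().split():
        token = word.strip(".,!?;:")
        if token in CONNECTORS:
            # A connector closes the current phrase, if there is one.
            if current:
                phrases.append(current)
                current = []
        elif token and token not in STOP_WORDS:
            current.append(token)
    if current:
        phrases.append(current)
    return phrases
```

Each resulting phrase can then be scored on its own, so that a sentence with mixed opinions is not reduced to a single polarity too early.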

V. RECENT RESEARCH REVIEWS
Extracting useful information from Twitter profiles is now common. How a user feels about a topic can be inferred from what they post on Twitter, and recent models can detect the emotional state of the user. However, most of the available resources cover only Western languages. The work in [5] improves emotion recognition for Indonesian text. The emotion classes include joy, sadness, fear, surprise, disgust, dread, and anticipation or fury (marah). Because the emotion classes in the Indonesian data are imbalanced, the classification is imprecise. According to the findings, adjusting the pattern weight helps make the Indonesian data more balanced. The recommended strategy increases the accuracy of the minority classes (fear, surprise, and disgust) [5].
Emotional states impact the dynamics of interpersonal relationships. Text, speech, and body language all contribute to the communication of emotion, including on blogs and social media, where domestic and foreign politics are discussed. The ability to evaluate the emotions expressed in written text is helpful in social settings. The study of emotions in Bangla is quite new. Multi-class emotions are extracted from Bangla text using an NB classifier, a stemmer, a POS tagger, n-grams, and term frequency-inverse document frequency (tf-idf) analysis. An overall accuracy of 78.6% was achieved in the final model [6].
Emotion classification is a task in natural language processing (NLP). Current methods perform well when confronted with a multi-label classification problem. However, none of these methods demonstrates the model's ability to predict spatial correlations in the sample data. The work in [7] argues that contrastive learning is an effective strategy to improve efficiency and productivity. To maximize the positives and minimize the negatives, the authors apply a compound loss function on top of a modified version of the commonly used cross-entropy loss function in training. The reported results support this approach [7].
Better computer-human interaction, mental health monitoring, and the adaptation of corporate strategies are just a few examples of the many applications of emotion detection. Deep learning is reported to outperform other AI methods on this task. Machine learning algorithms for emotion identification use both single-emotion and multi-emotion samples. The research in [8] shows that these systems benefit more from single-emotion instances than from mixed-emotion ones. However, single-emotion texts are unusual, which makes the existing procedures inefficient. The authors used all available information to create a dataset in which each piece of text represents at least two different emotions. Since CNNs are good at analyzing pictures, the authors modified the CNN design to determine how readers would feel about a given passage. They used fastText and GloVe to capture semantic and syntactic similarities between text and numerical expressions. The reported accuracy of the proposed design is high [8].
Both Bidirectional Encoder Representations from Transformers (BERT) and the Universal Sentence Encoder take advantage of transfer learning. The problem of overfitting arises when a classifier is trained on top of a language model with a limited dataset. The majority of current studies focus on generalizing deep language models to new fields. DeepEmotex relies on transfer learning to determine how someone is feeling. Twitter data on user sentiment is collected, analyzed, and used for fine-tuning. DeepEmotex has a 91% accuracy rate. The EmoInt and Stimulus datasets are both used to validate DeepEmotex, with an average accuracy of 73%. DeepEmotex-BERT performs 23% better than Bi-LSTM. Both the size of the dataset and the effectiveness of the models are examined. The model is fine-tuned for the task at hand based on the emotion that the intended recipient is experiencing [9].
Endpoint detection (EPD) uses decoder state features (DSFs) obtained from ASR. Decoding DSFs through ASR is costly for embedded systems. Based on the word history of the whole ASR system, this study presents a language model (LM)-based end-of-utterance (EOU) predictor that does not need a decoding step in the test phase. Using a novel full-stack EPD approach, the acoustic and language modelling techniques, based on the AFE-based EPD and the LM-based EOU predictor respectively, are combined into a single RNN-based EPD strategy. The proposed EPD uses ensemble RNNs that are trained independently for each target with information from the LM-based EOU predictor, the acoustic model (AM), and the AFE-based EPD. The ensemble of RNNs is linked to a DNN to create the EPD classifier. Following that, the networks are retrained jointly in an effort to cut down on the final endpoint errors. The EPD framework was evaluated on the proportion of incorrect versus correct endpoint responses on CHiME-3. The results show an improvement over the existing EPD [10].
It is estimated that 230 million individuals worldwide have some level of fluency in Urdu, which is a low-resource language. Researchers put together benchmark datasets in languages with limited resources. The English to Urdu language pair is one direction of translation. Google Translate is used for the translation, and polarity shifts and other natural language phenomena arise in the translated text. The accuracy of sentiment analysis and emotion recognition suffers as a consequence. The primary objective of the study in [11] is to categorize emotions in both resource-rich and resource-poor languages. The authors describe five classes of polarity-shifting nouns and verbs. The study finds commonalities in the origins of several languages. The accuracy of sentiment classification drops by 2% to 3% when moving from a resource-rich language to a resource-poor one [11].
Irony expresses the opposite of the literal meaning. Emojis are seldom employed when analyzing text for sarcasm. The work in [12] discusses sarcasm on Weibo. After analyzing the visual input, such as emoji expressions, the bi-LSTM networks consult the textual content for more insight. The prediction is then made using one of three methods. The mining of sarcastic Weibo posts is reported to be successful [12].
The study of emotions is quite popular at present. Research on body language lags behind that on facial expressions and words. The survey in [13] is intended for a general audience. It looks at the cultural and racial variations in body language and at technology that reads human body movements. It investigates the use of RGB and 3D data for estimating body position. Current research on representation learning and emotion recognition in gestures is discussed. Similarly, multimodal techniques that combine speech and body data may be used. Large-scale data analysis of human identity, pose estimation, and mood is now possible. However, the output spaces are inconsistent, and only simple representations are available [13].
Cyberbullying, especially among teenagers, has increased along with the development of new technologies. The work in [14] explores the role that emotions play in the identification of cyberbullying in India. The authors created BullySentEmo since there was no preexisting Twitter dataset that categorized bullying, sentiment, and emotion together. Emojis and short tweet-like text are both embedded. In India, social media users often switch back and forth between two languages. Handling multiple emojis is a feature of MT-MM-Bert+VecMap [14].
One way in which the prevalence of social media has altered verbal expression is by making strong language more common. Recognizing unsuitable content is vital for anybody using the internet, and the abundance of daily data has made automatic recognition a necessity. The effectiveness of hate speech detection algorithms is currently being studied. These methods are unable to detect polarity or tone in the language being studied. The work in [15] is presented as the first multi-task transformer-based technique for detecting hate speech in Spanish. Harassing communication that singles out certain groups often appeals to strong emotions or extreme opposites [15].
Opinion miners and researchers in human-computer interaction use different approaches to sentiment analysis. The work in [16] emphasizes the importance of working across disciplines. In human-agent interactions, sentiment and opinion detection algorithms are often designed for opinion mining rather than for socio-affective interaction, which has its own requirements (the timing constraint of the interaction, and sentiment analysis as both an input and an output of the interaction strategies). To substantiate these arguments, the authors look at phenomena connected to emotions, sentiment recognition technology, and the goals of socio-affective human-agent strategies. The next steps that must be taken and the open issues are discussed. For the purpose of providing a more precise sentiment analysis, the specified criteria are incorporated into the Greta platform for humanoid conversational agents [16].
A feature extraction approach is required for micro-expressions. Multi-feature fusion may be used to detect micro-expressions. This technique establishes a connection between the projection error and the LBP features. In order to achieve rapid and precise identification, the data used in the study was carefully extracted from specialized facial expression databases. The novel method outperforms LBP in recognizing expressions in a photograph [17].
Recent research has shifted its emphasis from words to the non-verbal cues in audio and video in order to make automated assessments of people's mental health. Textual content is as important as audio and video for depression detection systems. Comprehensively automated depression evaluation approaches need complex models of aural, visual, and textual elements.

VI. IDENTIFIED PROBLEMS
Firstly, the existing system is successful in detecting and separating the emojis; however, the detected emojis are not translated into sentiment scores.
Secondly, the baseline method recommends using traditional dictionaries to convert multilingual text corpora into the standard text. However, the size of the translation dictionary can be overwhelming because the system is intended to translate a wide variety of source languages.
Assume that the average size of the dictionary for one language is n and that the dictionary, D[], is furnished for translating m languages. Hence, from Eq. (7), the time complexity, T, can be furnished as,

$$T(n) = O(n \cdot m) \quad (13)$$

Or,

$$T(n) = O(n^{2}) \mid n \approx m \quad (14)$$

Finally, the detection process for the emotion, as per Eq. (10), is not contextual.
Henceforth, based on the identified challenges in the existing systems, the proposed solutions are furnished in the next section of the work.
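The growth of the lookup cost with the number of languages can be made concrete with a small counting sketch. It assumes, purely for illustration, that each dictionary is scanned linearly and that the source language is unknown, so every dictionary may have to be searched; the function `count_comparisons` and the synthetic dictionaries are hypothetical and are not part of the baseline method.

```python
def count_comparisons(word, dictionaries):
    """Count the entry comparisons needed to find a word by linear scan.

    dictionaries is a list of m dictionaries, each a list of
    (source_word, target_word) pairs with about n entries.
    """
    comparisons = 0
    for entries in dictionaries:
        for source_word, _target_word in entries:
            comparisons += 1
            if source_word == word:
                return comparisons
    return comparisons


def build_dictionaries(n, m):
    """Build m synthetic dictionaries with n unique entries each."""
    return [[(f"w{i}_{j}", f"t{i}_{j}") for j in range(n)] for i in range(m)]
```

Under these assumptions, a word that is absent from all dictionaries costs n times m comparisons, which becomes n squared when n and m are of similar size, in line with Eq. (14).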

VII. PROPOSED SOLUTIONS
After the detailed analysis of the existing methodologies and the persistent research problems, in this section of the work, the proposed solutions are furnished.
Firstly, the process for emoji detection and translation into sentiment scores is furnished here. This process can be repeated for every dictionary to be included, and the complexity of processing the total dataset can still be limited to O(n/2) in the average case.
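One possible reading of the Unicode-based translation idea is that the Unicode script of a word is used to pick a single dictionary, so that only one dictionary of about n entries is searched instead of all m. The sketch below illustrates that reading and should be treated as an assumption, not as the algorithm of this work. The helper names `detect_script` and `translate_word` and the two-entry `DICTIONARIES` table are hypothetical. Note that a script is not the same as a language, since several languages share one script, so a real system would need a further disambiguation step.

```python
import unicodedata
from collections import Counter

# Hypothetical per-script dictionaries: source word -> English word.
DICTIONARIES = {
    "CYRILLIC": {"привет": "hello"},
    "DEVANAGARI": {"नमस्ते": "hello"},
}


def detect_script(text):
    """Guess the dominant script from the Unicode names of the letters.

    The first word of a character name (for example LATIN or CYRILLIC)
    is used as the script label. Returns None if no letter is found.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split(" ")[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


def translate_word(word, dictionaries=DICTIONARIES):
    """Look the word up only in the dictionary of its detected script."""
    script = detect_script(word)
    return dictionaries.get(script, {}).get(word, word)
```

If the selected dictionary were scanned linearly, a successful lookup would inspect about n/2 entries on average, which matches the average-case bound stated above.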
Finally, continuing from Eq. (10), the process for the relative sentiment score analysis is furnished.
Furthermore, based on the proposed mathematical models, the proposed algorithms are furnished in the next section of this work.
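The idea of combining the text-based and the emoji-based sentiment scores can be sketched as a weighted average. The weighting scheme, the default `emoji_weight` of 0.5, and the function name `collaborative_score` are assumptions made for illustration; the exact weighting used by the framework is not reproduced here. Both scores are assumed to lie on the same scale.

```python
def collaborative_score(text_score, emoji_score, emoji_weight=0.5):
    """Combine a text score and an emoji score on the same scale.

    emoji_weight is a hypothetical tuning parameter in [0, 1].
    If the text has no emojis (emoji_score is None), the text score
    is returned unchanged.
    """
    if not 0.0 <= emoji_weight <= 1.0:
        raise ValueError("emoji_weight must be between 0 and 1")
    if emoji_score is None:
        return text_score
    return (1.0 - emoji_weight) * text_score + emoji_weight * emoji_score
```

For example, a text score of 4 combined with an emoji score of 2 at equal weights gives 3.0, which lets a negative emoji pull down an otherwise positive sentence.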

VIII. PROPOSED ALGORITHMS AND FRAMEWORKS
After the detailed analysis of the existing system, the challenges in the existing systems, and the proposed mathematical models, the implementable versions of the proposed algorithms are furnished in this section of the work.
Step 3. Return FS[]
The details of these algorithms are furnished in the previous sections of this work.

Further, the automated framework is presented here (Fig. 1).
Identifying whether a document, sentence, or object feature/aspect conveys a favorable, negative, or neutral viewpoint is a fundamental task in sentiment analysis. A wide range of human emotions, such as joy, anger, disgust, sadness, fear, and surprise, are analyzed by sentiment categorization algorithms that look "beyond polarity."

One of the earliest examples of what would become modern sentiment analysis was published in The General Inquirer, while other precedents include independent psychological investigations that analyzed a person's mental state by analyzing their speech. Subsequently, the method disclosed in Volcani and Fogel's patent focused on sentiment and singled out words and phrases in text with respect to various emotional scales. Based on their findings, EffectCheck provides a list of interchangeable words that can be used to adjust the level of emotional impact.
Further, in the next section of this work, the obtained results are discussed.

IX. RESULTS AND DISCUSSIONS
After the analysis of the existing system and the proposed system, in this section of the work, the obtained results are discussed.
Firstly, the dataset descriptions are furnished here (Table I).

TABLE I. DATASET ANALYSIS [19,20]

| Dataset Name | Number of Instances | Release Date |
|---|---|---|
| Sentiment140 [19] | 16,000 | 2009 |
| Amazon Reviews for Sentiment Analysis [20] | 25,900 | 2022 |

The first dataset is Sentiment140. It contains 16,000 tweets extracted using the Twitter API. The tweets have been annotated (0 = negative, 4 = positive) and can be used to detect sentiment.
The second dataset consists of 25,900 Amazon customer reviews (input text) and star ratings (output labels) for learning how to train fastText for sentiment analysis.
The framework is tested on all the data items; however, only 15 items from each dataset are furnished here.
The outcome of the emoji detection is displayed here (Table II). The detection process is highly accurate and highly time-efficient, as shown in the time complexity analysis later in this section. The accuracy of the emoji detection is nearly 85% for the displayed sample and nearly 93% for the total datasets.
The outcome is also visualized graphically here (Fig. 2). Further, the translation scores are analyzed here (Table III). For validation, the translations are compared with those from Google Translate, and the obtained scores are compared.
The translation scores obtained from the proposed system are high. The average accuracy is 98% for the displayed samples and 99.89% for the complete dataset. The outcome is also visualized graphically here (Fig. 3). Further, the sentiment scores detected from the emojis are furnished here (Table IV). During the translation and extraction of the sentiment scores from the emojis, the classes are denoted from 1 (very negative) to 5 (very positive). The overall sentiment score from the emojis in a text is calculated using the mode method.
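The mode method for the overall emoji sentiment score can be sketched as follows. The score table `EMOJI_SCORES` is a hypothetical five-entry mapping on the 1 (very negative) to 5 (very positive) scale, and the tie-breaking rule (the score that was encountered first wins) is an assumption, since the handling of ties is not stated here.

```python
from statistics import multimode

# Hypothetical emoji-to-score table on the 1 (very negative) to
# 5 (very positive) scale; the real table would be much larger.
EMOJI_SCORES = {
    "\U0001F600": 5,  # grinning face
    "\U0001F642": 4,  # slightly smiling face
    "\U0001F610": 3,  # neutral face
    "\U0001F641": 2,  # slightly frowning face
    "\U0001F621": 1,  # pouting face
}


def emoji_sentiment(emojis):
    """Return the most frequent score among the known emojis.

    Returns None when no known emoji is present. On a tie, the score
    that appears first in the text is returned (an assumed rule).
    """
    scores = [EMOJI_SCORES[e] for e in emojis if e in EMOJI_SCORES]
    if not scores:
        return None
    return multimode(scores)[0]
```

Using the mode instead of the mean keeps the result on the original five-class scale, so the overall score is always one of the defined classes.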
The outcome is also visualized graphically here (Fig. 4). Further, the sentiment analysis is performed collaboratively on the text and the emojis, and the results are furnished (Table V). The outcome is also visualized graphically here (Fig. 5). Finally, the time complexity analysis is carried out (Table VI). The mean time complexity is observed to be 0.19 ns for an average text length of 28 characters with an average of 5 emojis.
The outcome is also visualized graphically here (Fig. 6).

X. CONCLUSION
One of the most important tasks for any text processing system is emotion extraction, sometimes called opinion mining. Its use in personalized customer relations and other customized applications has given opinion mining considerable potential in recent years. However, since the quality of the sentiment analysis depends on the text corpus used for the analysis, putting it to use can be difficult. Due to factors such as emoji usage, local language influences, and the use of many different regional languages, the input text corpus can be quite volatile. Many recent studies have attempted to address these difficulties. However, the majority of them fail to address three key issues: (1) emojis are removed from the text corpus without being converted into sentiment scores; (2) texts from different regional languages are translated literally rather than contextually; and (3) the use of dictionaries in the translation task is time-consuming and must be reduced. As a result, this study provided a framework that automates the weighted emoji-based sentiment analysis, streamlines the Unicode-based translation process to cut down on time complexity, and uses the collaborative sentiment analysis scores to construct the final sentiment models. As a result of this study, the time complexity is reduced by approximately 50% and the accuracy is nearly 97%.