A Comparison of Classification Models to Detect Cyberbullying in the Peruvian Spanish Language on Twitter

Cyberbullying is a social problem in which bullies’ actions are more harmful than in traditional forms of bullying, as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying via the analysis of texts in social media publications written in one or more languages; however, few investigations target cyberbullying detection in the Spanish language. In this work, we aim to compare the performance of four traditional supervised machine learning methods in detecting cyberbullying via the identification of four cyberbullying-related categories in Twitter posts written in the Peruvian Spanish language. Specifically, we trained and tested the Naive Bayes, Multinomial Logistic Regression, Support Vector Machine, and Random Forest classifiers on a dataset manually annotated with the help of human participants. The results indicate that the best performing classifier for the cyberbullying detection task was the Support Vector Machine.


I. INTRODUCTION
Harassment in social networking sites, better known as cyberbullying, has silently impacted many people in recent years. The most prominent acts of virtual harassment occur through rumors, insults, threats, humiliation, and sexual harassment [1]. A survey conducted in 28 countries across the world revealed that 17% of young people experience cyberbullying before the age of 25 [2]. In Europe, 13-15-year-olds are more likely to be bullied online [3]. On the other hand, the Asia-Pacific countries present around 53% of cyberbullying experiences on social networks, followed by the Middle East and Africa with 39% [2]. Regionally in America, 59% of United States adolescents have experienced some form of cyberbullying [4]. Meanwhile, Latin America experiences the highest amount (76%) of cyberbullying on social media platforms [2]. In Peru, a study revealed that at least 58% of kids between 8-12 years old are prone to online harassment [5].
Despite the efforts to prevent cyberbullying events and mitigate their effects [6,7], the problem coexists with a generation that is always connected to different social media platforms through the Internet, using a computer or mobile phone, where they interact in groups [8]. Moreover, the use of popular social network platforms, such as Twitter, which offer tweet posting anonymity, encourages harassing behaviors with more frequency and cruelty [9], negatively affecting the self-esteem of the victims. Hence, automatic cyberbullying detection becomes important.
Currently, typical cyberbullying detection approaches employ text analysis subtasks such as pre-processing, feature extraction, feature selection, and classification to identify online harassing events. Despite such a well-defined pipeline, very few works in the literature aim at detecting cyberbullying in textual data from social media written in languages other than English [10][11][12][13]. Furthermore, only a limited number of works try to solve the automatic cyberbullying detection problem in the Spanish language [14][15][16][17].
In this work, we propose to compare four machine learning algorithms for detecting cyberbullying in Twitter textual data written in the Peruvian Spanish language. To reach our goal, we have built an annotated dataset of text messages from Twitter written in Peruvian Spanish. The dataset was validated with the help of human participants through an online service specially created to verify and annotate the offensive content according to four categories: no harassment, direct harassment, hate speech, and sexual harassment [18,19]. Then, we have used Natural Language Processing (NLP) techniques for pre-processing and subsequent feature extraction. Finally, we have trained and assessed the performances of the Naive Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), and Multinomial Logistic Regression (MLR) classifiers.
The rest of the paper is structured as follows. The next section presents the related works aiming to automatically detect cyberbullying in social media. Section III describes our methodology to perform the automatic detection of cyberbullying events on Twitter. Section IV provides details of the experimental procedure adopted to perform classifiers' performance comparison as well as presents and discusses the experimental results. Finally, Section V summarizes our research findings along with suggestions for potential future work.

II. RELATED WORKS
In recent years, automatic cyberbullying detection in social media has attracted the attention of the scientific community. The early works of Dinakar et al. [19] and Yin et al. [20] demonstrate the researchers' interest in detecting cyberbullying events in social media textual data using supervised machine learning tools. In 2016, Di Capua et al. [21] explored ways of combining semantic, syntactic, sentiment, and social features within the machine learning pipeline to detect cyberbullying on large data streams from YouTube, Twitter, and Formspring. Chatzakou et al. [22] studied text features, user features, and network-based features to find the set of features that best distinguish bullies and aggressors, thus, detecting bullying and aggressive behavior on Twitter. Later, Davison et al. [23] aimed to identify different types of cyberbullying on Twitter data via a multiclass classifier. Park and Fung [10] employed traditional supervised classifiers and neural network-based models to identify sexist and racist posts on Twitter. Chen et al. [11] aimed to find the best suited supervised classifier at detecting harassment in manually labeled social media comments from Twitter and Facebook. Most recently, Lee et al. [12] investigated the efficacy of traditional machine learning and neural networks-based models at detecting abusive language on a Twitter dataset. Hani et al. [13] extended the work of Reynolds et al. [24] at detecting cyberbullying events in text messages from Formspring.me by introducing a set of new classifiers. Such a group of works exposes the scientific efforts made to detect cyberbullying from textual data written in the English language.
However, the cyberbullying issue is common across countries and languages. In this sense, Ptaszynski et al. [25] developed a systematic approach upon machine learning techniques to automatically detect cyberbullying entries in the Japanese language. Van Hee et al. [26] trained an SVM classifier on a dataset of Dutch text messages collected from the Ask.fm social network to identify seven cyberbullying-related categories, thus, detecting cyberbullying events. Similarly, Del Vigna et al. [27] assessed the SVM and a neural network-based classifier on the task of hate speech recognition upon a manually annotated Italian corpus of Facebook. Özel et al. [28] considered a feature selection stage within the machine learning pipeline to detect cyberbullying in Turkish text messages using labeled data from Instagram and Twitter. Haidar et al. [29] presented a machine learning-based approach to detect cyberbullying in the Arabic language from Twitter textual data collected across the Middle East Region countries. Furthermore, Mouheb et al. [30] presented a real-time cyberbullying detection system in Twitter streams that classifies bullying messages according to the offensive strength. On the other hand, Bai et al. [31] focused on detecting offensive speech in German social media through a binary classification scheme that considers traditional supervised classifiers and neural network models. Most recently, Nurrahmi and Nurjanah [32] employed text processing and machine learning techniques to detect bullies from the automatic analysis of Twitter posts written in the Indonesian language. Also, Febriana and Budiarto [33] constructed a dataset of Twitter posts collected during the presidential election period in Indonesia to promote the detection of hateful speech and tested its usefulness by submitting it to a basic sentiment analysis model. Win [34] used the SVM algorithm on a set of textual data collected from Facebook in the Myanmar language to discriminate bullying messages.
In a different direction, some authors have addressed the cyberbullying detection task through the implementation of multilanguage cyberbullying detection platforms. For instance, Unsvåg and Gambäck [35] conducted experiments on Twitter text messages written in the English, Portuguese, and German languages to measure the effects of including Twitter users' features on the hate speech classification task. The authors observed that tweets with similar content written in different languages hinder the classifiers' performances. Pawar and Raje [36] modeled linguistic patterns upon a hand-labeled bilingual (Hindi and Marathi languages) dataset using Machine Learning and Natural Language Processing techniques to detect cyberbullying in Twitter and Internet forums. Moreover, Steimel et al. [37] experimented with a general cyberbullying detection model across multiple languages (English and German) with data collected from Twitter. Their findings showed that multilingual classifier optimization is not possible even in environments that use comparable datasets.
Despite the efforts to tackle cyberbullying detection in social media, the works aiming at detecting offensive behavior in the Spanish language are still scarce. For instance, Gómez-Adorno et al. [14] addressed the detection task as a binary classification problem, employing supervised Machine Learning models to detect aggressive tweets, a cyberbullying-related topic, in a Mexican-Spanish language dataset proposed in the 2018 edition of the MEX-A3T contest. Similarly, Molina-Gonzáles et al. [15] proposed an ensemble of supervised classifiers to identify offensive messages on the 2019 edition of MEX-A3T. Gutiérrez-Esparza et al. [16] developed a classification model to detect cyberbullying events (i.e., racism, violence based on sexual orientation, and violence against women) on a Mexican-Spanish textual dataset collected from Facebook. The authors highlight the participation of school professors and psychologists, with experience in evaluation and intervention in cases of bullying, during the annotation process. Finally, in a more recent study, López-Martínez et al. [17] proposed an online tool capable of detecting cyberbullying from tweets written in Spanish. The authors combined Open Source Intelligence tools with Natural Language Processing techniques to compile information from the victim's Twitter account and analyzed tweets from every follower.

III. METHODOLOGY
Currently, there exist several works focused on detecting cyberbullying in social media. However, the vast majority focuses on text analysis in the English language due to the availability of resources for text analysis, including textual datasets. The lack of works aiming for cyberbullying detection in other languages is primarily due to language variants and grammar complexity. Language variants are specific to a region and vary according to demographic and social factors, such as the appearance of words according to the dialect, idioms, and colloquialisms [16,38]. Grammar complexity, on the other hand, is attributed to morphology and syntax rules, such as gender and number derivations, verb conjugations, enclitic forms, superlatives, and diminutive suffixes, among others [39]. Therefore, it is paramount to consider both aspects when acquiring textual data intended to model cyberbullying in social media.
In this work, we propose the automatic detection of cyberbullying through the identification of its four categories in an analysis of Spanish tweets collected from Twitter users resident in Peru. Our method combines Natural Language Processing (NLP) and Machine Learning (ML) techniques to establish a correspondence between the users' tweets and the types of cyberbullying, namely, no harassment, direct harassment, hate speech, and sexual harassment [18,19]. A class label is assigned to a tweet according to the conventional four-stage classification scheme shown in Fig. 1: the Dataset Collection stage gathers a set of tweets from Peruvian Twitter users; the Pre-Processing stage improves the data quality by removing inconsistencies from the tweets; the Feature Extraction stage obtains a compact representation (x) of a tweet; finally, the Model Selection stage chooses the best-suited classifier to solve the automatic cyberbullying detection problem via a comparison of the classifiers' performances.

A. Dataset Collection
In this work, we have constructed and made publicly available a dataset consisting of a collection of 10,096 tweets in Spanish from comments and interactions between Peruvian Twitter users with the help of the Streaming API tool. We collected the dataset during August 2019 and January 2020 from users with an age range between 14 and 60 years old. To ensure class discriminability among tweets, we included common words, jargon used by Peruvian people, and offensive words during the tweet retrieval process. Furthermore, we have added a geographical delimitation filter after the tweet retrieval process to ensure that the collected tweets belong to Peruvian users only. The filter, which is part of the Streaming API tool, is composed of delimiting quadrants defined by the latitude and longitude coordinates of the different regions of Peru.
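The geographical delimitation step can be illustrated with a minimal sketch. It does not reproduce the Streaming API filter itself; it only shows the idea of keeping tweets whose coordinates fall inside delimiting quadrants. The single coarse bounding box, the `coords` field, and the decision to discard tweets without geodata are all assumptions made for illustration, since the paper does not list the actual quadrants.

```python
from typing import NamedTuple, Optional, Tuple


class BoundingBox(NamedTuple):
    """Rectangle in decimal degrees (west/south/east/north edges)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        return self.west <= lon <= self.east and self.south <= lat <= self.north


# ASSUMPTION: one coarse, approximate box around Peru. The paper used several
# region-level quadrants whose coordinates are not given.
PERU_BOXES = [BoundingBox(west=-81.4, south=-18.4, east=-68.6, north=0.0)]


def is_from_peru(coords: Optional[Tuple[float, float]]) -> bool:
    """Return True if a (longitude, latitude) pair falls inside any box."""
    if coords is None:  # ASSUMPTION: tweets without geodata are discarded
        return False
    lon, lat = coords
    return any(box.contains(lon, lat) for box in PERU_BOXES)


# Hypothetical tweets with (longitude, latitude) coordinates.
tweets = [
    {"text": "hola causa", "coords": (-77.04, -12.05)},  # around Lima
    {"text": "buenos días", "coords": (-3.70, 40.42)},   # around Madrid
    {"text": "sin ubicación", "coords": None},
]
peruvian = [t for t in tweets if is_from_peru(t["coords"])]
```

Using a list of boxes rather than a single one keeps the sketch close to the quadrant-based description: finer regional quadrants could be appended to `PERU_BOXES` without changing the filter.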
The collected tweets were labeled with the help of human participants, who were mostly undergraduate students from the last year of Psychology, Communications, and Law schools from different universities in Peru. The participants evaluated a set of twenty randomly selected tweets via a website specially created to guarantee anonymous sessions that do not reveal the participants' identities. In one session, a participant assigns a class label to each tweet from the set of twenty tweets according to the four cyberbullying categories. Moreover, we made the definitions of the cyberbullying categories available throughout the labeling process, and we also ensured that each tweet was evaluated by at least three different participants to avoid labeling conflicts [40].
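One way to turn several participants' annotations into a single label is a majority vote, sketched below. The paper only states that each tweet received at least three evaluations; the voting rule, the handling of ties, and the function name `aggregate_labels` are assumptions introduced here for illustration.

```python
from collections import Counter
from typing import List, Optional

CATEGORIES = (
    "no harassment",
    "direct harassment",
    "hate speech",
    "sexual harassment",
)


def aggregate_labels(votes: List[str], min_votes: int = 3) -> Optional[str]:
    """Return the majority label, or None if the tweet needs more review.

    ASSUMPTION: majority voting with ties left unresolved; the paper does
    not describe how disagreements were settled.
    """
    if len(votes) < min_votes:
        return None  # not enough evaluations yet
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # tie between two or more categories
    return top_label
```

Under this rule, two "hate speech" votes and one "direct harassment" vote yield "hate speech", while three different votes leave the tweet unlabeled until another participant evaluates it.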
Finally, after applying the region-based filtering and tweet labeling processes, we obtained a dataset comprised of 10,096 tweets, whose class distribution corresponds to 5122, 2127, 1000, and 1847 observations for the no harassment, direct harassment, hate speech, and sexual harassment categories, respectively.

B. Pre-Processing
In this stage, we performed a set of transformations over the original tweets in the dataset to enhance data quality and facilitate its processing for further analysis. In this sense, we first removed symbols, hashtags, mentions, digits, emoticons, and web links from the dataset. Then, we eliminated repetitive characters, using regular expressions, to correct spelling errors, except for the consecutive characters r, l, c, and e, because they represent single sound letters, e.g., "aburrido", "llamada", "acción", "reenviar". Then, we converted all the tweets to lowercase to standardize the data. After that, we applied a word tokenization technique over all the tweets to translate the Peruvian jargon to words with the closest meaning in the Spanish dictionary, e.g., "yapa" to "extra" or "monse" to "aburrido". Finally, we eliminated the stopwords, such as y, a, pero, que, tu, among others, because they often are irrelevant to the analysis of the tweets in further steps.
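The pre-processing steps described above can be sketched as a small pipeline built on regular expressions. The regular expressions, the whitespace tokenizer, and the function name `preprocess` are assumptions made for illustration; the jargon and stopword tables contain only the examples quoted in the text, not the full lists used in the study.

```python
import re
from typing import List

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_HASHTAG_RE = re.compile(r"[@#]\w+")
# Anything that is not a Spanish letter or whitespace: symbols, digits, emoticons.
NON_LETTER_RE = re.compile(r"[^a-zA-ZáéíóúüñÁÉÍÓÚÜÑ\s]")
REPEAT_RE = re.compile(r"(.)\1+", re.IGNORECASE)

KEEP_DOUBLE = set("rlce")  # letters that may legitimately appear doubled
JARGON = {"yapa": "extra", "monse": "aburrido"}  # examples from the text only
STOPWORDS = {"y", "a", "pero", "que", "tu"}      # examples from the text only


def _collapse(match: "re.Match[str]") -> str:
    """Shrink a run of one character to 1 copy (2 copies for r, l, c, e)."""
    char = match.group(1)
    return char * 2 if char.lower() in KEEP_DOUBLE else char


def preprocess(tweet: str) -> List[str]:
    text = URL_RE.sub(" ", tweet)              # web links
    text = MENTION_HASHTAG_RE.sub(" ", text)   # mentions and hashtags
    text = NON_LETTER_RE.sub(" ", text)        # symbols, digits, emoticons
    text = REPEAT_RE.sub(_collapse, text)      # repetitive characters
    text = text.lower()                        # lowercase
    tokens = text.split()                      # ASSUMPTION: whitespace tokenizer
    tokens = [JARGON.get(tok, tok) for tok in tokens]    # jargon translation
    return [tok for tok in tokens if tok not in STOPWORDS]


example = preprocess("@juan eres un MONSE y tontooo!!! 123 http://t.co/xyz #lima :)")
# example == ["eres", "un", "aburrido", "tonto"]
```

Note that the rule for r, l, c, and e keeps two copies of any run of those letters, so an elongated word such as "monseee" would become "monsee" and miss the jargon table; a dictionary lookup after collapsing would be one way to handle such cases.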

C. Feature Extraction
The feature extraction stage aims at establishing relationships between words in a tweet that might help discriminate the intent of abuse. Therefore, here, we used a set of techniques oriented to the semantic and syntactic analysis among words, whose objectives are to relate groups of words to establish the intention and context in which they were used. To perform the semantic analysis, we used stemming and lemmatization techniques implemented with a neutral Spanish dictionary in the Snowball Stemmer and spaCy tools, respectively. On the other hand, we based the syntactic analysis on the n-gram technique, specifically on its bi-gram and tri-gram variants, using the nltk library. It is worth mentioning that we applied these techniques before the stopwords removal in the pre-processing stage to maintain the context of the message, e.g., "no eres tonto" is different from "eres tonto". Subsequently, we used the TF-IDF statistical measure to obtain numerical representations of the tweets and the frequency of their words, allowing us to know the degree of importance of a feature. Specifically, we combined each of the four representation techniques with TF-IDF: the stemming and lemmatization semantic techniques, and the bi-gram and tri-gram syntactic techniques.
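To make the combination of n-grams and TF-IDF concrete, the following is a minimal sketch written with the Python standard library only. It is not the nltk-based implementation used in the study. The weighting formula, term frequency multiplied by log(N/df) without smoothing, is an assumption; libraries such as Scikit-Learn apply smoothed variants.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> List[NGram]:
    """Return all contiguous n-token windows of a tokenized tweet."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs: List[List[NGram]]) -> List[Dict[NGram, float]]:
    """Weight each term by (count / doc length) * log(N / document frequency).

    ASSUMPTION: plain, unsmoothed TF-IDF chosen for readability.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized tweets; the first two echo the example in the text.
tweets = [["no", "eres", "tonto"], ["eres", "tonto"], ["eres", "muy", "amable"]]
bigram_docs = [ngrams(tokens, 2) for tokens in tweets]
weights = tfidf(bigram_docs)
```

In this toy corpus the bi-gram ("no", "eres") occurs only in the first tweet, so it receives a higher weight there than ("eres", "tonto"), which is shared with the second tweet. This is how the negation in "no eres tonto" remains visible to a classifier.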

D. Model Selection
The purpose of the model selection stage is to select the best-suited classifier for detecting the four types of cyberbullying from tweets posted in the Peruvian Spanish language. Hence, we conducted a performance comparison among the most common supervised algorithms for text classification problems. Specifically, we trained Naive Bayes (NB), Multinomial Logistic Regression (MLR), and Random Forest (RF) classifiers, which are suitable when working with a large number of features [41][42][43]. We also trained a Support Vector Machine (SVM) classifier, which has proven to behave well in text classification tasks with small class samples [44]. Such models were implemented using the Scikit-Learn library for Python and were set to work with their default parameters.
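A comparison of this kind could be set up in Scikit-Learn roughly as follows. This is a sketch under stated assumptions, not the authors' code: the toy texts and labels are invented, only two of the four categories are used to keep the example short, and 2-fold cross-validation with the macro F1-score is chosen only because the toy set is tiny. The SVM is given a linear kernel because the discussion of the results mentions one, although the default kernel of `SVC` is RBF; the other classifiers keep their default parameters apart from a fixed random seed.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data (not taken from the paper's dataset).
texts = [
    "buen día amigos", "gracias por la ayuda amigos",
    "feliz cumpleaños amigo", "gracias amigo buen trabajo",
    "eres un tonto", "eres un inútil tonto",
    "nadie te quiere inútil", "cállate tonto inútil",
]
labels = ["no harassment"] * 4 + ["direct harassment"] * 4

classifiers = {
    "NB": MultinomialNB(),
    "MLR": LogisticRegression(),
    "SVM": SVC(kernel="linear"),  # linear kernel: see the note above
    "RF": RandomForestClassifier(random_state=0),  # seed fixed for repeatability
}

scores = {}
for name, clf in classifiers.items():
    # TF-IDF features feed each classifier inside one pipeline, so the
    # vectorizer is fitted on the training folds only.
    model = make_pipeline(TfidfVectorizer(), clf)
    fold_scores = cross_val_score(model, texts, labels, cv=2, scoring="f1_macro")
    scores[name] = fold_scores.mean()

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: macro F1 = {score:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline avoids leaking vocabulary statistics from the held-out fold into training, which matters when the scores are used to rank models.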

IV. RESULT ANALYSIS
In order to assess the performances of the four classification algorithms on the cyberbullying detection task over Twitter textual data written in the Peruvian Spanish language, we partitioned the dataset into training and testing sets in 70% and 30% proportions, respectively. Moreover, we included a data under-sampling scheme in our experiments to examine whether data balancing improves the classifiers' performances. Specifically, we randomly selected data from the majority classes to compensate for such imbalance. In this way, we evaluated the classifiers' performances based on a 10-fold cross-validation procedure over two datasets: an imbalanced dataset, which maintains the original class distribution, and a balanced dataset, which contains approximately four thousand observations equally distributed among the classes. Finally, we assessed the classifiers based on the average of the accuracy, precision, recall, and F1-score performance metrics. Next, we report and discuss the results obtained from this experimental procedure.
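The under-sampling scheme can be sketched as follows. With the reported class distribution (5122, 2127, 1000, and 1847 tweets), reducing every class to the size of the smallest one gives 4 × 1000 = 4000 observations, which matches the approximate size of the balanced dataset. The function name `undersample`, the fixed seed, and the placeholder texts are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str]  # (tweet text, class label)


def undersample(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Randomly keep as many samples per class as the smallest class has."""
    by_class: Dict[str, List[Sample]] = defaultdict(list)
    for text, label in samples:
        by_class[label].append((text, label))
    target = min(len(items) for items in by_class.values())
    rng = random.Random(seed)  # ASSUMPTION: fixed seed for repeatability
    balanced: List[Sample] = []
    for items in by_class.values():
        balanced.extend(rng.sample(items, target))  # sampling without replacement
    rng.shuffle(balanced)
    return balanced


# Placeholder tweets reproducing the class counts reported for the dataset.
CLASS_COUNTS = {
    "no harassment": 5122,
    "direct harassment": 2127,
    "hate speech": 1000,
    "sexual harassment": 1847,
}
samples = [
    (f"tweet {i} ({label})", label)
    for label, count in CLASS_COUNTS.items()
    for i in range(count)
]
balanced = undersample(samples)
balanced_counts = Counter(label for _, label in balanced)
```

Sampling without replacement discards majority-class tweets rather than duplicating minority-class ones, so the balanced dataset is smaller than the original but contains no repeated observations.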

A. Classifiers' Assessment on the Imbalanced Dataset
TABLE I summarizes the classifiers' performance scores on detecting cyberbullying in an imbalanced dataset. The performance metrics correspond to the average and the standard deviation (in parentheses below the average score) of the accuracy, precision, recall, and F1-score, respectively, for the semantic (Stemming and Lemmatization) and syntactic (Bi-grams and Tri-grams) data representation schemes combined with the TF-IDF.
In general, the results indicate that the classifiers using the semantic schemes to represent the textual data performed significantly better compared to their syntactic-based counterparts. We attribute this behavior to Spanish language properties, such as the use of proper nouns next to potentially relevant words. While semantic schemes for textual data representation consider the relevance of a word via its occurrence throughout the dataset, the syntactic schemes consider the appearance of compositions of words, which reduces their representativeness in the dataset.
Further analysis of the classifiers' performances based on semantic schemes reveals that the stemming-based classifiers performed slightly better than lemmatization-based classifiers; these differences in the results are due to the principles of the feature extraction techniques. Whereas stemming strips affixes, such as prefixes and suffixes, to obtain word roots, lemmatization transforms words into their dictionary form, which makes the classification of textual data a challenging task, especially in languages with complex morphology [45].
Despite the imbalanced nature of the dataset, a classifier-based analysis shows the SVM and RF models as the two best performing classifiers, with small differences in scores between them. Regarding the average scores for all the metrics, the SVM obtained the best scores in most of the evaluated cases, whereas the RF obtained the lowest standard deviation values. We attribute these behaviors to the classifiers' training characteristics. On the one hand, the SVM classifier defines a decision surface based on the most representative samples within the training set, which counteracts the imbalance. On the other hand, the bootstrap procedure of the RF classifier randomly selects a subset of training samples to build each tree within the forest; since the training subsets mostly differ among the trees in the forest, the model overcomes the imbalance in the dataset.

B. Classifiers' Assessment on the Balanced Dataset
Similar to TABLE I, TABLE II presents the classifiers' performance scores obtained from their execution on a balanced dataset. The results reinforce the classifiers' performance behavior elicited from the feature-based analysis on an imbalanced dataset.
In a classifier-based analysis, however, the results show that, in general, the SVM classifier performed better than the rest of the classifiers, closely followed by the RF classifier. We believe that this is due to the linear kernel used during the SVM training, which makes the SVM perform better in tasks with high-dimensional feature spaces [46], such as text classification for cyberbullying detection.

V. CONCLUSION
In this work, we have proposed a comparison of machine learning classifiers to detect cyberbullying in Twitter posts written in the Peruvian Spanish language. The classifiers were trained upon a set of text messages collected from Twitter users resident in Peru. Moreover, the dataset content was validated by matter-related participants, i.e., psychologists, sociologists, among others, through a web application. We conducted experiments over imbalanced and balanced versions of the dataset using feature extraction schemes, which involve the combination of semantic and syntactic techniques from the Natural Language Processing field.

The experimental analysis demonstrated that semantic-based schemes for text representation are better than syntactic-based schemes. Moreover, classifiers working upon stemming features were superior to those using lemmatization features. Furthermore, the Support Vector Machine classifier has shown a consistent performance among the feature extraction schemes despite the different performances shown by the classifiers in both datasets, obtaining superior results in the balanced dataset.
In our experiments, we relied on a pre-processing scheme based on traditional text processing techniques, such as the removal of repetitive characters, emoticons, stop words, and so on, to ease the classifiers' training. However, it would be interesting to assess the classifiers' performances over tweets that include emoticon characters, as they are often used to reinforce emotions in text messages.
Finally, in this work, we have translated common jargon in the Peruvian Spanish language to its dictionary equivalent so that it could be part of the training process. However, it would be interesting to include jargon in a pre-defined Spanish language lexicon and assess the classifiers' performances.