Bystander Detection: Automatic Labeling Techniques using Feature Selection and Machine Learning

—Hostile or aggressive behavior on an online platform by an individual or a group of people is termed cyberbullying. A bystander is one who sees or knows about such incidents of cyberbullying. Bystanders can take one of three roles: a defender, who intervenes and can mitigate the impact of bullying; an instigator, who abets the bully and can add to the victim's suffering; or an impartial onlooker, who remains neutral and observes the scenario without getting engaged. Studying the behavior of bystanders can help explain the scale and progression of bullying incidents. However, the lack of data hinders research in this area. Recently, a dataset of Twitter threads, CYBY23, containing main tweets and the replies of bystanders was published on Kaggle in Oct 2023. The dataset includes extracted features related to the toxicity and sensitivity of the main tweets and reply tweets. The authors engaged manual annotators to assign the labels of bystanders' roles. Manually labeling bystanders' roles is a labor-intensive task, which raises the need for an automatic labeling technique for identifying the bystander role. In this work, we aim to propose a highly efficient machine-learning model for the automatic labeling of bystanders. Initially, the dataset was re-sampled using SMOTE to balance it. Next, we experimented with 12 models using various feature engineering techniques. The best features were selected for further experimentation by removing highly correlated and less relevant features. The models were evaluated on the metrics of accuracy, precision, recall, and F1 score. We found that the Random Forest Classifier (RFC) with a certain set of features is the highest scorer among all 12 models. The RFC model was further tested against various splits of training and test sets. The highest results were achieved using a training set of 85% and a test set of 15%: 78.83% accuracy, 81.79% precision, 74.83% recall, and 79.45% F1 score.
The automatic labeling proposed in this work will help in scaling the dataset, which will be useful for further studies related to cyberbullying.


I. INTRODUCTION
With the emergence of technology in this digital era, the dynamics of human connection have changed. Social media platforms have evolved into incredible tools for connecting individuals from all over the world. However, while some individuals use them positively, others engage in harmful conduct on social media. The destructive phenomenon of cyberbullying has emerged as a result of the rise of social media platforms [1]. As our lives grow more entwined with the virtual domain, the frequency and consequences of cyberbullying have caught the interest of scholars, educators, and lawmakers.
Bullying is defined as a recurring pattern of hostile or aggressive behavior carried out by an individual or group that meets three criteria: repetition, intent to harm, and an imbalance of power [2]. The major actors engaged in bullying, irrespective of the circumstances in which it occurs, are the perpetrator (bully), the victim, and bystanders. Bystanders in the cyberbullying landscape might be considered passive witnesses, possibly including strangers, who are often lured into the online chaos. They have the potential to either perpetuate or mitigate the trauma of victims, and thus to make a positive impact in bullying situations. Victims feel less worried and disappointed when they are surrounded by compassionate peers. Bystanders are present during bullying occurrences 80% of the time, and when they react, the bullying stops within 10 seconds in 57% of cases.
Statistics highlight a harsh reality, emphasizing the importance of acknowledging and addressing cyberbullying. According to recent surveys, an enormous percentage of people of different ages have been victims of internet abuse. Moreover, the findings provide a comprehensive picture, emphasizing the frequency of cyberbullying. Many studies use Twitter as a data source to identify cyberbullying, as it is one of the most popular social networking sites and one where cyberbullying is prevalent because of its constant-conversation atmosphere, which allows users to openly express their emotions, thoughts, and opinions [3].
Children and teenagers are more familiar with the internet nowadays than ever before, and at younger ages. This pattern has given rise to the major concern of cyberbullying [4]. Cyberbullying has a significant impact on victims, both physically and psychologically. Bullying can cause depression, anxiety, loneliness, dejection, low self-esteem, anger, self-harming behavior, alcohol and drug usage, and engagement in violence or crime. Physical health suffers as well, resulting in headaches, sleeplessness, abdominal pain, eating disorders, and nausea. Cyberbullying has also shown long-term effects on victims, causing stress, continuous misery, sleep difficulties, and even appetite issues [5].

II. BACKGROUND AND RELATED WORK
To identify bullying, an annotation technique [6] was created to recognize textual aspects of cyberbullying, including posts by bullies and responses from victims and the audience. The fundamental goal of the research in [6] is to acquire an understanding of the language aspects of cyberbullying. This is accomplished in two stages, by gathering and annotating a dataset. In the first phase, a harmfulness score is calculated for each post to determine whether it is part of a cyberbullying incident. If it is, annotators divide the authors' roles into four categories: harasser, victim, bystander defender, and bystander assistant. Finally, a binary classifier is built for each fine-grained bullying category. Additional features, such as semantic information, were not explored in this research.
The study discovered that the spread of hatred from the primary posts to the replies significantly impacts how annotators identify a thread, frequently leading to reclassification as bullying rather than plain aggression [7], [8]. An examination of the entire thread assists annotators in understanding the intent behind the use of specific phrases, which may have different interpretations depending on the context [9]. This finding is consistent with earlier research emphasizing the impact of bystander behavior in online environments. Bystanders' reactions are socially influenced and can be shaped by their interactions with offensive comments, resulting in peer pressure and antisocial conduct. The study emphasizes the complex dynamics of online interactions, namely the involvement of bystanders in contributing to the overall classification of content as bullying [7], [8].
The work done in [10] focuses on two objectives: detecting cyberbullying as a binary classification problem and detecting participant roles as a multi-class classification problem. In simple terms, the focus is on evaluating the performance of models that classify whether a post is cyberbullying-related and, if it is, predict the author's role. However, there is a need for a more comprehensive and integrated approach that goes beyond individual posts to capture the dynamics of entire discussions in the context of cyberbullying.
The work in [11] contains two cyberbullying corpora, in Dutch and English. Both are manually annotated with bullying types and participant roles: harasser/bully, the individual who initiates the harassment; victim, the one who is harassed; bystander-assistant, someone who assists the harasser; and bystander-defender, a person who supports the victim. This dataset has a serious problem of class imbalance. As "Bystander-Assistant" was the minority class, it was merged with the "Harasser" class to reduce the skew. However, there was still a large amount of imbalance between the "Harasser", "Victim", and "Defender" classes, and between "Bullying" and "No Bullying", in both the English and Dutch corpora, which could negatively affect the machine learning results. Table II summarizes the related work in this area.
In conclusion, many datasets are available in the field of cyberbullying research on Twitter. Previous studies on cyberbullying detection on Twitter, as listed in Table I, relied on datasets labeled based on individual tweets, failing to capture the complexities of cyberbullying incidents. Labeling the roles of bystanders is a time-consuming job, especially when examining Twitter threads with a significant number of replies, as it demands a thread-by-thread approach, thereby creating a need to automate the labeling techniques.
The uniqueness of the dataset [12], [13], [14] used in this research is the inclusion of labels for bystanders' roles and the aggressiveness level of cyberbullying. Many of the existing datasets solely focus on labeling the main post, lacking information about the participants involved, such as bystanders. To the best of our knowledge, this dataset is different from the existing datasets. It contains 112 Twitter threads, each including the main post and the bystander replies to that post, totaling around 639 tweets. These threads are grouped by conversation ID. By incorporating efficient machine learning models on this dataset, better classification can be done, leading to a deeper understanding of real-world scenarios [13], [14].
The literature survey shows that there are not many Twitter datasets available where bystander roles in cyberbullying are classified. The dataset used here [12], [13], [14] contains multiple types of bystander roles: defender, instigator, impartial, or other. It also contains multi-class thread labels: bullying with high aggression, bullying with low aggression, or aggression without indication of bullying.
The rest of the paper is organized as follows: Section II-A presents the motivation and objectives of the proposed work. Section III explains the methodology of the research. Experiments with results and their analysis are discussed in Section IV, followed by conclusions and suggestions for future work in Section V.

A. Motivation
The risk of cyberbullying is increasing year by year due to increased access to technology, low-cost internet connections, and leaders enthusiastically pursuing and pushing the dream of "Digital India," making its assessment and prevention even more crucial. The vast majority of people now have access to the Internet. Children and teenagers are the most susceptible members, as they are driven into cyberspace before they are psychologically capable of making sense of it. According to Microsoft's Global Youth Online Behaviour Survey, India ranks third in cyberbullying, with 53% of respondents, primarily youngsters, admitting to having experienced online bullying, trailing only China and Singapore. Bystanders play an important role in dealing with cyberbullying situations, where they can change the dynamics of relationships. They can respond in three ways: by replicating the perpetrator's toxic behavior, by interfering with the toxic talk and standing up for the victim, or by just observing the unfolding events. However, the mechanisms of bystander behavior in cyberspace in response to hate speech are complex. This complication emerges because the presence of other internet users may reduce one's sense of obligation to intervene, in the expectation that someone else will do so. Bystanders in smaller groups, on the other hand, feel a larger need to intervene in cases of cyberbullying [17].
Most of the publicly available datasets do not emphasize any information related to bystander roles in cyberbullying. Considering the effect of bystanders, it is important to classify their roles. The motive is to explore and potentially implement automatic labeling techniques for the dataset CYBY23 [12]. The integration of automated labeling techniques helps to enhance the dataset's scalability and usability for future studies in cyberbullying research. The overarching goal is to contribute to the advancement of research in the field, offering insights that can foster a healthier online environment.

B. Objective
In this work, we aim to suggest a highly efficient technique for: 1) automated labeling of bystander roles in cyberbullying tweets, and 2) finding the most effective features extracted from the text of the tweets.
For the above objectives, we deploy several machine learning models and experiment with various pre-processing and feature selection techniques to discover the most efficient one among them.

III. METHODOLOGY AND PROPOSED MODEL
In this section, the methodology of our research work is described. A flow chart for the same is given in Fig. 1. The major steps are listed below:
1) Data Ingestion: The dataset, CYBY23, was downloaded from the Kaggle website [13], [12].
2) Data Pre-processing: Initially, the imbalance of the data was removed using the SMOTE technique [18]. Further, the data was pre-processed to make it suitable for machine learning models. The features of the main tweet were augmented with those of the reply tweets, and some unwanted features were removed. Categorical features were converted to numeric values.
3) Deployment of Machine Learning Models: Twelve machine-learning models [19] were deployed on the pre-processed data of bystanders. The parameters of all the models were tuned to give their best performance. The Pycaret library of Python was used for this purpose. The models were evaluated based on accuracy, precision, recall, and F1 score metrics.
4) Experiments with Feature Selection: Next, various combinations of feature sets were experimented with: toxicity features only (extracted from Perspective API), sensitivity features only (extracted from TextBlob), and combinations of these features. Further, highly correlated features and less relevant features were removed to judge the performance of the machine learning models.
Finally, the machine learning model with the best accuracy and F1 score was recommended for automatic labeling of the bystander's role. The automation of bystander role detection will help in the early detection of cyberbullying cases and reduce their number to a great extent.
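The hyper-parameter tuning in step 3 was done through PyCaret, whose tune_model() runs a randomized search internally. A minimal hand-rolled equivalent using scikit-learn's GridSearchCV might look as follows; the grid values and the synthetic stand-in data are illustrative assumptions, not the exact settings used in the experiments.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic 4-class data standing in for the 24-feature bystander dataset.
X, y = make_classification(n_samples=200, n_features=24, n_informative=8,
                           n_classes=4, random_state=0)

# Exhaustive search over a small, illustrative parameter grid,
# scored with macro-averaged F1 over 3-fold cross-validation.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=3, scoring="f1_macro",
)
grid.fit(X, y)
print(grid.best_params_)
```

The tuned estimator is then available as grid.best_estimator_ for the evaluation steps that follow.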
Each of the steps involved in the process is explained below in detail:

A. Dataset Description
The dataset related to bystanders was downloaded from Kaggle [12]. Alfurayj et al. [13] used the Twitter API to extract 1024 tweets from January 2022 to January 2023; 150 tweet threads were collected. Information such as the date of the tweet, tweet ID, screen name and user ID associated with the tweet, number of likes and retweets, and text of the tweet was downloaded. Religion, ethnicity, sarcasm, and racial orientation were among the keywords and hashtags used to crawl this information, which could lead to harassment remarks. A manual annotation process was used for the labeling of bystanders. Annotators followed the guidelines given in [20] and assessed the aggressiveness of individual tweets, identified bystander roles in replies, and made higher-level judgments about the overall aggressiveness of the thread after considering the main post, replies, and bystander roles. Following the annotation process, threads lacking agreement from at least five annotators were eliminated, reducing the tweets to 639. The dataset, meeting the criteria for a good dataset, contained a minimum of 10% to 20% bullying cases, with cyberbullying with high aggression representing only 11.6%. Instigators were notably high in both bullying categories. The investigation focused on bystander contagion risk, with a higher prevalence of instigators associated with instances of bullying, as evidenced by the dataset. The authors realized the need for automation of the annotation of bystanders' roles because of the labor-intensive nature of manual annotation, and hence a dataset, named CYBY23, was uploaded on the Kaggle website [12] for public use. The CYBY23 dataset had Twitter threads containing both the main posts and the replies from bystanders. Each tweet had its text along with certain general features. Further, they extracted toxicity features using Perspective API and sentiment features using TextBlob for each tweet. There were 639 tweets in the dataset with the labels of bystanders' roles (manually annotated).
So, the dataset CYBY23 [12] had six general features, namely tweet id, reply id, text, created at, favorite count, and retweet count, for each tweet. Six features were derived from Perspective API, namely Insult, Threat, Identity Attack, Profanity, Toxicity, and Severe Toxicity, and three features were derived from TextBlob, namely polarity, subjectivity, and sentiment. The feature 'class label' was assigned to the main tweet only, and the feature 'bystander role label' was assigned to the reply tweets only. Thus, the dataset had sixteen features for main tweets and fifteen features for reply tweets (see Table III).
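As an illustration of the TextBlob-derived features, the snippet below shows how a coarse ternary 'sentiment' label can be derived from a polarity score, which in practice comes from TextBlob(text).sentiment. The zero thresholds are an assumption for illustration; the dataset authors do not state the exact cut-offs they used.

```python
# The dataset's polarity and subjectivity come from TextBlob:
#   polarity, subjectivity = TextBlob(tweet_text).sentiment
# The ternary 'sentiment' feature is assumed here to be a simple
# threshold on polarity, with zero as the boundary.
def sentiment_label(polarity: float) -> str:
    """Map a TextBlob polarity in [-1, 1] to a coarse sentiment label."""
    if polarity > 0.0:
        return "positive"
    if polarity < 0.0:
        return "negative"
    return "neutral"

print(sentiment_label(0.35))  # a mildly positive reply -> positive
```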

B. Data Preprocessing
Certain pre-processing steps were applied to the CYBY23 dataset [12] before running the machine-learning models. They are listed below: 1) The feature 'bystander role label' had four string values, namely "This person agrees with the main post (instigator)", "This person disagrees with the main post (defender)", "This person is not taking any sides (impartial)", and "This person posted unrelated replies (Other)". These string values were converted to numeric values between 0 and 3. 2) To study the effect of the main tweet on the reply tweets, the features of the main tweet were concatenated with the features of each reply tweet, and a new dataset was created. The new dataset had seven general features, six toxicity-related features each for the reply tweet and the main tweet, three sentiment-related features each for the reply tweet and the main tweet, the feature 'class label' of the main tweet, and the feature 'bystander role label' of the reply tweet. Thus, the new dataset had 28 features. Names of main tweet features were suffixed with 'main'. Since main tweets were concatenated column-wise with the reply tweets, the number of rows reduced from 639 to 524.
3) The features tweet id, reply id, and created at were removed, as they were not required for the models. The new dataset thus had 25 features for all the tweets. 4) The feature 'text' was removed from the dataset, because the toxicity features (from Perspective API) and sentiment features (from TextBlob) had already been computed from it. Thus, the new dataset had 24 features for all the tweets.
After pre-processing, we obtained a dataset having 524 tweets and 24 features for each tweet (see Table IV). Out of these 24 features, 'bystander role label' was used as the target feature for all machine learning models.
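A miniature sketch of the column-wise concatenation and label encoding described above, using hypothetical column names (conv_id, Toxicity, etc.) rather than the dataset's exact headers:

```python
import pandas as pd

# Map the four bystander role strings to numeric labels 0-3
# (the ordering is an assumption for illustration).
ROLE_MAP = {
    "This person agrees with the main post (instigator)": 0,
    "This person disagrees with the main post (defender)": 1,
    "This person is not taking any sides (impartial)": 2,
    "This person posted unrelated replies (Other)": 3,
}

# One main tweet and two replies sharing a conversation id.
main = pd.DataFrame({"conv_id": [1], "Toxicity": [0.9], "ClassLabel": [2]})
replies = pd.DataFrame({
    "conv_id": [1, 1],
    "Toxicity": [0.2, 0.7],
    "bystander_role_label": [
        "This person disagrees with the main post (defender)",
        "This person agrees with the main post (instigator)",
    ],
})

# Join the main tweet's features onto each reply row; overlapping
# column names from the main tweet get the '_main' suffix.
merged = replies.merge(main, on="conv_id", suffixes=("", "_main"))
merged["bystander_role_label"] = merged["bystander_role_label"].map(ROLE_MAP)
print(merged[["Toxicity", "Toxicity_main", "bystander_role_label"]])
```

After this step each row is one reply tweet carrying both its own features and the main tweet's features, matching the 524-row layout described above.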

C. Model Development
In this work, we deployed different machine learning models [19] using the Pycaret library. A brief description of each of the models is given below: • AdaBoost Classifier (ADA): Adaptive Boosting is an ensemble classifier that benefits from training several weak classifiers and then combining their results, with more weight given to the classifiers that are more accurate.
• Decision Tree Classifier (DT): A flowchart-like tree structure where each internal node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf node holds a class label.
• Extra Trees Classifier (et): An ensemble machine learning method based on decision trees. The dataset sampling for each tree, when used, is done randomly without replacement, and a random subset of features is considered by each tree.
• Gradient Boosting Classifier (GBC): This classifier is an additive model of decision trees and is often employed for both regression and classification tasks.
• K Neighbors Classifier (KNN): A learning method that classifies a data point by the majority class among its nearest neighbors.
• Linear Discriminant Analysis (LDA): A method used to find a linear combination of features that best separates two or more classes in a dataset.
• Light Gradient Boosting Machine (LGBM) & Extreme Gradient Boosting (EGB): Both are gradient boosting frameworks that use tree-based learning algorithms.They are recognized for their efficiency and predictive accuracy.
• Logistic Regression (LR): A foundational statistical method to model the probability of a certain class or event based on one or multiple predictor features.
• Naive Bayes (NB): A probabilistic classifier based on Bayes' theorem that assumes independence between features.
• Random Forest Classifier (rf): An ensemble learning method that uses decision trees. Each decision tree is trained on a dataset drawn by bootstrap sampling. Majority voting is used to make the final prediction.
• Ridge Classifier (RC): A classification algorithm that employs L2 regularization. It can help prevent overfitting and often delivers better performance in scenarios with multicollinearity.
• SVM - Linear Kernel (SVM): A learning method that finds a hyperplane separating the classes such that it maximizes predictive accuracy while avoiding over-fitting.
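The paper deploys these models through PyCaret's compare_models(); a hand-rolled sketch of the same idea with scikit-learn, on synthetic 4-class data standing in for the 24-feature bystander dataset, could look like this. The three models shown are a subset of the twelve, and all shapes and settings are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 300 samples, 24 features, 4 bystander-role classes.
X, y = make_classification(n_samples=300, n_features=24, n_informative=10,
                           n_classes=4, random_state=0)

# Fit each candidate and record its mean cross-validated accuracy,
# mimicking what PyCaret's compare_models() tabulates.
models = {
    "rf": RandomForestClassifier(random_state=0),
    "dt": DecisionTreeClassifier(random_state=0),
    "ada": AdaBoostClassifier(random_state=0),
}
scores = {name: cross_val_score(m, X, y, cv=5, scoring="accuracy").mean()
          for name, m in models.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

In practice one would extend the dictionary with the remaining classifiers and also track precision, recall, and F1, as the paper does.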

D. Model Validation
The proposed model was validated using various feature selection and validation techniques: 1) experimenting on various types of features (toxicity-based, sentiment-based), 2) removal of highly correlated features, 3) removal of less significant features, and 4) tuning the parameters of the machine learning models. Model efficiency was analyzed after applying each of the techniques mentioned above.

IV. EXPERIMENTS, RESULTS AND ANALYSIS
This section presents the experimental setup, the results, and their analysis.

A. Platforms Used
We used Python in Jupyter Notebook and Google Colaboratory for running the experiments. The Pycaret library was used to run the machine learning models. Graphs were plotted using the Matplotlib and Pandas libraries.

B. Dataset
A pre-processed dataset (see Table IV), having 524 tweets and 24 features for each tweet, was used in further experiments.

1) Handling Imbalance of Dataset:
The class distribution of the dataset having 524 tweets is shown in Fig. 2 (a). High imbalance can be observed in the number of instances of the unique values of the target feature 'bystander role label'. Imbalance can be handled by undersampling the majority class or oversampling the minority class. However, undersampling risks losing important information. So, we used an oversampling technique, the Synthetic Minority Oversampling Technique (SMOTE) [18], to handle the imbalance. SMOTE generates synthetic samples for the minority class and creates a balanced dataset. Fig. 2 (b) depicts the balanced dataset with 912 data points.
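The core idea behind SMOTE can be sketched in a few lines of NumPy: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its minority-class nearest neighbours. The experiments themselves would use a full implementation such as imblearn.over_sampling.SMOTE; this shows only the interpolation step.

```python
import numpy as np

rng = np.random.default_rng(0)

def smote_point(x: np.ndarray, neighbour: np.ndarray) -> np.ndarray:
    """Interpolate one synthetic sample between x and a minority neighbour."""
    gap = rng.uniform(0.0, 1.0)  # random position along the segment
    return x + gap * (neighbour - x)

# Two nearby minority-class points in a 2-feature space (toy values).
x = np.array([0.2, 0.8])
nb = np.array([0.4, 0.6])
synthetic = smote_point(x, nb)  # lies on the segment between x and nb
```

Repeating this for many (sample, neighbour) pairs grows the minority classes until the label distribution is balanced, which is how the 524 rows become 912.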

C. Model Deployment using Various Feature Selection Techniques
We experimented with different feature selection techniques on various machine learning models. Pycaret was used to run all the models, and they were evaluated using accuracy, precision, recall, and F1 score metrics. The results of running all machine learning models using Pycaret are shown in Table V. The experiments and their results are mentioned below: • Case 1: Initially, we ran the experiments using only the toxicity features derived from Perspective API. Random Forest Classifier (rf), Gradient Boosting Classifier (gbc), Light Gradient Boosting Machine (lightgbm), and Extra Trees Classifier (et) performed best, each with an accuracy as well as an F1 score of approximately 72%.
• Case 3: Next, we experimented with both the toxicity features (mentioned in Case 1) and the sentiment features (mentioned in Case 2). With this feature set, an accuracy as well as an F1 score of approximately 75% was achieved with all four classifiers mentioned in Case 1 and Case 2, indicating that results are better when both feature types are used instead of only toxicity or only sentiment features.
• Case 4: From the feature set mentioned in Case 3, we computed the correlation coefficients among the features (see Fig. 3). We found that the feature Severe Toxicity main is highly correlated with Toxicity main. Similarly, the pairs Profanity and Toxicity, favorite count main and retweet count main, Toxicity and Insult, Severe Toxicity and Profanity, and Toxicity main and Insult main are highly correlated. Thus, we removed the features 'Severe Toxicity main', 'Profanity', 'Toxicity', 'favorite count main', and 'Toxicity main', and were left with 19 features. After removing the correlated features, the highest accuracy of 76% was achieved. Again, the same four classifiers, rf, gbc, lightgbm, and et, performed best. • Case 5: Next, we experimented with the importance of the features mentioned in Case 3. The features ClassLabel main, sentiment, sentiment main, and retweet count were ruled out (see Fig. 4) because of their low importance.
After removing the less important features, we checked the efficiency of our models (see Table V). Random Forest Classifier (rf) performed best, with 76% accuracy and 78% F1 score.
• Case 6: Further, we chose the feature set formed by removing both the highly correlated features and the less important features from those given in Case 3. Running all the machine learning models using Pycaret gave the results mentioned in Table V. We observe that Random Forest Classifier (rf) again performed best, with 77.6% accuracy and 79.8% F1 score.
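The correlation-based pruning of Case 4 can be sketched as follows. The 0.9 cut-off and the toy columns are assumptions for illustration, since the exact threshold used is not stated.

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one feature from every pair whose |Pearson r| exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)

df = pd.DataFrame({
    "Toxicity":        [0.10, 0.50, 0.90, 0.30],
    "Severe_Toxicity": [0.11, 0.52, 0.88, 0.31],  # near-duplicate of Toxicity
    "Threat":          [0.70, 0.10, 0.40, 0.90],
})
reduced = drop_correlated(df)
print(list(reduced.columns))  # Severe_Toxicity is dropped
```

Applied to the Case 3 feature set, this kind of pruning yields the 19-feature set described above.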
Fig. 5 compares the accuracy of all the classifier models for each of the feature-set cases discussed above. Fig. 6 presents the same comparison but plots only those classifiers that achieved more than 50% accuracy.
As discussed above for all six cases, and as evident from Fig. 5 and Fig. 6, Random Forest Classifier (rf) performs best in most of the cases. Moreover, the best result is achieved by the feature set formed by including both the toxicity and sentiment features and removing both the least significant and the highly correlated features.

D. Feature Importance
Before proceeding to further experimentation, we would like to give some observations related to the importance of features as depicted in Fig. 4. 1) The features ClassLabel main, sentiment, sentiment main, and retweet count have much lower importance than the other features (see Fig. 4). This indicates that the aggression level of the whole thread, denoted by ClassLabel main, has little impact on model performance, and that the sentiment of the reply tweet, the sentiment of the main tweet, and the number of retweets (retweet count) play very small roles. 2) The features Insult and Toxicity have the highest importance. Since they are highly correlated, one of them can be considered the important feature. 3) The feature Threat is almost equally important for the main tweet and the reply tweet. 4) The features Identity Attack, Profanity, Insult, Toxicity, Severe Toxicity, Polarity, and Sentiment of the main tweet have low importance compared to the corresponding features of the reply tweet, except for the Threat and Subjectivity features. 5) Comparing the sets of features based on Perspective API and TextBlob, we can observe that the features based on Perspective API have more importance.
Summarizing the observations, the top features are Toxicity, Identity Attack, Threat main, Profanity, and Threat.
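The importances plotted in Fig. 4 are the kind of scores a fitted Random Forest exposes through its feature_importances_ attribute (mean decrease in impurity). A minimal sketch on synthetic data follows, with illustrative feature names standing in for the dataset's columns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic 5-feature data; the names are stand-ins, not real scores.
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           random_state=0)
names = ["Toxicity", "Identity_Attack", "Threat_main", "Profanity", "Threat"]

rf = RandomForestClassifier(random_state=0).fit(X, y)

# Pair each feature with its impurity-based importance and rank them,
# which is essentially what a bar chart like Fig. 4 visualizes.
ranked = sorted(zip(names, rf.feature_importances_),
                key=lambda t: t[1], reverse=True)
print(ranked)
```

Features at the bottom of such a ranking are the candidates for removal in Case 5.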

E. Different Train-test Split
Next, we experimented with different train-test splits to judge the performance of the Random Forest classifier. The data

Fig. 6. Comparison of accuracy for different feature sets.

TABLE I. PUBLICLY AVAILABLE DATASETS FOR CYBERBULLYING

TABLE II. CYBERBULLYING DETECTION, AND BULLYING TYPES

TABLE III. FEATURES OF ORIGINAL DATASET CYBY23

TABLE IV. FEATURES OF PRE-PROCESSED DATASET