Inclusive Study of Fake News Detection for COVID-19 with New Dataset using Supervised Learning Algorithms

Covid-19 has imposed many bans and restrictions on news, individuals, and teams, and social networks have therefore become one of the most used platforms for sharing and disseminating news, which can be either fake or true. Detecting fake news has thus become imperative and has drawn the attention of researchers to develop approaches for understanding and classifying news content. This study focuses on the Twitter platform because it is one of the most used platforms for sharing and disseminating information among organizations, personalities, news agencies, and satellite stations. In this research, we attempt to improve the detection of fake news by employing supervised machine learning techniques on our newly developed dataset. Specifically, the proposed system categorizes fake news related to COVID-19 extracted from Twitter using four machine learning-based models: decision tree (DT), Naïve Bayes (NB), artificial neural network (ANN), and k-nearest neighbors (KNN) classifiers. The developed detection models were evaluated on our new dataset, which we extracted from Twitter in real time, using standard evaluation metrics: detection accuracy (ACC), F1-score (FSC), area under the curve (AUC), and Matthews correlation coefficient (MCC). In the first set of experiments, which employed the full dataset (i.e., 14,000 tweets), the DT-based detection model achieved the highest detection performance, scoring 99.0%, 96.0%, 98.0%, and 90.0% in ACC, FSC, AUC, and MCC, respectively. In the second set of experiments, which employed the small dataset (i.e., 700 tweets), the DT-based detection model again achieved the highest detection performance, scoring 89.5%, 89.5%, 93.0%, and 80.0% in ACC, FSC, AUC, and MCC, respectively. The results for all experiments were obtained using the best-selected features.


I. INTRODUCTION
Over the years, many researchers have tried to identify fake news spreading on social media. Fake news is a source of spam capable of influencing perception, knowledge, and measuring methods [1]. Fake news has the potential to reach individuals through social media, cause damage to the economy, and manipulate political outcomes. Fake news can be described as misinformation directed to deceive people [2]. In recent years, fake news has been shared on various social media platforms, for example by generating a health concern in order to obtain advertising revenue or other financial or political gain. When a particular news story is published, supporters of the news tend to share the complete information without any falsification. However, those whose opinions do not correspond to the information resort to sharing the same information with some modifications of their own. As a result, the distinction between real and fake news has gained the attention of organizations such as Facebook, Google, and Twitter. Many researchers are making sustained efforts to combat the spread of fake news. Understanding the language in news stories is difficult because different people understand language differently. That is why the same news can be considered real or fake by different groups of people. The spread of fake news on these platforms leads to a loss of credibility and to financial loss.
In 2019, a new virus called Covid-19 was reported in Wuhan, China; it has since spread to other parts of the world and has killed many people. At first, it was claimed that it was transmitted from animals to humans. Research and experiments to find an effective treatment for Covid-19 have become an urgent need. Covid-19 has opened the door to the spread of false news on social media platforms such as Twitter, Facebook, and Instagram, which has misled many users worldwide. Misleading information and news about the disease are shared on the Internet from various sources, some of which are not trustworthy. It is well known that spreading false news about Covid-19 on social media can contribute to stress and health anxiety and can have serious consequences for society's awareness of and reaction to vaccination against Covid-19, for example through misinformation about false treatments, anti-vaccination propaganda, and conspiracy theories.
With advancements in processing technology, machine learning models, and deep learning techniques, user intervention can be replaced by assigning pattern identification tasks to computers. On the other hand, very little research has been done on applying linguistic and deep learning techniques for accurate classification of fake news, although, among the research done, the accuracy achieved is high. This paper discusses the classification of fake news related to Covid-19 using Machine Learning Algorithms (MLA) and focuses on the news spread on the social media platform Twitter. This is done by enhancing the process of detecting fake news using machine learning algorithms such as Decision Tree (DT), Naïve Bayes (NB), K-Nearest Neighbors (KNN), and Artificial Neural Network (ANN).

A. Problem Statement
Fake news is a source of spam in people's lives that can affect their general perception and knowledge [1]. Fake news can be described as a specific type of misinformation sent and directed to deceive people [2]. In recent years, a new term, "electronic flies," has entered the scientific arena, especially with the massive development of digital media and communications, the spread of social media platforms, and their direct and alarming impact on the behavior of individuals. This is especially so when society is steered toward a planned direction by publishing news that serves a specific issue, which has led to new terms such as digital propaganda, digital war, digital armies, and electronic terms [3]. When a global news story is published, news sites and organizations race to share their coverage and stories on social media. Proponents tend to share the information in its complete authenticity without change. However, those whose opinions do not correspond to the published information, and who may be materially harmed by it, resort to sharing the same information with some modifications of their own. As a result, fully real news and fake news exist side by side, which confuses people's understanding of the truth. Hence, the distinction between real and fake news has received considerable research interest, and influential organizations such as Facebook, Google, and Twitter are making sustained efforts to combat the spread of fake stories. Since the start of the COVID-19 crisis, much false news has spread rapidly on social media about the disease, its symptoms, and the number of infections, as well as fake news about vaccines and their side effects. Detecting and distinguishing between real and fake news has posed a challenge to researchers regarding the accuracy of the results, the speed of obtaining them, and the stability of the technique used.

B. Research Contribution
The main contribution of this research is a model for detecting fake news on the Twitter platform using MLA and metadata (attributes of a Twitter account). The model used MLA to model the behavior of the members of the evaluation panels and to resolve the multiplexing between their judgments. This study contributed the following:
• Collecting correct and fake tweets and the corresponding metadata to create a dataset that will be publicly available to other researchers in the same field.
• Designing and developing an accurate model to detect fake Covid-19 news on Twitter using MLA and Twitter's metadata.

II. LITERATURE REVIEW
Due to the proliferation of large volumes of false content during the pandemic, the study of Covid-19-related misinformation became a popular area of research. Several methods were proposed to differentiate and verify real and fake news about Covid-19 from different datasets and resources. The authors of [4] used deep learning algorithms in their study. The proposed model was based on the tweet's text and other tweet features extracted online from Twitter, such as favorite count, retweet count, source, length, verified, the user URL, friend/followers count, statuses/followers count, and sentiment. The proposed method achieved an accuracy of 79% compared with SVM (72%) using Sheryl Mathias and Namrata Jagadeesh's dataset and the fake news data repository "FakeNewsNet." Recall reached 100% using RF, while the DT reached 94%. RF achieved a precision of 85% and an F1 score of 83%. In [5], the authors proposed a system for fake news detection consisting of two main categories: MLA and DNL. They used the FakeNewsNet dataset, which contains news content, social context, and spatiotemporal information, together with disasters, PolitiFact, and gossip police data, to identify fake news on social media. The performance results were as follows. The LSTM (two layers) on the disasters dataset achieved accuracy 98.6%, precision 98.55%, recall 98.6%, and F1-score 98.5%. The modified LSTM (one layer) obtained the best testing results on the disasters dataset (accuracy 86.74%, precision 86.98%, recall 86.74%, F1-score 86.6%).
On the PolitiFact dataset, the best testing results were obtained by the modified LSTM (two layers) (accuracy 83.93%, precision 86.66%, recall 83.93%, F1-score 83.31%). On the gossip police dataset, the best testing results were obtained by the modified LSTM (one layer) (accuracy 83.82%, precision 84.85%, recall 83.82%, F1-score 83.7%). In [6], the authors applied several NNs, LSTMs, ensemble methods, and attention mechanisms to detect fake news on Twitter and other media platforms. Their models for fake news classification are based on the sentiment analysis of users in social media. They used these architectures to detect patterns in their data, where a pattern can be anything such as unusual capitalization, random exclamations, or question marks. Various datasets were used for evaluation, such as the PolitiFact dataset, the FakeNewsNet dataset, and data gathered through Twitter's advanced search functionality. The results showed that the CNN + bidirectional LSTM achieved the highest accuracy, 88.78%. The detection performance was 73.29% for the CNN, 80.62% for the LSTM, 83.81% for the bidirectional LSTM, 88.78% for the CNN + bidirectional LSTM, and 57.58% for logistic regression.
In [2], the authors proposed a fake news tracker to identify false news and prevent its propagation. Deep learning models were used for classification, with an encoder consisting of a deep LSTM with two layers and 100 cells. The obtained accuracy was 63.3% on the PolitiFact dataset and 74.2% on the Buzzfeed dataset. In [7], the researchers used two MLAs: SVM and RF. With SVM, their best result was a precision of 50%, a recall of 30%, and an F1 score of 60%. RF, on the other hand, achieved a precision of 88%, a recall of 89%, and an F1 score of 89%. In [8], the contributors used supervised learning classification to train and test on manually and automatically annotated datasets to ensure annotation quality. The proposed method includes six different ML algorithms, four different features with each algorithm, and three pre-processing techniques. This method achieved an F1-score of 87.8% on the manually annotated corpus and 93.3% on the automatically annotated corpus. The highest precision values were obtained using the n-gram TF-IDF feature with the LR classifier: 87.8% and 93.4% on the manually and automatically annotated corpora, respectively.
On the other hand, in [9], the authors used six machine learning algorithms: NB, KNN, RF, C4.5, BN, and SVM. The training is based on a 10-fold cross-validation model. This study created a dataset of tweets collected using Twitter's streaming API over three months. The average cross-validation accuracy for C4.5 was up to 98%, followed by RF, which had an average accuracy of 97.4%. C4.5 outperformed all the other models. The Naive Bayes algorithm had the worst performance, with an average accuracy of 85.5%. In [10], the authors used three MLAs (NB, LR, and SVM) with two feature approaches: word embedding and word frequency. At the practical level, they collected one million Arabic tweets related to Covid-19 from the Twitter streaming API. This study found that ML classifiers can correctly identify fake news-related tweets with an accuracy of 84%. In [11], the authors found that J48 performed best on the BuzzFeed Political News dataset with an accuracy of 0.655, while Classification via Clustering (CVC) had the worst accuracy, 0.501. For the Random Political News dataset, the Sequential Minimal Optimization (SMO) algorithm had the highest accuracy among the twenty-three algorithms, 0.680.
In the same context, the authors of [12] discussed methods for detecting fake news using different sets of features extracted from the news text. One of the feature sets used was stylometric features, including the presence of uppercase letters and quoted content. Such features can be significant for detecting fake news and highlight the importance of the writing style of news. In addition, the extracted writeprints feature set contains content-specific, structural, linguistic, and syntax-based features. The model achieved an accuracy of 86% for stylometric features with a gradient boosting classifier. In [13], the authors used Bag-of-Words and TF-IDF features as well as syntactic and semantic-based features using Word2Vec and FastText. This method used two datasets for testing, and the results showed that the SVM model using TF-IDF obtained the best F1-score on both test sets: 92.21% on Testing Data 1 and 93.33% on Testing Data 2. In [14], the researchers tried to detect fake news using deep learning techniques such as LSTM, CNN, and BERT. The obtained accuracies were 91% for LSTM, 93% for CNN, and 98% for BERT. In [15], the authors used four MLA classifiers (LR, SVM, DT, and Gradient Boost) to perform binary classification to detect fake news and to benchmark the annotated dataset. They curated and released a manually annotated dataset of 10,700 social media posts and articles concerning Covid-19 news, and the best performance, an F1-score of 93.32%, was achieved with SVM.
Other notable models were presented in [16][17][18][19][20]. In [16], machine learning was utilized to detect fake news published through social media such as Twitter and Facebook. The ML algorithms used were NB, SVM, BERT fine-tuning, and SBERT. The experiments found that SVM achieved the best results with a validation F1 of 93.28, compared to 90.62 using NB, 80.88 using BERT, and 78.18 using the SBERT technique. In [17], the authors proposed a detection method to distinguish and verify fake news about Covid-19. This method achieved an accuracy of 92.07% with the DT classifier and 94.49% with the RF classifier. In [18], the authors proposed a model to classify news into different categories using SVM and TF-IDF. The classification precisions were 97.84% and 94.93% for the BBC and 20 Newsgroup datasets, respectively. In [19], the authors detected fake news about Covid-19 using a linear SVM, RF, LR, NB, and MLP. The evaluation was conducted using a large dataset containing 10,700 manually annotated social media posts and articles. The results showed that SVM achieved the best performance, with an accuracy of 95.70%; the accuracies were SVM 95.7%, RF 90.79%, LR 95.42%, NB 93.32%, and MLP 93.60%. In [20], the authors utilized an n-gram classifier to detect fake news. RF, DT, and SVM were evaluated with the TF-IDF feature extraction method. This method achieved an accuracy of 0.73 for SVM and 0.78 for passive-aggressive.
Moreover, in [21], the authors used an n-gram classifier to detect fake news. SVM was evaluated with the TF-IDF feature extraction method, and the accuracy achieved was 0.92. In [22], the authors used ten MLAs with seven feature extraction techniques to detect fake or real news. They tested their proposed classifier on 3,047,255 tweets concerning Covid-19. The best performance, achieved with the NN, DT, and LR classifiers, was 99.7%, 99.9%, and 99.8%, respectively. In [23], the authors utilized two fundamental ML classification techniques within the scope of text analytics. They identified common sentiments attached to the pandemic using Coronavirus (COVID-19) tweets and the R analytical software. As Covid-19 approached its peak in the United States, they applied descriptive textual analytics supported by text data visualization. The proposed method achieved an accuracy of 91% for long tweets with the Naïve Bayes method and an accuracy of 74% for shorter tweets. In [24], the study attempts to understand the rationale behind people's use of certain media, extended by an "altruism" motivation. The data were analyzed with Partial Least Squares (PLS) to determine the effects of six variables on the outcome of fake news. The researchers used Nigerian citizens as the study sample, and the dataset contained 385 samples used in the experiments. The study, which did not use machine learning techniques, showed that altruism is the most significant predictor of fake news sharing. Furthermore, in [25], the researchers collected 2.7M tweets posted by over 690k unique users. They noted that 18.66% of the tweets were posted by verified users (who constitute only 0.81% of the unique users). They collected 748k Arabic-language tweets in addition to the propagation networks of a subset of 65k tweets to enable research related to natural language processing, information retrieval, and social network analysis.
This method used the Twitter search API to retrieve the data daily between January 27, 2020, and March 31, 2020. The study did not apply any MLA and did not report any evaluation results. In addition, the authors of [26] collected a dataset containing 4072 news articles from Webhose.io regarding fake news about Covid-19. This method used linguistic features and conducted experiments with baseline classifiers, LSTM, and a dense layer. The proposed method's accuracy was between 70% and 80%.
In summary, the reviewed literature shows that researchers have focused on real/fake tweet detection using popular machine learning algorithms, and some researchers have used DL and NLP to discover the nature of tweets. Researchers have achieved excellent results through machine learning algorithms applied to natural language. But natural languages differ from each other in how they are understood, so a published tweet or news item may be true in a specific language and for people who know the details of that language, while the same item is misleading for people who do not understand the language in which it is published. On the other hand, results similar to those obtained using NLP can be obtained using machine learning algorithms with the metadata provided by the Twitter API. Note that the authors of [4] used common MLA and DL algorithms and reached excellent results using common machine learning algorithms and metadata. In our view, common machine learning algorithms are sufficient if their results are excellent compared to the results reached by researchers using DL algorithms. We use them in this study and compare them with the results of [4]. In this study, a model is proposed that uses machine learning algorithms and Twitter metadata to improve the detection of fake and real news by identifying the features that affect the accuracy of the results.

III. MATERIALS AND METHOD
This section presents the research methodology and the steps that were followed to achieve the goal and objectives of this research. The proposed approach is decomposed into data collection, feature selection, machine learning implementation (classification), and metric evaluation. Fig. 1 summarizes the steps of the proposed system.

A. Dataset Collection
This study collects a dataset using the Twitter API so that Twitter's metadata can be used as the features of the dataset. This is one of the most important contributions of this research, because the datasets available on the different platforms consist of the tweet and its status only (0/1, true/false) and do not provide the metadata needed for this study. To implement and train the proposed model, we need a labeled dataset; the tweets are therefore sent to medical bodies specializing in Covid-19 to determine whether each tweet is true or false. Such a dataset usually contains various forms of text, numbers, and language combinations, as well as retweets, hashtags, or tags. Our dataset is extracted from the social media giant Twitter and is used to distinguish fake news from real news after key features are selected from the metadata and after the dataset is evaluated by staff with medical backgrounds, so that appropriate features can be determined subsequently.

B. Feature Selection
Feature selection identifies the features that have a strong effect on the target, which improves accuracy and ensures that the algorithm's time is not wasted on uninformative features. Many feature selection methods are available in the literature, motivated by the abundance of data with hundreds of variables, which leads to high-dimensional data. Feature selection methods provide a way to improve prediction performance, reduce computation time, and better understand a dataset in machine learning or pattern recognition applications [27]. A feature can be defined as an individual measurable characteristic of the process being observed. Through a combination of features, any machine learning algorithm can perform classification. Feature selection aims to select a small subset of relevant features from the feature pool by removing irrelevant, redundant, or noisy features [28]. Common strategies for selecting features include: information gain, a univariate information filter applicable to classification [29]; minimum redundancy and maximum relevance, a multivariate information filter applicable to classification [30]; correlation, a univariate filter applicable to regression [31]; correlation-based feature selection (CFS), a multivariate filter applicable to classification and regression [31]; Fisher score, a univariate filter applicable to classification [28]; and spectral feature selection (SPEC) and Laplacian score (LS), univariate filters applicable to classification [30]. Based on our study and experiments using correlation-based feature selection (CFS), we noticed that excluding some features does not affect the accuracy of the results. For example, the gender and nationality of a news writer do not affect a human judgment of whether news is real or fake.
In this study, feature selection is based on correlation and ranking, as will be explained with an example in the next section. We examined each feature against the target "class," recorded all results, and compared the results with each other to select the best features, which were then used in our proposed model. Our dataset had thirty-five features before the medical panel validation (which will be discussed later).
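The correlation-and-ranking idea can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual procedure: it assumes Pearson correlation as the ranking score, and the feature names and values below are invented for the example (in pandas, `DataFrame.corr` would compute the same coefficients).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, target):
    """Return (name, |r|) pairs sorted from strongest to weakest correlation."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented toy values; the names only mirror the kind of metadata in the text.
features = {
    "num_likes": [5, 120, 3, 450, 8, 300],
    "author_verified": [0, 1, 0, 1, 0, 1],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
target = [0, 1, 0, 1, 0, 1]  # the "class" column: 1 = real, 0 = fake

ranking = rank_features(features, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Features at the bottom of such a ranking are candidates for exclusion; the cut-off is a design choice that this sketch does not prescribe.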

C. Machine Learning Implementation (Classification)

1) DT parameters:
• Minimum number of instances in leaves: the algorithm will not create a split that would put fewer than the specified number of training examples into any branch.
• Split subset: the algorithm does not split subsets that contain fewer than the given number of instances.
• Tree depth: limits the depth of the classification tree to the specified number of node levels.
• Majority (%): the algorithm stops splitting nodes once the specified majority threshold is reached.
• Binary tree: induce and build a binary tree (each split produces two child nodes).
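How these stopping parameters interact can be sketched as two small checks. This is an illustrative sketch only: the function names and default values are hypothetical and are not taken from the study or from any particular library.

```python
from collections import Counter

def may_split(labels, depth, *, min_split_size=5, max_depth=100, majority_pct=95.0):
    """Decide whether a node holding `labels` at `depth` may be split further.

    All default thresholds are hypothetical values chosen for illustration.
    """
    if len(labels) < min_split_size:   # subset too small to split
        return False
    if depth >= max_depth:             # depth limit reached
        return False
    majority = Counter(labels).most_common(1)[0][1]
    if 100.0 * majority / len(labels) >= majority_pct:  # node is pure enough
        return False
    return True

def split_is_valid(left_labels, right_labels, *, min_leaf_size=2):
    """Reject a binary split if either child would hold too few instances."""
    return len(left_labels) >= min_leaf_size and len(right_labels) >= min_leaf_size

print(may_split([0, 1] * 5, depth=0))       # mixed node, large enough
print(may_split([1] * 10, depth=0))         # majority threshold reached
print(split_is_valid([0], [1, 1, 1]))       # left child too small
```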

2) KNN parameters:
• Distance metric: the distance from a test observation to every observation in the training dataset is calculated, and the K nearest neighbors are then found. The distance can be calculated with, among others, "Euclidean," the straight-line distance between two points; "Manhattan," the sum of the absolute differences of all attributes; "Chebyshev," the greatest of the absolute differences between attributes; or "Mahalanobis," the distance between a point and a distribution.
• Weight: there are two types. With "Distance," the closest neighbors of a query point have a greater influence than neighbors further away; with "Uniform," all points in each neighborhood are weighted equally [32].
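The distance metrics and the two weighting schemes can be illustrated with a small pure-Python KNN. This is a sketch under stated assumptions: the training points are invented, "distance" weighting is taken to mean inverse-distance votes, and Mahalanobis distance is omitted because it needs a covariance estimate.

```python
import math
from collections import defaultdict

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train, query, k=3, metric=euclidean, weight="uniform"):
    """Classify `query` from a list of (point, label) pairs."""
    neighbors = sorted(train, key=lambda item: metric(item[0], query))[:k]
    votes = defaultdict(float)
    for point, label in neighbors:
        if weight == "distance":
            # closer neighbors count more; the epsilon avoids division by zero
            votes[label] += 1.0 / (metric(point, query) + 1e-9)
        else:
            votes[label] += 1.0
    return max(votes, key=votes.get)

# Invented toy data: one class-0 point near the query, two class-1 points far away.
train = [((0.0, 0.0), 0), ((3.0, 0.0), 1), ((0.0, 3.0), 1)]
query = (0.5, 0.0)

print(knn_predict(train, query, k=3, weight="uniform"))   # majority of the 3 neighbors
print(knn_predict(train, query, k=3, weight="distance"))  # the nearby point dominates
```

With uniform weights the two distant class-1 points outvote the single nearby class-0 point; with distance weights the nearby point wins, which is the behavior the "Distance" option describes.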

3) ANN parameters:
• Neurons: each element represents the number of neurons in the corresponding hidden layer; e.g., a neural network with three layers can be defined as 2, 3, 2.
• Activation: one of "Logistic," the logistic sigmoid function; "Identity," a no-op activation useful for implementing a linear bottleneck; "ReLU," the rectified linear unit function; or "Tanh," the hyperbolic tangent function.
• Regularization parameter alpha: default value 0.0001.
• Solver for weight optimization: "SGD" is stochastic gradient descent; "L-BFGS-B" is an optimizer in the family of quasi-Newton methods; "Adam" is a stochastic gradient-based optimizer. "Adam" works relatively well on datasets with thousands of training samples or more in terms of training time and validation score; for small datasets, however, "L-BFGS-B" can converge faster and perform better.
• Maximal number of iterations: 200 [32].
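The four activation options listed above can be written out directly. The sketch below is illustrative only; the `neuron_output` helper and its example weights are hypothetical and simply show where the activation is applied in a single hidden neuron.

```python
import math

def logistic(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def identity(x):
    return x

def relu(x):
    return max(0.0, x)

ACTIVATIONS = {
    "logistic": logistic,
    "identity": identity,
    "relu": relu,
    "tanh": math.tanh,
}

def neuron_output(inputs, weights, bias, activation="relu"):
    """Output of one hidden neuron: activation(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return ACTIVATIONS[activation](z)

# Hypothetical inputs and weights; the pre-activation value here is -1.0.
for name in ACTIVATIONS:
    print(name, neuron_output([1.0, 2.0], [0.5, -1.0], 0.5, activation=name))
```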

D. Evaluating Metrics
It is now well known that error rate is not an appropriate evaluation criterion when there are unequal costs. This paper uses the F-measure and AUC (area under the ROC curve) as performance evaluation measures.
1) F1-measure is the harmonic mean of precision and recall. It takes the contribution of both into account, so the higher the score, the better. As shown in equation 1, the F1-measure is calculated by multiplying the product of precision and recall by 2 and dividing the result by the sum of precision and recall.

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (1)$$

2) AUC has proved to be a reliable performance measure for imbalanced and cost-sensitive problems. Given a binary classification problem, a ROC curve depicts the performance of a method using (FP, TP) pairs, where FP is the false positive rate of the classifier and TP is the true positive rate. AUC is the area below the curve [33]. The calculation of the FP and TP rates is shown in equations 2 and 3, where TN and FN denote the numbers of true negatives and false negatives.

$$\text{FP rate} = \frac{FP}{FP + TN} \qquad (2)$$

$$\text{TP rate} = \frac{TP}{TP + FN} \qquad (3)$$

3) The confusion matrix is a table that displays the performance of a classification model on a dataset whose true values are already known. It is the best way to understand the behavior of the technique and algorithm used, because it shows the statistics and the relationship between the expected and the predicted results, as shown in Table I.

4) Precision is an evaluation metric measuring the percentage of truly positive cases among the cases predicted as positive. Equation 4 shows how to calculate precision [34].

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (4)$$

5) Recall is the fraction of the relevant (positive) cases that have been successfully retrieved. Recall is calculated as shown in equation 5 [34].

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (5)$$

7) Matthews correlation coefficient (MCC) is a statistical measure of the strength of the relationship between the predicted and the actual classes. Its values range from -1 to 1: a value of 0 shows no relationship between the two, a value of -1 shows a perfect negative correlation, and a value of 1 shows a perfect positive correlation; a value greater than 1 or less than -1 means that there was an error in the calculation [35]. MCC is associated with the F1-measure: as the F1-measure rises, MCC tends to rise as well. MCC is calculated as shown in equation 7.

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (7)$$
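The metrics in this section can all be computed from the four confusion-matrix counts. The following sketch uses the standard definitions; the counts in the example are invented and do not come from the study's experiments.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1, FP rate, and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,   # identical to the TP rate
        "f1": f1,
        "fp_rate": fp_rate,
        "mcc": mcc,
    }

# Invented counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Each division is guarded so that an empty row or column of the confusion matrix yields 0.0 rather than an exception; that convention is a choice made for this sketch.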

IV. RESULTS AND DISCUSSION
This section presents five stages of the experimental setup: the experimental environment, dataset collection, dataset cleaning and pre-processing, dataset evaluation, and result analysis (feature selection and the accuracy results), followed by recommendations and future work.

A. Experimental Environment
The proposed approach is organized into several phases, where the outputs of one phase serve as the inputs of the following phase, and a phase begins only after the previous phase has been completed. All ML algorithms and computational techniques were run on a personal computer with an Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz, 8 GB of RAM, and a 512 GB SSD.

B. Dataset Collection
The dataset was obtained from the social media platform Twitter using the Orange application, ver. 3.30.2, with its text extraction extension. This program makes it easy to extract tweets once permission has been obtained from Twitter to use its tweets in scientific research. The operations (correlation/ranking, classification, and all evaluation results) were performed in Python version 3.8. Orange is a Python-based visual programming package for data mining, machine learning, and data analysis; data is presented visually, and the application supports classification and clustering. Table II shows the details for fetching tweets using Orange and the Twitter add-on; as explained earlier, Orange was used to fetch tweets using the secret key granted by Twitter.
We contacted Twitter to obtain an API key and a developer account to search for and collect real tweets from the original sources. This also provided access to the Twitter attributes used as features in this study. Table III displays the set of metadata (attributes) obtained from Twitter using the API and Python commands to retrieve the user objects directory, known as Twitter's metadata (for example, followers count, retweet count, likes, time, author-verified, username, and ID). This study focused on testing the metadata as features to find out which features achieved the best results in accuracy, F-measure, recall, precision, and MCC. We then collected 14,000 tweets, as shown in Table IV, and a sample of 675 tweets was taken for the study. The dataset will be made available to researchers for research purposes. The dataset was cleaned and prepared to obtain the features used in our proposed model, and the machine learning algorithms were then applied to it. Useless features (date, id_str, id, entities, user, longitude, latitude, user_truncated, place, user_producted, user_description), which have frequently repeated values, were excluded, and unwanted row headers were deleted from the dataset; an ML test must be run on the dataset after cleaning.

C. Cleaning Dataset and Pre-processing Setup
The data cleaning process started by removing empty rows, incomprehensible symbols, and useless attributes (date, time, language, latitude, longitude, in reply to, and location). Feature values were also changed from Boolean to numeric (true/false to 1/0).
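The cleaning steps described above can be sketched as follows. This is a minimal illustration on rows represented as dictionaries; the column names and sample values are hypothetical, and symbol removal is left out because the text does not specify which symbols were removed.

```python
def clean_rows(rows, drop_columns):
    """Drop empty rows and unwanted columns, and map Boolean values to 1/0."""
    cleaned = []
    for row in rows:
        # skip rows in which every value is missing or blank
        if all(value is None or str(value).strip() == "" for value in row.values()):
            continue
        new_row = {}
        for key, value in row.items():
            if key in drop_columns:
                continue
            if isinstance(value, bool):
                value = int(value)
            elif isinstance(value, str) and value.strip().lower() in ("true", "false"):
                value = 1 if value.strip().lower() == "true" else 0
            new_row[key] = value
        cleaned.append(new_row)
    return cleaned

# Hypothetical sample rows; the keys only resemble the attributes named in the text.
rows = [
    {"num_likes": 5, "author_verified": True, "date": "2021-01-01", "location": "X"},
    {"num_likes": None, "author_verified": None, "date": "", "location": " "},
    {"num_likes": 7, "author_verified": "False", "date": "2021-01-02", "location": ""},
]
cleaned = clean_rows(rows, drop_columns={"date", "location"})
print(cleaned)
```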

D. Dataset Evaluation
After the dataset was collected and revised, 675 tweets remained at the finalizing step, and the following steps were followed. Some personal tweets were excluded (for example, "… Pfizer's second graft dose was taken, and I am on my way to take a booster dose of grafts...") to reach 542 tweets. Next, the dataset of 542 tweets was sent to a medical panel, whose members have medical backgrounds, to validate the tweet content manually as fake or real news. The dataset sent consisted of two fields (Tweet, True/False rating); the medical evaluation of each tweet was then used as the "Class" of the dataset, with the other attributes as features (Num-Likes, Num-Retweet, Author Followers Count, Author Listed Count, Author Favorites Count, Author Friends Count, Author Statuses Count, and Author verified). The label 'True' is selected for a real tweet, and 'False' is selected for a fake tweet. The tweet content is used only for this labeling and is excluded before the data is entered into the classification algorithms; we also exclude any feature that contains text, such as the name or the description of the tweet author. We convert the value (true/false) to "0" for a false tweet and "1" for a correct tweet, and we combine it with the rest of the features (Num-Likes, Num-Retweet, Author Followers Count, Author Listed Count, Author Favorites Count, Author Friends Count, Author Statuses Count, and Author verified) to obtain the final dataset for the study.

E. Experimental Result Analysis
After collecting the datasets, the authors excluded the tweet text from the dataset and applied the proposed model to train and test machine learning algorithms on it, in the following steps. Importing Dataset: In this step, the dataset is imported and read using Python commands, and the dataset information is displayed, including the number of false tweets, represented by zero, and the number of correct tweets, represented by one, both stored in the Class attribute. The counts show that the dataset is almost balanced.
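The import and class-count check can be sketched as below. The CSV content is a hypothetical in-memory stand-in for the dataset file, and the balance ratio is only one possible way to express "almost balanced"; the paper does not define a threshold.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV content standing in for the real dataset file.
CSV_TEXT = """Num-Likes,Num-Retweet,Class
10,3,1
0,1,0
25,7,1
2,0,0
"""


def class_counts(csv_file):
    """Count how many rows carry each Class value (0 = fake, 1 = real)."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["Class"]) for row in reader)


counts = class_counts(io.StringIO(CSV_TEXT))
# Ratio of the smaller class to the larger one; 1.0 means perfectly balanced.
balance_ratio = min(counts.values()) / max(counts.values())
```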
Correlation: In this step, correlation was computed through Python commands to find the most useful features in terms of their relationship with one another. The dataset correlation identifies the features that most affect the accuracy of the results when machine learning algorithms are applied to the dataset. All features were checked against the class feature, as explained in the next step. Fig. 2 shows the most important features that affect the accuracy of the results.
Feature selection: We applied the proposed model in two phases. The first phase was to find the best class. Because the full dataset is huge (14,000 tweets), the medical team could not evaluate it, so we tested selected metadata features as the class. All features were checked, and the class we arrived at was author verification, number of likes, and retweets. The proposed model was applied to the dataset with the ML classifiers (DT, KNN, ANN, NB, LR, RF, SVM). The results were unreasonable, reaching 9.9% for the RF and DT classifiers, and the two classes, "0" for false tweets and "1" for correct tweets, were not balanced. These results made it clear that the data had to be rebalanced, which means either adding samples that do not exist or excluding samples, both of which affect the accuracy obtained. Random oversampling and random undersampling were then used to balance the dataset. Random oversampling, however, had the disadvantage of enlarging the sample with unrealistic values, which led to incorrect results, and random undersampling was also flawed because it deletes samples that may contain data affecting the results. The results are shown in Table V.
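To illustrate the two balancing strategies discussed in the first phase, here is a minimal standard-library sketch of random oversampling and random undersampling. It is an illustration under assumptions, not the authors' implementation: the function names and toy rows are hypothetical, and oversampling is done by duplicating existing minority rows.

```python
import random


def split_by_class(rows):
    """Group rows by their Class value."""
    groups = {}
    for row in rows:
        groups.setdefault(row["Class"], []).append(row)
    return groups


def random_oversample(rows, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest."""
    rng = random.Random(seed)
    groups = split_by_class(rows)
    target = max(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend(group + extra)
    return balanced


def random_undersample(rows, seed=0):
    """Randomly discard rows until every class matches the smallest."""
    rng = random.Random(seed)
    groups = split_by_class(rows)
    target = min(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


# Hypothetical imbalanced data: eight rows of class 0, two rows of class 1.
toy = [{"Num-Likes": i, "Class": 0} for i in range(8)]
toy += [{"Num-Likes": 100 + i, "Class": 1} for i in range(2)]

over = random_oversample(toy)
under = random_undersample(toy)
```

The sketch makes both drawbacks visible: `over` contains repeated copies of the two minority rows, and `under` throws away six of the eight majority rows.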
The second phase: A sample of 670 tweets was taken after cleaning the dataset, and the tweets were read to remove unnecessary ones. The revised final dataset contained 543 tweets, and the authors kept it as balanced as possible to avoid the problem of imbalanced data. The dataset was sent to the specialized medical authorities to evaluate the tweets, and the authors then applied the proposed model to it. Each feature was examined against the target "Class" one by one, and the results were recorded to find the best combination. Through this process, it was found that Num-Likes, Num-Retweet, Author Followers Count, and Author Listed Count had the greatest influence on accuracy and improved the results, whereas Author Favorites Count, Author Friends Count, Author Statuses Count, and Author verified reduced accuracy, as shown in Table VI and Table VII. The correlation and ranking method also helped to find the best-correlated features, which were used in the next step.
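The correlation-based ranking mentioned above can be sketched as follows. The sketch assumes Pearson correlation between each feature and the class, which is one common choice; the paper does not state which correlation measure was used, and the toy rows are hypothetical and do not reproduce the paper's results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sd_x * sd_y)


def rank_features(rows, target="Class"):
    """Rank features by absolute correlation with the target, strongest first."""
    labels = [row[target] for row in rows]
    scores = {
        name: pearson([row[name] for row in rows], labels)
        for name in rows[0]
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical rows, for illustration only.
toy_rows = [
    {"Author Followers Count": 100, "Author Statuses Count": 5, "Class": 0},
    {"Author Followers Count": 200, "Author Statuses Count": 9, "Class": 0},
    {"Author Followers Count": 3000, "Author Statuses Count": 4, "Class": 1},
    {"Author Followers Count": 4000, "Author Statuses Count": 8, "Class": 1},
]

ranking = rank_features(toy_rows)
```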
Training and Testing Dataset: The dataset was divided into 70% for training and 30% for testing using Python. The machine learning algorithms (DT, NB, KNN, NN) were used because they are the best classifiers for binary dataset attributes and are also easy and fast. The parameter settings for the classifiers were as follows: . The classifiers were applied using cross-validation with 3, 5, 10, and 20 folds. The best result was achieved with 20-fold cross-validation: Decision Tree (DT) and Naïve Bayes (NB) achieved the highest accuracy, 89.5%; K-Nearest Neighbors (KNN) achieved 88.9%; and the Neural Network (NN) achieved 82.1%, as shown in Table VIII. The proposed model is a classification based on metadata because of the advantages of using metadata (it can be represented directly and easily).
The AUC-ROC curves of the tested classification algorithms are shown in Fig. 3. The figure plots one ROC curve per algorithm, which allows the classification models tested during the study to be compared. Each curve shows the false positive rate on the x-axis (1 - specificity; the probability that the target = 1 when the true value = 0) versus the true positive rate on the y-axis (sensitivity; the probability that the target = 1 when the true value = 1). The closer a model's curve comes to the left boundary and then the upper boundary of the ROC area, the higher the accuracy of the classifier. Taking the costs of false positives and false negatives into account, the figure can be used to choose the optimal classifier and its threshold, which, as shown in the figure, is the Naïve Bayes classifier with the highest threshold.
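The 70/30 split and the k-fold cross-validation described above can be sketched with the standard library as follows. The function names, the fixed seed, and the use of 542 samples are illustrative assumptions; the paper does not say whether the split was shuffled or stratified.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


train, test = train_test_split(list(range(542)))

# Fold sizes for each fold count tried in the study.
fold_sizes = {
    k: [len(test_idx) for _, test_idx in k_fold_indices(542, k)]
    for k in (3, 5, 10, 20)
}
```

With 20 folds each test fold holds only 27 or 28 of the 542 samples, so the accuracy of a single fold is noisy and it is the average over all folds that is meaningful.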
Area Under the Curve (AUC): It is clear from the figure that the AUC of the Naïve Bayes (NB) ROC curve is higher than that of the KNN, NN, and DT ROC curves. Therefore, we can say that Naïve Bayes did a better job of classifying the positive class in the dataset.
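For readers who want to see how an ROC curve and its AUC are obtained from classifier scores, the following is a minimal sketch. The labels and scores are hypothetical, the trapezoidal rule is one standard way to compute the area, and both classes must be present in the labels.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle all samples sharing this score together so ties give one point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = real, 0 = fake) and predicted scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(toy_labels, toy_scores)
area = auc(curve)
```

Here eight of the nine (positive, negative) pairs are ranked correctly, so the area is 8/9; a classifier that ranks every positive above every negative would reach 1.0.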

F. Discussion and Summary
This research has presented a proposed model to detect fake news on the Twitter platform using machine learning algorithms. Applying the AI algorithms to our selected features to detect fake news on the Twitter platform shows the following. When the listed count and followers count increase, the value of the target class is "1", and when the listed count and followers count are below 2000, the target class is "0", allowing for the error rate. Table IX shows all results obtained for the evaluation/confusion matrix and MCC. The results in Table IX show the following:
KNN: The proposed model was able to:
• Predict 65 truthful tweets and 81 false tweets.
• Fail to predict seven truthful tweets and nine false tweets.
• Predict the correct news with a precision of 0.90.
• Predict the incorrect news with a recall of 0.87.
• MCC was 0.8, and this value near (+1) means perfect accuracy.
• The KNN evaluation result was the best.
DT: The proposed model was able to:
• Predict 63 truthful tweets and 81 false tweets.
• Fail to predict seven truthful tweets and 11 false tweets.
• Predict the correct news with a precision of 0.90.
• Predict the incorrect news with a recall of 0.85.
• MCC was 0.7, and this value near (+1) means good accuracy.
NB: The proposed model was able to:
• Predict 65 truthful tweets and 70 false tweets.
• Fail to predict 11 truthful tweets and nine false tweets.
• Predict the correct news with a precision of 0.85.
• Predict the incorrect news with a recall of 0.87.
ANN: The proposed model was able to:
• Predict 62 truthful tweets and 70 false tweets.
• Predict the correct news with a precision of 0.77.
• Predict the incorrect news with a recall of 0.83.
• MCC was 0.6, and this value was greater than 0.5 and less than +1; this means the ANN achieved the worst accuracy in the proposed model.
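The precision, recall, and MCC values listed above follow from the confusion-matrix counts. The sketch below recomputes them for the KNN counts given in the text. Treating the seven missed truthful tweets as false positives and the nine missed false tweets as false negatives is an assumption, chosen because it matches the reported precision of 0.90; the text does not state the mapping explicitly.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Standard metrics derived from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is defined as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# KNN counts from the text; the FP/FN assignment is an assumption (see above).
knn = confusion_metrics(tp=65, tn=81, fp=7, fn=9)
```

Under this assignment the precision is 65/72 and the recall is 65/74, and the MCC rounds to 0.80. Swapping the two error counts would exchange precision and recall but leave the MCC unchanged, because the MCC formula is symmetric in false positives and false negatives.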
In comparison with other approaches presented by other researchers, this approach presents the following: Mathias and Namrata Jagadeesh's dataset. Still, the researcher collected the dataset and labeled it as the proposed model requires; the stages were unique, and it was hard to pick the right tweets.
Finally, the researcher concluded that DT is the best classifier for enhancing the detection of fake news on Twitter. The attributes that most improved accuracy were author listed count, author follower count, number of tweets, and number of likes.