A Survey of Spam Detection Methods on Twitter

Twitter is one of the most popular social media platforms, with 313 million monthly active users who post 500 million tweets per day. This popularity attracts spammers, who use Twitter for malicious aims such as phishing legitimate users, spreading malicious software and advertisements through URLs shared within tweets, aggressively following and unfollowing legitimate users, hijacking trending topics to attract attention, and propagating pornography. In August 2014, Twitter revealed that 8.5% of its monthly active users, approximately 23 million accounts, had automatically contacted its servers for regular updates. Detecting spammers and filtering them from legitimate users is therefore necessary in order to provide a spam-free environment on Twitter. In this paper, the features used for Twitter spam detection are presented and their effectiveness is discussed. Twitter spam detection methods are also categorized and discussed with their pros and cons. The outdated features of Twitter that are commonly used by Twitter spam detection approaches are highlighted. Some new features of Twitter which, to the best of our knowledge, have not been mentioned by any other work are also presented.

Keywords—Twitter spam; spam detection; spam filtering


I. INTRODUCTION
Twitter is one of the most popular social media platforms. It provides a social network in which users post messages of up to 140 characters, each called a "tweet". Twitter lets users share messages about anything related to real life, including news, events, celebrities, and politics [1][2][3][4][5]. According to Twitter, it has 313 million monthly active users who post 500 million tweets per day, which equals approximately 350,000 tweets per minute [6][7][8]. Thanks to this huge social network, users are able to stay connected with the topics they are interested in. Twitter provides a list of the most talked-about topics at a given point in time, called "Trending Topics (TT)", to make users aware of the most popular topics on Twitter. A "hashtag" is a term starting with the "#" character that is commonly used to indicate the topic of a tweet and lets users track the topics they are interested in [9]. Thanks to its popularity and design, Twitter reflects noteworthy events in real time. This structure lets real-time search systems and meme-tracking services mine tweets to find out what is happening in the world with minimum delay [10,11]. Sentiment analysis services are able to draw conclusions about topics on Twitter, which turns Twitter into a real-time polling system [12][13][14][15][16]. The success of those services relies completely on filtering spammers from legitimate users. Consumers tend to use Twitter to learn the opinions of others about the products they are going to buy. Similarly, companies use Twitter to measure the satisfaction of their customers with their products [17][18][19][20][21].
However, this popularity and practicality also attract the attention of spammers. In April 2014, Twitter was flooded by an avalanche of malicious tweets sent by thousands of compromised user accounts [22]. In August 2014, Twitter revealed that 8.5% of its monthly active users, approximately 23 million accounts, had automatically contacted its servers for regular updates [23,24]. A report shows that 83% of the users of social networks have received at least one unwanted friend request or message [25]. Spam is most commonly defined as unsolicited content [26][27][28]. Spammers share links within their tweets in order to spread advertisements to generate sales, propagate pornography, direct users to malicious software, hijack trending topics for their own purposes, abuse the reply and mention functions to post unsolicited messages to legitimate users to attract their attention, and phish legitimate users [1,21][28][29][30][31][32][33][34][35][36][37]. According to a report by Statista, 80% of Twitter users access Twitter via their mobile devices [38]. Users who access Twitter via mobile devices should therefore be more careful about spam than users who access it via web browsers, since malicious software spread through spam may (1) collect an excessive amount of personal information such as user location, call history, SMS, bank account details, and calendar events, (2) access the data located in the device's memory or SD card, (3) send premium-rate SMS messages, (4) capture keystrokes by key logging, (5) make calls, and (6) detect the user's location via the Internet or GPS and share it [39][40][41][42][43][44][45]. Another issue is that, according to reports, users do not show as adequate an understanding of the threats on social media as they do of the threats on other platforms. Bilge et al. [46] report that 45% of users on a social media platform readily click on links posted by their "friends", even though they may not know that person in real life. Content-filtering approaches are not effective for Twitter since spammers tend to share shortened URLs in order to (1) overcome the character limit imposed by Twitter, and (2) evade spam filtering methods based on URL blacklisting [28,36][47][48][49][50][51][52].
The major contributions of this paper are as follows:
• Features of Twitter that can be used to detect spam are presented and their effectiveness is discussed.
• A comprehensive review of Twitter spam detection methods is given, with their pros and cons, in order to give a clear picture to researchers who are interested in spam detection on Twitter.
• New features of Twitter which can be used to detect spam and which, to the best of our knowledge, have not yet been used by any spam detection approach are presented.
• The outdated features of Twitter that are commonly used by spam detection approaches in the literature are presented.
The rest of the paper is structured as follows: Section 2 describes the background, including the features of Twitter and how Twitter deals with spam. Section 3 presents the features of Twitter spam detection. Section 4 presents the Twitter spam detection methods. Section 5 presents the discussion. Finally, Section 6 concludes the paper.

II. BACKGROUND
In this section, features of Twitter and the way Twitter deals with spam are presented.

A. Features of Twitter
Twitter lets accounts "follow" other accounts they are interested in. Unlike on other social media platforms, the relationship between users is unidirectional rather than bi-directional, which means that a user may not be following one of his followers. A user can "like" a tweet or "retweet (RT)" it, which means sharing that tweet with his "followers". The relationship between users on Twitter is presented in Fig. 1. Each user has a unique Twitter username, and users can post tweets that refer to others by adding their usernames prefixed with the "@" character, which is called a "mention" on Twitter. A user is immediately notified when he is mentioned or when one of his tweets is liked or retweeted. Another feature of Twitter lets users create public or private lists in order to organize their interests by grouping users whose interests are the same or similar [53][54][55]. Similarly, a user can manage the lists he owns by adding users to them or removing users from them. The lists a user has subscribed to are categorized as "subscribed to", while the lists to which a user has been added by their owners are categorized as "member of", as presented in Fig. 2.

B. How Twitter Deals with Spam
Twitter uses both manual and automated services to combat spammers in order to provide a spam-free environment. In the manual approach, Twitter lets users report spammers through the spammers' profile pages. Twitter provides the user interface presented in Fig. 3 for reporting an account by selecting a reason. Another way, commonly reported in the literature, is mentioning spammers to the official "@spam" account [28,29,37][56][57][58], but according to a recent report by Twitter, this method of reporting spam is outdated [30]. Also, Wang reports that this method is abused by both hoaxes and spam [29]. These manual approaches are labor-intensive and would not be enough to detect all spammers given hundreds of millions of users. Twitter considers various factors to determine what conduct constitutes spamming, such as (1) posting duplicate messages over multiple accounts or multiple duplicate messages on one account, (2) following or unfollowing a large number of accounts in a short time period, (3) having a large number of spam complaints filed against the account, (4) aggressively liking, following, and retweeting, (5) posting malicious links, (6) posting tweets which mainly consist of links instead of also posting personal updates, and (7) posting tweets that are unrelated to the trending topic they are posted to [59].

A. Account-based Features
Spammers can be detected by analyzing their Twitter accounts, which contain the features listed in Table 1. Since some of these features, such as biography, location, homepage, and creation date, are user-controlled, they are not useful for spam detection.

Feature              Description
Username             The unique identifier of the account                                            Yes
Biography            The biography of the account                                                    Yes
Profile photo        The profile photo of the account                                                Yes
Header photo         The header photo of the account which is displayed at the top of the profile
Theme color          The theme color choice of the account                                           Yes
Birth date           The birth date information of the account
Homepage             The website of the account                                                      Yes
Location             The location of the account                                                     Yes
Creation date        The date the account is created                                                 Yes
Number of tweets     Total number of tweets the account has                                          No
Number of likes      Total number of likes the account's tweets have
Number of retweets   Total number of retweets the account's tweets have
Number of lists      Total number of lists the account has                                           Yes

When the behaviors of spammers are analyzed within the scope of account-based features, the following facts are observed:
• Since spammers tend to follow many legitimate accounts in order to attract attention, their number of following is expected to be high compared to legitimate users.
• Since spammers are not followed by legitimate users, their number of followers is expected to be low compared to legitimate users.
• Since spammers' tweets are unsolicited, the number of likes and retweets their tweets receive is expected to be low compared to legitimate users.
• Since spammers tend to post many tweets to attract the attention of legitimate users, the number of tweets sent by the account is expected to be high compared to legitimate users.
• Spammers' tweets mostly contain links and hashtags to attract the attention of legitimate users.
• Since spammers' tweets are ignored by legitimate users, the number of replies and mentions spammers receive is expected to be low compared to legitimate users.
• Spammers tend to post the same or similar tweets from one or more controlled accounts.
• Unlike spammers, legitimate users tend to be added to lists, unless bots under a command and control (C&C) architecture add spammers to lists created intentionally in order to manipulate spam detection approaches.
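
The observations above can be turned into simple numeric features. The following is a minimal sketch, assuming the account counters are already available; the `Account` container, its field names, and the chosen ratios are hypothetical and are not taken from any of the surveyed works.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical container for the account counters discussed above."""
    followers: int
    following: int
    tweets: int
    likes_received: int
    retweets_received: int
    listed_in: int
    age_days: int


def account_features(acc: Account) -> dict:
    """Derive illustrative ratios; the formulas are assumptions, not a standard."""
    # max(..., 1) avoids division by zero for new or empty accounts.
    return {
        "followers_to_following": acc.followers / max(acc.following, 1),
        "tweets_per_day": acc.tweets / max(acc.age_days, 1),
        "engagement_per_tweet": (acc.likes_received + acc.retweets_received)
        / max(acc.tweets, 1),
        "listed_in": acc.listed_in,
    }


# An account that follows many users, posts heavily, and receives little engagement.
suspect = Account(followers=12, following=4800, tweets=9000,
                  likes_received=30, retweets_received=15,
                  listed_in=0, age_days=60)
features = account_features(suspect)
```

A low followers-to-following ratio combined with a high posting rate and low engagement matches the spammer profile described above, although, as noted, several of these counters can be manipulated.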

B. Tweet-based Features
Spammers tend to post many unsolicited tweets to legitimate users to attract attention. Spammers can be detected by analyzing their tweets. This is necessary to filter spam tweets from legitimate ones and to provide users with a spam-free environment, which is the aim of Twitter [60]. Each tweet contains the information listed in Table 2.

Feature              Description
Sender               The sender of the tweet                                   Yes
Mentions             The mention(s) used in the tweet                          Yes
Hashtags             The hashtag(s) used in the tweet                          Yes
Link                 The link used in the tweet                                Yes
Number of retweets   The number of retweets the tweet has                      No
Number of replies    The number of replies the tweet has received
Sent date            The date tweet is sent                                    Yes
Location             The detected location of the place the tweet is posted

When the behaviors of spammers are analyzed within the scope of tweet-based features, the following facts are observed:
• Spammers tend to use links to direct legitimate users to malicious destinations.
• Spammers tend to use many mentions to attract the attention of more legitimate users.
• Spammers tend to use many hashtags (especially trending ones) to reach more users.
• Since spammers' tweets are unsolicited, the number of likes and retweets their tweets receive is much lower compared to legitimate users.
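
As an illustration of how such tweet-based counts could be extracted from raw tweet text, a minimal sketch is given below. The regular expressions are simplifying assumptions and do not reproduce Twitter's own entity parsing; the function name `tweet_features` is hypothetical.

```python
import re

# Simplified patterns (assumptions): real tweet entity parsing is more involved.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def tweet_features(text: str) -> dict:
    """Count links, mentions, hashtags, characters, and digits in a tweet."""
    urls = URL_RE.findall(text)
    # Strip URLs first so that "#fragment" parts of a link are not counted as hashtags.
    stripped = URL_RE.sub(" ", text)
    return {
        "n_urls": len(urls),
        "n_mentions": len(MENTION_RE.findall(stripped)),
        "n_hashtags": len(HASHTAG_RE.findall(stripped)),
        "n_chars": len(text),
        "n_digits": sum(ch.isdigit() for ch in text),
    }


sample = "@alice @bob win a FREE phone #giveaway #win http://t.co/abc123"
sample_features = tweet_features(sample)
```

Counts of this kind are cheap to compute per tweet, which is consistent with tweet-based features being suitable for real-time filtering.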

C. Graph-based Features
Twitter is a network of users, tweets, and the relationships between them. This structure can be represented as a graph. In the graph model, users and tweets can be represented as nodes, and relationships can be represented as links between nodes. These relationships show how a tweet's sender and the mentioned users are connected to each other. Also, these relationships are clear indicators of legitimate conversations. By constructing a graph model to represent users and their relationships, the distance between the tweet's sender and the mentioned users can be calculated for spam analysis. Graph-based features are listed in Table 3.

Feature        Description
Distance       The length of the shortest path between users   No
Connectivity   The strength of the connection                   No

When the behaviors of spammers are analyzed within the scope of graph-based features, the following facts are observed:
• The distance between a spammer and a legitimate user is greater than the distance between two legitimate users.
• The connectivity between a spammer and a legitimate user is weaker than the connectivity between two legitimate users.
• Graph-based features provide the most robust performance in detecting spam and spammers since they are hard to manipulate and are not user-controlled.
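
A minimal sketch of the two graph-based features is given below, assuming the relationships are available as an adjacency mapping. Distance is computed with a breadth-first search. For connectivity, the number of shared neighbours is used here only as a crude stand-in; it is an assumption made for illustration and is not the connectivity measure used in the surveyed works. The toy graph and user names are hypothetical.

```python
from collections import deque


def distance(graph: dict, src: str, dst: str):
    """Length of the shortest path from src to dst (BFS), or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, depth = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == dst:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def shared_neighbours(graph: dict, a: str, b: str) -> int:
    """Crude connectivity proxy (assumption): neighbours the two users have in common."""
    return len(set(graph.get(a, ())) & set(graph.get(b, ())))


# Hypothetical toy graph; each key maps a user to the users it is linked to.
toy_graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob"},
    "dave": {"bob", "erin"},
    "erin": {"dave", "spammer"},
    "spammer": {"erin"},
}

d_friends = distance(toy_graph, "alice", "bob")
d_spam = distance(toy_graph, "spammer", "alice")
```

In this toy graph the two legitimate users are adjacent and share a neighbour, while the spammer is four hops away from the same user and shares none, which mirrors the first two observations above.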

IV. TWITTER SPAM DETECTION METHODS
In this section, Twitter spam detection methods in the literature are presented and discussed. The proposed methods are categorized as follows: (1) account-based spam detection methods, (2) tweet-based spam detection methods, (3) graph-based spam detection methods, and (4) hybrid spam detection methods.

A. Account-based Spam Detection Methods
Account-based spam detection methods are based on the features (or a combination of them) of a Twitter account, which are listed in Table 1. Lee et al. [61] propose a honeypot-based approach to detect spam on social media platforms. The features they consider for detecting spam are the longevity of the account on Twitter, the average number of tweets per day, the ratio of the number of following to the number of followers, the percentage of bi-directional friends, the ratio of the number of URLs in the 20 most recently posted tweets, the ratio of the number of unique URLs in the 20 most recently posted tweets, the ratio of the number of usernames in the 20 most recently posted tweets, and the ratio of the number of unique usernames in the 20 most recently posted tweets. Lin and Huang [62] propose a method to detect spam on Twitter on the basis of two features: (1) the URL rate, which is the ratio of the number of tweets with a URL to the total number of tweets, and (2) the interaction rate, which is the ratio of the number of interacting tweets to the total number of tweets. Gee and Hakson [58] propose a method based on account-based features such as the followers-to-following ratio, the ratio of the number of tweets to the account lifetime, the average time between posts, posting time variation, max idle hours, and link fraction. The limitation of this work is that it utilizes the manual way of reporting spam on Twitter, which is outdated as discussed before. Many Twitter spam detection methods use account-based features alongside other spam detection features in order to provide more robust spam detection; these are called "hybrid" spam detection methods in this paper.
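
The two rates used by Lin and Huang [62] can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the keys `has_url` and `is_interaction` are hypothetical, and what counts as an "interacting" tweet (here, a tweet flagged as a reply, mention, or retweet) is an assumption.

```python
def url_rate(tweets: list) -> float:
    """Share of tweets that contain at least one URL."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if t["has_url"]) / len(tweets)


def interaction_rate(tweets: list) -> float:
    """Share of tweets flagged as interacting with other users (assumed flag)."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if t["is_interaction"]) / len(tweets)


# Hypothetical timeline: mostly links, almost no interaction.
timeline = [
    {"has_url": True, "is_interaction": False},
    {"has_url": True, "is_interaction": False},
    {"has_url": True, "is_interaction": True},
    {"has_url": False, "is_interaction": False},
]
rates = (url_rate(timeline), interaction_rate(timeline))
```

A high URL rate together with a low interaction rate is the pattern these two features are meant to expose.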

B. Tweet-based Spam Detection Methods
Tweet-based spam detection methods are based on the features (or combinations of them) of a tweet, which are listed in Table 2. URL filtering approaches use static or dynamic crawlers to investigate newly observed URLs. They also use URL or domain blacklisting in order to detect suspicious URLs from a knowledge base. These approaches use several features such as URL and DNS information, URL redirections, and the landing website's source code (HTML). McGrath and Gupta [47] present a phishing detection method based on the lexical features of a URL. The features they consider for detecting phishing are the length of the URL and the domain name, the character composition of the domain name, the presence of brands in URLs, and the misuse of URL-aliasing and free web hosting services. Ma et al. [63] propose a method to detect malicious websites by analyzing their URLs. Alongside the lexical features of the URL, the features they use for detecting malicious websites include WHOIS properties such as the registrar of the website, the registrant of the website, and the date the website was registered, domain name properties such as the time-to-live (TTL) value of DNS records, and geographic properties such as the country to which the IP address belongs and the speed of the uplink connection. Prophiler [64] is a filter that uses static analysis techniques to detect the malicious content of a website. The features Prophiler considers are derived from (1) the HTML content of the website, such as the number of elements with a small area, the number of elements that contain suspicious content, the number of included URLs, and the number of known malicious patterns, (2) the associated JavaScript code, such as the keywords-to-words ratio, the number of long strings, the presence of decoding routines, the probability of shellcode presence, and the number of DOM-modifying functions, and (3) the corresponding URL, such as the number of suspicious URL patterns, the presence of subdomains or IP addresses in URLs, and the TTL value of the DNS A and NS records. Since Prophiler uses static analysis techniques, it is not able to detect malicious URLs embedded into dynamic content such as JavaScript, which is currently the most commonly used programming language [65,66], Flash, and Java applets. Methods based on dynamic analysis techniques [67][68][69][70] use virtual machines and automated web browsers such as Selenium for in-depth content analysis. Chhabra et al. [49] present a phishing detection method based on URL analysis. Their method is specially designed to be able to analyze shortened URLs, which are commonly used on Twitter to disguise spam tweets as discussed before. The features the proposed method uses for detecting phishing through a URL are the number of clicks, geographical spread, temporal spread, and web popularity. WarningBird [71] is a suspicious URL detection system for Twitter which investigates correlations of URL redirect chains. WarningBird uses 14 features to detect suspicious URLs, such as the length of the URL redirect chain, the number of different landing URLs, the relative number of different Twitter accounts, the similarity in the account creation dates, the similarity in the number of followers and following, the similarity in the follower-following ratio, and the similarity of tweets. Martinez-Romo and Araujo [72] propose a tweet-based spam detection method which focuses on the analysis of the language used in tweets. Specifically, the language models they use are (1) the language model of the tweets related to a trending topic, (2) the language model of the tweet, and (3) the language model of the page linked by the tweet. Similar to the account-based spam detection methods, many Twitter spam detection methods use tweet-based features alongside other spam detection features in order to provide more robust spam detection.
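
To make the notion of lexical URL features concrete, a minimal sketch using only the Python standard library is given below. The selected features are a small illustrative subset in the spirit of the lexical approaches described above, not the feature set of any specific work; the function names and the default brand list are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host: str) -> bool:
    """True if the host part is a literal IP address rather than a domain name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_url_features(url: str, brands=("paypal", "ebay")) -> dict:
    """Extract simple lexical features; the brand list is a hypothetical example."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Path and query only: a brand name here, outside the host, may hint at imitation.
    rest = (parts.path + "?" + parts.query).lower()
    return {
        "url_length": len(url),
        "host_length": len(host),
        "host_digits": sum(ch.isdigit() for ch in host),
        "host_hyphens": host.count("-"),
        "host_is_ip": host_is_ip(host),
        "brand_outside_host": any(b in rest for b in brands),
    }


example_url = "http://192.168.10.5/paypal/login.php"
url_features = lexical_url_features(example_url)
```

Because these features are computed from the URL string alone, they require no network access; for shortened URLs, however, they describe only the shortener's address unless the redirect chain is resolved first.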

C. Graph-based Spam Detection Methods
Graph-based spam detection methods are based on the graph-based features (or combinations of them) which are listed in Table 3. Song et al. [28] extract the distance and connectivity between the tweet's sender and the mentioned users. While distance is the length of the shortest path between the tweet's sender and the mentioned users, connectivity is the strength of the connection between users. Graph-based spam detection methods use graph data structures to model features of Twitter as nodes and edges. Graph data models are well suited to represent data where information about data interconnectivity or topology is at least as important as the data itself [73]. Thus, graphs are commonly used by social networks such as Facebook and Twitter [74][75][76][77][78][79][80][81], which are mostly built on users, topics, and bi-directional interactions. Although graph-based features provide the best performance in terms of accuracy and sensitivity in differentiating spammers from legitimate users, the other methods that use graph-based features are presented under hybrid spam detection methods since they combine them with other spam detection features.

D. Hybrid Spam Detection Methods
Hybrid spam detection methods use a combination of the spam detection methods described in the previous subsections in order to provide more robust spam detection which investigates the possibility of spam in a more comprehensive way. Stringhini et al. [51] propose an approach based on both account-based and tweet-based features, which are the ratio of the number of friend requests that the user sent to the number of friends she has, the ratio of the number of tweets which contain URLs to the total number of tweets the user has, the similarity of the tweets sent by the user, the number of tweets sent by the user, the number of friends the user has, and whether the account likely used a list of names to pick its friends. Gao et al. [82] propose a tweet-based spam detection approach based on the social degree of the tweet's sender, the history of interaction, the size of the cluster, the average time interval, the average number of URLs in tweets, and the number of unique URLs in tweets. Chen et al. [83] present a real-time spam detection method for Twitter based on 12 lightweight features which are extracted from a dataset that contains 6.5 million spam tweets. The features they consider for detecting spam on Twitter are the age of the account, the number of followers, the number of following, the number of likes the account received, the number of the account's lists, the number of tweets of the account, the number of retweets of the tweet, the number of hashtags used in the tweet, the number of users mentioned in the tweet, the number of URLs used in the tweet, the number of characters used in the tweet, and the number of digits used in the tweet.
Wang [29] proposes a hybrid Twitter spam detection method based on graph-based and tweet-based features. The graph-based features considered in the proposed method are the number of followers, the number of following, and a reputation score which is calculated as the ratio of the number of followers to the sum of the number of followers and the number of following. The tweet-based features considered in the proposed method are tweet similarity, the number of tweets which contain URLs in the most recent 20 tweets, the number of tweets which contain mentions in the most recent 20 tweets, and the number of tweets which contain hashtags. Yang et al. [84] propose a Twitter spam detection method based on a combination of graph-based, tweet-based, and account-based features. The proposed method uses more robust features, including the number of bi-directional links, the ratio of bi-directional links, betweenness centrality, and clustering coefficient, alongside tweet-based and account-based features such as the number of followers, the number of following, the number of tweets sent by the account, the age of the account, the ratio of the number of tweets that contain URLs, the ratio of the number of tweets that contain hashtags, the number of duplicate tweets, the ratio of spam words, the ratio of the number of tweets used to reply to others, and the ratio of the number of retweets. Benevenuto et al. [1] propose a hybrid spam detection method based on account-based features such as the number of followers, the number of following, the ratio of followers to following, the number of tweets sent by the account, the number of mentions the account received, the number of replies, and the ratio of tweets received from the account's followers. The tweet-based features of the proposed method are the number of words in each tweet, the number of URLs per word, the number of characters in each tweet, the number of hashtags in each tweet, the number of mentions in each tweet, the number of URLs in each tweet, and the number of times the tweet is retweeted.
Chu et al. [48] present a method to categorize Twitter accounts as human, bot, or cyborg which is based on both account-based and tweet-based features. The features they consider for categorizing a Twitter account as human, bot, or cyborg are the ratio of tweets that contain URLs, device makeup, the ratio of followers to friends, link safety, and whether the account is verified. Amleshwaram et al. [85] propose a hybrid Twitter spam detection method based on both account-based and tweet-based features. They categorize spammers into two groups: (1) user-centric, and (2) URL-centric. The features they consider for spam analysis are the number of unique mentions, unsolicited mentions, hijacking trends, intersection with famous trends, variance in tweet intervals (VaTi), variance in the number of tweets per unit time (VaTw), the ratio of VaTi and VaTw, tweet sources, duplicate URLs, duplicate domain names, IP/domain fluxing, the tweet's language dissimilarity, similarity between tweets, URL and tweet similarity, followers-to-following ratio, and the profile description's language dissimilarity. Chakraborty et al. [86] propose a hybrid method based on account-based and tweet-based features which uses some new features such as the spam score of the profile description, name, and screen name, the presence or absence of a profile image, and the average same-hashtag count. McCord and Chuah [9] present a hybrid method based on account-based and tweet-based features to facilitate spam detection. The features they use in the proposed method are the distribution of tweets over a 24-hour period, the number of URLs, the total number of replies/mentions in the 100 most recent tweets, the number of retweets in the 20-100 most recent tweets, and the total number of hashtags in the 100 most recent tweets. Wang et al. [87] propose a spam detection method based on account-based, tweet-based, natural language processing (NLP), and sentiment features. Some unique features they use for detecting spam are the length of the profile name, automatically or manually created sentiment lexicons, the number of exclamation marks, the number of question marks, maximum word length, mean word length, the number of capitalized words, the number of white spaces, and part-of-speech (POS) tags per tweet. An outline of the related works, including their methodologies, the categories their metrics are based on, and their accuracies, is given in Table 4.
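
Two of the features mentioned above lend themselves to a short worked example: the reputation score described for Wang [29], i.e., followers / (followers + following), and the variance in tweet intervals (VaTi) listed for Amleshwaram et al. [85]. The sketch below is an illustration under assumptions: the function names are hypothetical, timestamps are assumed to be given in seconds, and the use of the population variance is an assumption rather than a detail confirmed by the cited works.

```python
from statistics import pvariance


def reputation(followers: int, following: int) -> float:
    """Ratio of followers to the sum of followers and following."""
    total = followers + following
    return followers / total if total else 0.0


def interval_variance(timestamps: list) -> float:
    """Population variance of the gaps between consecutive tweets (assumed definition)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return pvariance(gaps) if gaps else 0.0


# An account with few followers that follows many users has a reputation near 0.
low_reputation = reputation(followers=10, following=990)

# Tweets posted exactly every 60 seconds yield zero interval variance.
regular_posting = interval_variance([0, 60, 120, 180])
irregular_posting = interval_variance([0, 10, 100])
```

Both values are cheap to compute from data that is available per account, which fits the lightweight character of the features used in the hybrid methods above.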

V. DISCUSSION
Spam detection on Twitter requires different approaches from traditional spam detection methods for email and the web since (1) spammers tend to use shortened URLs instead of the full form of the URL, and (2) Twitter is based on a huge and detailed network which is built on tweets, accounts, lists, moments, and the relationships between them. Thus, a more robust approach is required to detect spam on Twitter, one that considers the variety of legitimate users who may behave similarly to spammers under certain circumstances. Even Twitter itself produces false negatives (spammers which are classified as legitimate users), as it is reported that Twitter has recommended that a legitimate user follow bots instead of related accounts [88]. In this paper, the features of Twitter spam detection are presented and their effectiveness in detecting spam is discussed. Then, the works proposed in the literature are categorized into four groups: (1) account-based, (2) tweet-based, (3) graph-based, and (4) hybrid spam detection methods, which use a combination of the others.
Methods based on account-based features analyze an account by using account-related features, some of which can be manipulated by spammers, such as the number of following, the number of tweets sent by the account, the number of lists created by the account, the number of moments created by the account (a brand-new feature which, to the best of our knowledge, has not yet been used by any work in the literature [89][90][91]), the number of mentions the account received, the number of likes received by the account's tweets, and the number of retweets received by the account's tweets. Similarly, the number of followers, the ratio of the number of followers to the number of following, the ratio of the number of tweets liked by others, and the ratio of the number of tweets retweeted can also be slightly manipulated by using a group of bots. Bots use various tools to perform automated tasks such as following a user or sending a tweet. Some works investigate a number of the most recent tweets of an account in order to reveal whether the account posts spam tweets whose contents are almost identical to recently posted tweets, which is useful for detecting spam distributed by bots, that is, a set of accounts under a command and control (C&C) infrastructure. The number of lists the user is a member of can be considered a useful metric for detecting spammers since it is an obvious sign of the user's impact on others, but it is open to manipulation by creating fake lists and adding fake accounts under the C&C infrastructure to these lists. Account-based features are lightweight enough to be used for detecting spam in real time, which requires instant analysis, but they can be easily manipulated by spammers [37].
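
The idea of checking an account's most recent tweets for near-identical content can be sketched as follows. Jaccard similarity over lowercased word sets and the 0.8 threshold are assumptions chosen for illustration; the surveyed works use their own similarity measures.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the lowercased word sets of two tweets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def duplicate_ratio(tweets: list, threshold: float = 0.8) -> float:
    """Fraction of tweet pairs whose similarity reaches the (assumed) threshold."""
    pairs = list(combinations(tweets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) >= threshold for a, b in pairs) / len(pairs)


recent = [
    "Win a free phone now http://t.co/a",
    "win a FREE phone now http://t.co/a",
    "Lunch with friends today",
]
ratio = duplicate_ratio(recent)
```

Here one of the three tweet pairs is a near-duplicate, so the ratio is one third. The pairwise comparison is quadratic in the number of tweets, which is acceptable for a short window such as the 20 most recent tweets.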
Tweet-based spam detection methods use parts of a tweet such as mentions, hashtags, the number of likes the tweet received, the number of retweets the tweet received, the content of the tweet, lexical analysis of the tweet, the URL of the tweet, the location of the tweet, and the post date of the tweet. Since the most common way to spread spam is sharing a malicious URL [92], the URLs of tweets need to be inspected. Therefore, almost all Twitter spam detection methods inspect the URLs of tweets. The traditional ways to filter spam are based on IP blacklisting [93] and on domain and URL blacklisting [94]. Since spammers tend to use shortened URLs, traditional URL or IP blacklisting methods are not able to filter malicious URLs on Twitter. Also, Grier et al. [36] show that methods based on blacklisting are too slow to protect users since there is a delay before malicious URLs are included in the database. Similar to account-based features, tweet-based features are lightweight enough to be used for detecting spam in real time, which requires instant analysis.
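
The reason domain blacklisting fails on shortened URLs can be illustrated with a small sketch. To stay self-contained, the redirect chain is simulated with an in-memory table instead of real HTTP requests; all domains, the blacklist, and the redirect table are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist and redirect table (no network access is performed).
BLACKLIST = {"malware.example"}
REDIRECTS = {
    "http://short.example/x1": "http://hop.example/r?id=7",
    "http://hop.example/r?id=7": "http://malware.example/payload",
}


def host(url: str) -> str:
    return urlsplit(url).hostname or ""


def blocked_naive(url: str) -> bool:
    """Check only the host of the URL as it appears in the tweet."""
    return host(url) in BLACKLIST


def resolve(url: str, max_hops: int = 5) -> str:
    """Follow the simulated redirect chain up to max_hops steps."""
    for _ in range(max_hops):
        if url not in REDIRECTS:
            break
        url = REDIRECTS[url]
    return url


def blocked_after_resolve(url: str) -> bool:
    """Check the host of the landing URL instead of the shortened one."""
    return host(resolve(url)) in BLACKLIST


shortened = "http://short.example/x1"
naive_result = blocked_naive(shortened)
resolved_result = blocked_after_resolve(shortened)
```

The naive check sees only the shortener's domain and lets the link through, while the check on the resolved landing URL blocks it. Resolving redirects adds latency, and it does not remove the delay before a new malicious domain enters the blacklist.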
Graph-based spam detection methods use features of the relationships between the sender of a tweet and the users mentioned in it, such as connectivity and distance, to analyze how these accounts are connected to each other and to measure the strength of their connections in order to reveal the possibility of a spam connection. Graph-based features are hard to manipulate [21], unlike account-based and tweet-based features. However, extracting these features requires in-depth analysis of the huge and complex Twitter graph, which is time- and resource-intensive. Therefore, unlike account-based and tweet-based features, graph-based features are not lightweight enough for real-time spam detection. Another limitation of the graph-based approaches is that they assume that tweets which come from friends are benign regardless of their content [21], which is not valid when attackers steal the accounts of legitimate users for their malicious aims.

VI. CONCLUSION
Twitter is the most popular microblogging platform, and it provides an easy-to-use user experience thanks to its architecture. This popularity attracts the attention of spammers who post tweets to phish legitimate users by directing them to malicious websites through the URLs shared in tweets, spread malicious software and advertisements through URLs shared within tweets, aggressively follow and unfollow legitimate users, hijack trending topics to attract attention, and propagate pornography. In August 2014, Twitter revealed that 8.5% of its monthly active users, approximately 23 million accounts, had automatically contacted its servers for regular updates. Since Twitter has characteristics that distinguish it from email services and websites, traditional spam filtering methods are not able to detect spam on Twitter. Thus, a more robust spam detection approach which is specially designed for Twitter is needed. In order to provide a spam-free environment, the tweets of spammers need to be detected and filtered as well as their owners. In doing this, it is critical to reduce false positive detections in order to prevent legitimate users from being classified as spammers. In this paper, the features of Twitter spam detection and the approaches proposed in the literature are discussed with their advantages and disadvantages. Also, the outdated features of Twitter which are commonly used by Twitter spam detection approaches are highlighted. Some new features of Twitter which, to the best of our knowledge, have not been mentioned by any other work are also presented.

Fig. 2. The relationships between lists and users

Fig. 3. The user interface of Twitter which is used to report an account by selecting the reason

III. FEATURES OF TWITTER SPAM DETECTION
The features of Twitter spam detection are categorized as follows: (1) account-based features, (2) tweet-based features, and (3) the relationship between the tweet's sender and receiver. These categories form the basis of the features used by the related works in the literature. Each feature category is discussed in the following subsections.

TABLE IV. OUTLINE OF THE RELATED WORKS INCLUDING THEIR METHODOLOGIES, THE CATEGORIES THEIR METRICS ARE BASED ON, AND ACCURACIES