A Computational Approach to Explore Extremist Ideologies in Daesh Discourse

This paper uses a computer-based frequency analysis to present an ideological discourse analysis of extremist ideologies in Daesh discourse. More specifically, using computer-assisted text analysis, the paper investigates the hidden extremist ideologies behind the discourse of the first issue of Rumiyah, one of the main digital publications of Daesh. The paper's main objectives are to expose hidden ideologies beyond the mere linguistic form of discourse, to offer a better linguistic understanding of the manipulative use of language in religious discourse, and to highlight the relevance of computer-based frequency analysis to discourse studies and corpus linguistics. The paper also employs van Dijk's ideological discourse analysis, adopting his positive self-presentation and negative other-presentation strategies. Findings reveal that Daesh discourse in Rumiyah is rhetorically structured to hide the manipulative ideologies of its users, which in turn functions to reformulate the social, political and religious attitudes of its readers.

Keywords: Computational linguistics; concordance; Daesh; frequency analysis; ideology; Rumiyah


I. INTRODUCTION
This study presents a computer-based frequency analysis to explore the extremist ideologies in the discourse of Daesh's Rumiyah. This computational linguistics treatment is based on both a frequency distribution analysis conducted with the concordance program and van Dijk's [1] ideological discourse analysis, adopting his positive self-presentation and negative other-presentation strategies. Such a targeted linguistic treatment is emphasized by Smith [2], who reports that the emergence of some extremist religious movements, with their intentional discursive attempts to maintain their manipulative ideology, paves the way for more counter linguistic analysis and opens new scopes of linguistic studies in the field of ideological discourse. This paper, therefore, attempts to investigate the extremist ideologies encoded in one of Daesh's publications: Rumiyah [3].

A. Significance of the Study
The significance of this paper lies in its attempt to demonstrate the relevance of applying a computer-based frequency analysis to the linguistic analysis of texts. It does so by shedding light on the extent to which computer software packages, such as concordance, can help researchers from different research domains arrive at concise and accurate results during the process of analysis. In the current study, concordance is applied to one of Daesh's publications (Rumiyah magazine) in order to uncover the hidden ideologies of its discourse. This might help readers of such a magazine resist the misleading information and deceptive tactics that depend on religious argumentation. The paper, therefore, might contribute to the field of computational linguistics, because it attempts to show the analytical integration of computer software programs, linguistics, and ideology.

B. Research Questions
This paper attempts to answer the following research questions:
1) To what extent does a computer-based frequency analysis help in the linguistic analysis of texts and talks?
2) What are the hidden ideologies encoded in the discourse of the selected magazine?
3) How can these ideologies be decoded by means of a computer-based frequency analysis?

C. Objectives of the Study
There are three objectives this study tries to achieve:
1) To explore the extent to which discourse is religionized to encode specific ideologies of its users.
2) To uncover the hidden extremist ideologies beyond the mere linguistic structures of Daesh's Rumiyah.
3) To prove the relevance of using computational analyses to discourse studies and corpus linguistics.
The rest of this study is organized as follows. Section 2 presents the literature review of the study, by shedding light on: (i) computational linguistics represented by concordance and its frequency distribution option, (ii) the notion of ideology within the scope of ideological discourse analysis, and (iii) a brief account of positive self-presentation and negative other-presentation strategies. Section 3 offers the methodology of the study, which comprises the approach of the study and data collection and description. Section 4 is devoted to data analysis. Section 5 is a conclusion and provides some recommendations for further research.

II. LITERATURE REVIEW
A. Computer-Based Frequency Analysis
The frequency analysis applied in this study is conducted with the computer software program concordance. Concordance is a computational program that processes different types of texts in order to reveal how frequently each word occurs in the text under analysis. Besides the number of occurrences of a word in a text, this program also provides information about the contextual environment of any specific word highlighted for the analysis [4]. In this regard, Sinclair postulates that concordance is "a collection of the occurrences of a word-form, each in its own textual environment. In its simplest form, it is an index. Each word-form is indexed, and a reference is given to the place of each occurrence in a text" [5]. Concordance further offers certain analytical clues derived from frequencies of words. These clues help analysts arrive at results pertaining to various analytical purposes. Among these clues are collocations that shed light on the contextual world of a specific word in text [6]. Concordance is therefore a valuable tool for linguistic analysis and contributes considerably to the analytical weight of textual studies.
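Sinclair's description of a concordance as an index can be illustrated with a minimal Python sketch. This is not the concordance program used in the study: the function names `tokenize` and `build_index`, the simple tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word-forms (hypothetical, simplified rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def build_index(text):
    """Map each word-form to the list of token positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokenize(text)):
        index[word].append(position)
    return dict(index)


# Invented sample sentence, not taken from the corpus.
sample = "The banner is raised. The battle continues, and the banner remains."
index = build_index(sample)
print(index["banner"])       # token positions of each occurrence
print(len(index["banner"]))  # frequency of the word-form
```

The length of each position list gives the frequency of a word-form, and the positions themselves are the "reference to the place of each occurrence" from which the textual environment can be retrieved.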
According to [7], the use of concordance in textual analysis has provided certain computational applications that deal with various analytical concepts, such as cataloguing, concordance, and the analysis of form, of content and of the syntactic nature of texts. Concordance can thus be applied to different types of linguistic analysis: syntactic, semantic, pragmatic, etc. Further, concordance is perceived as "a formatted version or display of all the occurrences or tokens of a particular type in a corpus" [8]. According to [8], the type is usually called "a keyword but is sometimes referred to as a target item, node word or search item".
The nature of concordance is emphasized by Hockey [9], who argues that concordance is analytically manifested when the contextual background (i.e., the words in its company) of any searched word is revealed in text. This is also supported by [8], who postulates that concordance allows its users to identify the contextual environment of words and to recognize the different senses of a word type. In this paper, concordance is used to provide a frequency distribution analysis of the selected data. Its application facilitates a much more thorough and comprehensive study than would otherwise be possible. Therefore, concordance here aims to provide one verifiable input: frequency distribution. This computational option offers the frequency of the searched word within its textual and contextual world.
Furthermore, concordance is perceived as an important tool in corpus linguistics [9]. This program provides text analysts with three search options: the first is the frequency list, which generates the frequency of each word in the corpus; the second is collocations, which identifies words that co-occur in text; and the third is keywords, wherein a searched word is shown together with the words that precede and follow it. For [9], the three options provide analytical support for text analysis. That is, they help in demonstrating both the number of occurrences of a specific searched word and its contextual environment.
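The three search options described above can be approximated in a few lines of Python. The sketch below is an illustration under stated assumptions, not a reproduction of the concordance program: the function names, the window sizes, and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplified rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(tokens):
    """Option 1: every word with its frequency, most frequent first."""
    return Counter(tokens).most_common()


def collocations(tokens, node, window=1):
    """Option 2: count the words found within `window` tokens of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            counts.update(tokens[start:i] + tokens[i + 1:end])
    return counts


def kwic(tokens, node, width=3):
    """Option 3: show `node` with up to `width` words on either side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Invented sample sentence, not taken from the corpus.
tokens = tokenize("The cat sat on the mat and the cat slept")
print(frequency_list(tokens)[:3])
print(collocations(tokens, "cat"))
print(kwic(tokens, "mat", width=2))
```

Dedicated concordancers typically add statistical association measures for collocations; the raw co-occurrence counts here are only the simplest possible stand-in.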

B. Ideology and Ideological Discourse Analysis
The study of ideology in discourse has been the focus of many linguists [1,10,11,12,13,14,15]. From a sociocognitive perspective, van Dijk [5] perceives ideology as a social and cognitive form that is shared by a particular group or party. He argues that ideology is constituted by a number of ideas and beliefs that are perceived as group beliefs, not personal ones. van Dijk [6] emphasizes that ideologies are close to what he calls "socially shared group knowledge", such as the specific knowledge shared by a number of individuals within the same institution or speech community. Ideologies are the driving force that shapes and reshapes the discursive practices among participants.
Ideologies "show a polarizing structure between US and THEM" [1], which indicates that they may be visualized as schemas of self-group. It can be assumed that each group is supposed to formulate its own conceptual schemata that frame its organizational patterns as well as its relationships with other groups. As such, ideologies constitute categories that abound in activities, norms, values and goals that are considered as dominant features that shape social positions and attitudes. These categories, therefore, are the main tenets that form the ideological schemata of any group in society. Such schemata help different group members defend their own interests and communicate their ideological beliefs [16,17,18].
Ideological discourse analysis is a discourse model which concentrates on investigating ideologies, their structures and representations in texts and talks. van Dijk [19] emphasizes that this model of discourse can be considered one form of sociopolitical analysis of discourse because it relates structures of discourse to structures of society. Ideological discourse analysis, according to van Dijk, is not only concerned with discovering the hidden ideologies in discourse, but also focuses on clarifying how structures of discourse are incorporated, intermingled and affected by the social structures of society.

C. Positive Self-Presentation and Negative Other-Presentation
The positive presentation of the in-group and the negative presentation of the out-group are among the most valuable ways of analyzing forms of ideological discourse. van Dijk [10] points out that these strategies function to expose the good of the in-group and the bad of the out-group. This can be conducted by ascribing good traits to US (i.e. the in-group members) via repetitions, association, and intensifying strategies, on the one hand, and by attributing bad qualities to THEM (i.e. the out-group members) through dissociation and downplaying strategies, on the other hand. Within these two processes, ideologies are formed, framed and expressed in texts and talks. The positive or negative representation of an individual, a group or a political party attempts to affect the cognitive background of the public and to reshape the ideological attitudes of individuals. The fundamental aim of such a process of positive/negative presentation is to emphasize all information that beautifies the in-group's image and strengthens its position, on the one hand, and all information that tarnishes its opponents' image and undermines their status, on the other.
According to van Dijk [20], the strategies used in the process of mollification or vilification are linguistically evidenced on the different linguistic levels of analysis: lexically, semantically, rhetorically, etc. On the lexical level, for example, the choice of specific lexis plays an effective role in the process of communicating and reflecting ideologies in discourse. Rhetorically, the employment of euphemistic or dysphemistic terms is also significant in conveying positive presentation and negative presentation. Crucially, these strategies always set a distinction between in-group and out-group members within different discourse settings.

III. METHODOLOGY
A. Approach of the Study
This paper uses a computer-based frequency analysis and an ideological discourse analysis in data analysis. This means that the integration between the two approaches will be shown throughout the stages of analysis in this study. This is conducted by analyzing the underpinning ideologies of the selected data, and then carrying out a frequency analysis by means of concordance of specific words that are marked as indicative in each part of the ideological analysis of the magazine under investigation. Significantly, the use of concordance functions to help arrive at accurate results that support the whole linguistic analysis attempted in this paper, that is, to reveal the extremist ideologies in Rumiyah's discourse.

B. Data
The corpus of this study consists of the first issue of Rumiyah magazine, which was launched by Daesh's propaganda system and published in September 2016. Some extracts from the selected issue are highlighted and marked as linguistically indicative in the study of ideological discourse of religious extremism. According to McKernan [21], Rumiyah was first published in September 2016 to replace its predecessor, Dabiq. It is considered to be one of the effective propagandist tools used by Daesh to propagate its ideologies. The magazine is released in different languages, such as Arabic, English and Russian.

IV. ANALYSIS AND RESULTS
The analysis focuses on the manner through which Daesh's Rumiyah conveys its intended ideologies. In this regard, some discursive strategies, including positive and negative lexicalization and relational values of words, including mood and modality have been skillfully employed to reveal Daesh's extremist ideologies.

A. Positive and Negative Lexicalization
Lexical items are always carriers of ideologies [22,23]. The extremist ideology of jihad is interpersonally reflected in Rumiyah's discourse via the employment of some positive and negative words. This ploy has been used to characterize the relationship between two different groups: Daesh and its opponents. Throughout the discourse of Rumiyah, a number of ideology-oriented vocabulary items have been used to describe each group. Consequently, words such as believers, muwahhidin, mujahidin, and martyrs are semantically opposed to disbelievers, mushrikin, murtaddin, tawaghit and kuffar. The same oppositional lexicalization is also conveyed on the phrase level. This is clearly shown in phrases such as righteous men, fighters for Allah's cause, persevering brothers and lions of the Ummah, which are contrasted with phrases such as enemies of Allah, people of falsehood, wicked scholars and Rafidi murtaddin.
Crucially, this diametrically opposed lexicalization is intended to reflect an in-group positive presentation and an out-group negative presentation, which is based on the choice of words and phrases that imply positive or negative evaluation. This also supports the idea that ideologies are encoded by means of lexis, that is, words are considered ideology carriers in discourse. The use of ideologically contested words in Daesh's Rumiyah, therefore, is highly indicative in two ways: first, it emphasizes the meaning of jihad as one of the main religious ideologies of Daesh, reflected by the use of words and phrases whose meanings connote the meaning of jihad, whether associatively or incompatibly; second, it demonstrates the interpersonal relationship between Daesh as a positively self-presented in-group and its opponents as a negatively other-presented out-group. Tables I and II show a computer-based frequency analysis giving the number of occurrences of positive and negative lexicalization in the selected issue of Rumiyah. Tables I and II clarify that some words are frequently used to describe Daesh's affiliates positively and its opponents negatively. The high frequency of words such as mujahidin, brothers and believers, on the positive side, and murtaddin, mushrikin, kafir and kuffar, on the negative side, indicates the conflicting way Daesh perceives its members and opponents. Indicatively, a simple look at the frequencies of each word in the above tables yields specific information about the general ideological atmosphere of Daesh's discourse in Rumiyah. This, in turn, serves to emphasize the importance of applying a computer-based frequency analysis to discourse studies and corpus linguistics.
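A frequency count over two opposed word lists, of the kind summarized in Tables I and II, could be sketched as follows. The word lists are the single-word items named in the preceding paragraphs; the function name `lexicalization_counts`, the whole-token matching rule, and the sample sentence are assumptions for illustration, and no figures from the actual tables are reproduced.

```python
import re
from collections import Counter

# Single-word items named in the analysis above.
IN_GROUP = {"believers", "muwahhidin", "mujahidin", "martyrs"}
OUT_GROUP = {"disbelievers", "mushrikin", "murtaddin", "tawaghit", "kuffar"}


def lexicalization_counts(text):
    """Return (positive, negative) dictionaries of word -> frequency.

    Matching is on whole lowercase tokens, so "disbelievers" is never
    counted as an occurrence of "believers".
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    positive = {w: counts[w] for w in sorted(IN_GROUP) if counts[w]}
    negative = {w: counts[w] for w in sorted(OUT_GROUP) if counts[w]}
    return positive, negative


# Invented sample text, not taken from the corpus.
sample = "Believers and martyrs are named; kuffar are named. Believers again."
positive, negative = lexicalization_counts(sample)
print(positive)
print(negative)
```

Multi-word phrases such as "enemies of Allah" would need phrase-level matching, which this sketch deliberately leaves out.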

B. Mood
Many studies discussed the notion of mood as a discursive device which is used to communicate ideologies in discourse [24,25,26,27]. This concept reflects the relational values between participants in discourse. Mood is realized in Daesh's Rumiyah through two lexico-grammatical patterns: the first is speech acts, which are manifested in the type of clause structure used in discourse, that is, the way of delivering the clause; declaratively, directively, or commissively. The second pattern is modality, which refers to all the non-propositional elements of a sentence, and is also demonstrated through two types of modality: truth modality and obligation modality.

C. Speech Acts
Using different speech acts in texts and talk is one way by which mood is conveyed in discourse [28,29,30]. Daesh uses three types of speech acts: the declarative, the directive and the commissive. The following extracts add clarification:
1) This is the way of the muwahhidin in every time and place. Whenever a generation of them passes, another generation follows, holding the banner of Tawhid overhead while plunging anew into the battle for Islam, which continues to be waged against shirk and its people. (Rumiyah, issue 1, p. 3)
2) By waging war against shirk and subjecting the people to the rule of the Lord of all creation, the greater injustice is eliminated. (ibid., p. 10)
3) The great gate of jihad with wealth is left wide open for the women who will make deals with their Lord, deals that will never end poorly. (ibid., p. 20)
The above extracts show different grammatical patterns of declarative sentences that revolve around the theme of jihad, which is delineated as the ultimate goal of Daesh. This is reflected by the use of expressions such as the battle for Islam, to be waged against shirk, waging war against shirk, and the great gate of jihad. These clauses explicitly emphasize the significance of jihad in Daesh's ideological agenda. A computer-based frequency analysis is conducted here to demonstrate the extent to which the declarative mood is used in the selected data. Table III shows that a computer-based frequency analysis helps in arriving at the total number of occurrences of a specific mood in the selected data. Significantly, without computational support, it would be difficult to reach such a credible and accurate result. Herein lies the importance of applying computer software programs in accessing and analyzing any type of data.
On the interpersonal level, Rumiyah's discourse abounds in discursive expressions that delineate the relationship between Daesh and its opponents. This is clearly shown through the dexterous employment of religiously-based words that are used to refer to two groups: the first group includes muwahhidin, Tawhid, Islam, jihad, rewarded and jannah, whereas the second group includes shirk, injustice, dhimmi kafir, tyrant, hostile, sinful and mushrikin. Again, this emphasizes the in-group/out-group polarization in Rumiyah's discourse: those who adhere to Daesh's religious ideologies (jihad and jama'ah) are members of the in-group, who are positively presented, and those who refuse Daesh's ideologies are out-group enemy members, who are negatively presented. Also noticeable are the different parts of speech used in the description of the in/out-group in the magazine under investigation. Some are used to describe an action, as is the case with the words rewarded, shedding and waging; others are employed to describe entities, such as muwahhidin and mushrikin; and a third group is used to describe a state, as in sinful and hostile. Consequently, Daesh communicates its religious ideologies interpersonally by making use of the declarative mood on the different levels of word classes: the verb, the noun and the adjective. Indicatively, the frequency analysis of such classes of words displays the extent to which both jihad and jama'ah are represented, either literally or associatively. The number of occurrences obtained from the frequency analysis adds significantly to the overall understanding of the hidden ideologies in the discourse of the selected magazine. Consider Tables IV and V. Another interpersonal observation in the discourse of Rumiyah lies in the grammatical use of the nominal gerund. Nominal gerunds are represented in holding, plunging, waging and shedding.
These gerunds function as nouns of non-finite clauses within the larger structures of their sentences. As such, they serve as subjects of the larger sentences and add a sense of continuity to action. The meaning of such gerund clauses, then, may be: you (Daesh's soldiers) should continue holding the banner of Tawhid, waging war against shirk, and shedding the blood of a non-dhimmi kafir because these things are not sinful and are rewarded with Jannah. The following frequency analysis displays the high frequency of nominal gerunds in Rumiyah. As indicated in Table VI, nominal gerunds occur 27 times in the selected data. Again such a result can precisely be arrived at by a computer frequency analysis.
A further observation relates to the addressee's gender, that is, it is not only male participants who are supposed to commit themselves to Daesh's jihad, but women are discursively addressed as another discourse participant to share the same ideology of jihad as well. In extract (3) above, women are instigated to carry out a specific type of jihad, that is, jihad with wealth. This type of jihad, as understood from its name, is based on giving money to the so-called Islamic State so as to be used in military operations. Here, a new dimension of Daesh's jihad is stated; it is a type of jihad that is no longer committed physically, but rather financially. To highlight this idea, concordance has a role to play, as is shown in Table VII. The frequency analysis of the words male and female in Table VII casts emphasis on revealing the meaning that both men and women are targeted by Daesh discourse. This result is strengthened by the frequency analysis conducted above.
The directive mood is interpersonally communicated through the use of imperatives that are employed to accentuate Daesh's religious ideologies. The discourse of Rumiyah delivers a strongly directed message of jihad through a number of imperatives. Using imperatives in discourse, according to [31] and [32], allows speakers to address their recipients clearly and directly and, thus, to exercise power over them. Here, Daesh attempts to create a direct communicative channel with its recipients through which it can communicate its religious ideologies. Obviously, the use of such imperatives constitutes the meaning of jihad both explicitly and implicitly. Verbs such as strike, scorch, kill, stab, shoot, poison, run down, and fight comprise a direct call towards violent action against Daesh's opponents, whereas verbs like stand, die, follow, and mobilize are indirect references to the same idea of jihad. The implicit jihad here is conveyed by the fact that acts of standing, dying, following, and mobilizing can only be realized by completing the course of Daesh's soldiers who died fighting their enemies. The frequency analysis in Table VIII shows that imperatives occur 24 times, all of which are employed to communicate the meaning of jihad in Rumiyah.
Another significant mood utilized in Daesh's Rumiyah to convey its religious ideologies is delivered commissively. Daesh's discourse presents clauses such as we will not rest from our jihad, their slaying will not harm the Islamic State, they shall shed many tears, we would have made effort to open the door… and this Ummah will be victorious, which carry a strong commitment to future actions expressed in the manner of vowing, threatening and promising. Daesh's commissive mood above is characterized by three things: first, all commissives above revolve around the meanings of jihad and jama'ah.
Both concepts have commissively been represented explicitly, through vowing and, implicitly, through threatening and promising; second, all commissives have been confirmed by the use of the truth modal will. This adds a sense of certitude and credibility to the pragmatic message of the commissive mood; and, third, the idea of jama'ah, which is indirectly communicated by the commissive this Ummah will be victorious, is preconditioned by the prepositional phrase through your sacrifices. This conditional promise aims to influence the attitudes of Daesh's soldiers and drive them to offer more sacrifices in order for the Ummah to be victorious. With the help of a computer-based frequency analysis, accurate results of the number of occurrences of commissives in the selected magazine are registered as is shown in Table IX.

D. Modality
A further device used to communicate ideologies is modality, which is discussed by many linguists in previous literature [e.g., 33,34,35,36]. Daesh employs two types of modality: obligation modality and truth modality. Both types aim to convey Daesh's jihad and jama'ah. Consider the following extracts:
1) This religion will remain established and will not be damaged by the death of any person. (Rumiyah, issue 1, p. 2)
2) A generation has been born in the Islamic state… that will not accept humiliation. (ibid., p. 37)
Daesh utilizes the truth modal 'will' in the clauses will remain, will not be damaged, and will not accept in the above extracts to communicate trustworthiness and to prove the validity of its arguments. Crucially, the use of truth modals reflects the degree of certitude which is often connected with the notion of authority a discourse participant practices over another (Yule, 1996). Here, by employing the truth modal will, Daesh tries to establish itself as having the discourse access of authority over its members, which makes it appear as a religion defender. This authoritative role is not only stated by the use of the truth modal will in extract (1) above, but also by the antonyms established and damaged. The meanings of the two words, however incompatible, remain complementary in confirming the concepts of jihad and jama'ah.
Another important type of modality employed in Daesh's Rumiyah is obligation modality. This type of modality is expressed by modals such as 'must' and 'shall', as in the following extracts:
1) Men shall continue to be employed by Allah to frustrate the kuffar. (Rumiyah, issue 1, p. 3)
2) Muslims currently living in Dar al-Kufr must be reminded that the blood of the disbelievers is halal. (ibid., p. 36)
The obligation modals shall in men shall continue and must in must be reminded reflect the power of Daesh over its members. This nonreciprocal relationship of power is employed in discourse situations where one powerful participant dominates another. Daesh uses this type of modality to impose its own ideology on its recipients. The use of the agentless passive in Muslims must be reminded in extract (2) above serves to "leave causality and agency unclear" [8]. As such, this grammatical feature can be said to have an experiential value in the sense that it leaves the responsibility of 'reminding' Muslims unspecified. Consequently, people all over the world who believe in Daesh's ideological agenda are responsible for reminding Muslims that the blood of the disbelievers is halal. In Islamic traditions, the word halal carries the speech function of permission and, therefore, affirms the associative meaning of Daesh's jihad. Tables X and XI present the frequencies of "will", "must" and "shall" in Rumiyah. Tables X and XI clarify that the truth modal "will" and the obligation modals "must" and "shall" have total frequencies of 71, 6 and 9, respectively. Only 25, 4 and 2 of these occurrences are indicative in conveying the concepts of jihad and jama'ah. Significantly, the use of a computer-based frequency analysis helps in distinguishing the indicative from the non-indicative occurrences of the modals in the above tables.
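The totals and indicative counts reported for the three modals can be turned into proportions with a short calculation. The sketch below uses only the figures stated above (71, 6 and 9 total occurrences; 25, 4 and 2 indicative ones); the variable and function names are hypothetical, and the sketch assumes that deciding which occurrences are indicative is an interpretive step taken on the concordance lines rather than something computed automatically.

```python
# (total occurrences, occurrences judged indicative), as reported in the text
MODAL_COUNTS = {"will": (71, 25), "must": (6, 4), "shall": (9, 2)}


def indicative_shares(counts):
    """Return the fraction of each modal's occurrences that are indicative."""
    return {
        modal: indicative / total
        for modal, (total, indicative) in counts.items()
    }


shares = indicative_shares(MODAL_COUNTS)
for modal, share in shares.items():
    total, indicative = MODAL_COUNTS[modal]
    print(f"{modal}: {indicative}/{total} = {share:.1%}")
```

On these figures, roughly a third of the occurrences of "will", two thirds of "must" and under a quarter of "shall" are indicative, which shows why raw frequency alone cannot stand in for the ideological reading of each occurrence.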

V. CONCLUSION
This paper uses a computer-based frequency analysis and van Dijk's model of ideological discourse analysis to present a linguistic analysis of extremist ideologies in Daesh's Rumiyah. The analysis shows the relevance of applying computer software programs to the linguistic analysis of texts. It also shows that the use of a frequency analysis helps arrive at accurate and credible analytical results and yields a better understanding of the textual and contextual meanings of the text under investigation. The analysis, supported by concordance, demonstrated that Daesh's discourse in Rumiyah addressed two main extremist ideologies: jihad and jama'ah. These ideologies have been traced and reflected semantically, through patterns of interpersonal meanings manifested in positive and negative lexicalization, and mood (speech acts and modality). The semantic meanings of these ideological concepts have shown an increasing emphasis on religious ideas that are based on a clever process of intertextuality [37]. These ideas in turn serve to promote extremism and violence against the other, and create a group-oriented religious discourse, which abounds in meticulous ideological and discursive structures that aim to intensify the ideological polarization between in-groups and out-groups. The study also revealed that Daesh's Rumiyah is apparently a propagandist outlet for a specific ideological agenda. The textual organization of the magazine and its contextual atmosphere are integrated to produce the final discursive image of Daesh. This image is computationally and semantically delineated to establish a legitimate positive self-presentation of Daesh in a way that, on the surface, displays a reciprocal persuasive type of discourse while implicitly showing a nonreciprocal extremist one. This, in turn, enables this movement to implant its extremist ideologies and to advocate its schematic violent goals.
This paper recommends further applications of computer-aided text analysis programs to other texts. This could reveal findings different from those of the current study. It might also demonstrate the extent to which computer software programs are analytically relevant to discourse studies and corpus linguistics.