Toward Information Diffusion Model for Viral Marketing in Business

Current obstacles in the study of social media marketing, including massive data volumes and real-time updates, have motivated researchers to contribute solutions that can be adopted for viral marketing. Since information diffusion and social networks are the core of viral marketing, this article investigates the constellation of diffusion methods for viral marketing. Studies on diffusion methods for viral marketing have applied different computational methods, but systematic investigation of these methods has been limited. Most of the literature has focused on achieving objectives such as influence maximization or community detection. Therefore, this article conducts an in-depth review of works related to diffusion for viral marketing. Viral marketing has been applied to business-to-consumer transactions but has seen limited adoption in business-to-business transactions. The literature review reveals a lack of new diffusion methods, especially for dynamic and large-scale networks. It also offers insights into applying various mining methods for viral marketing and discusses some of the challenges, limitations, and future research directions of information diffusion for viral marketing. The article also introduces a viral marketing information diffusion model. The proposed model attempts to address the dynamicity and large scale of social network data by adopting incremental clustering and a stochastic differential equation for business-to-business transactions.
Keywords: information diffusion; viral marketing; social media marketing; social networks


I. INTRODUCTION
Online social networks have become part of our daily lives, and hundreds of millions of accounts are spread over different social media channels. The spread and development of the Internet and mobile technology have affected the growth of social networks. For instance, Twitter has half a billion users worldwide who produce around 175 million tweets per day or 10,000 tweets per second [1], [2]. The massive volume of information traded over social networks should be analyzed and used for marketing purposes. This study reviews diffusion methods including data mining, evolutionary methods, and statistics. It also introduces the viral marketing information diffusion (VMID) model, which aims to utilize data mining and a stochastic differential equation to achieve the best possible diffusion outcomes.
According to Bennett [3], around 70% of customers trust purchasing recommendations from their friends or close relatives, and only 34% under the age of 30 view product ads on television. Also, 87% of marketers say they would like to know how to measure the return on investment (ROI) through online social networks [3]. Brown and Fiorella [4] raised the question of why businesses should be on social media. Marketers should ask the same question before starting a marketing campaign via a social network. Other relevant questions include "Why should we choose this certain social media channel instead of others?" and "Which users have the power to influence other users?"
To spot new research advancements in information diffusion for viral marketing, we performed a search for various topics related to information diffusion for viral marketing in business. The search encompassed the last few years. Fig. 1 shows the recent enormous increase in research interest on the topic. This motivates the in-depth discussion in the following sections. This article asks the following research question: "Are there methods or techniques that consider information systems to enhance and optimize the marketing message diffusion?" The goals of this article are to find gaps and limitations in the existing research and to provide new insights into the topic. All the studies reviewed in this article consider a computer and information systems perspective. The answers to the research
www.ijacsa.thesai.org
The remainder of this article is organized as follows: Section II provides a general background of online social networks and viral marketing. Section III discusses the existing information diffusion models in viral marketing and divides them into four categories according to their role: synchronous, asynchronous, influence maximization, and social network mining. Section IV discusses social media marketing in business. Section V presents the methodology and discusses the current challenges to diffusion in online social networks for viral marketing. It also introduces the proposed VMID model. Section VI concludes the article and provides recommendations for further research.

II. BACKGROUND
This section provides a background of the two core concepts of this article: online social networks and viral marketing. Online social networks are the underlying environment and platform for the diffusion of viral marketing messages.

A. Online Social Networks
A social network can be represented as a graph. Easley and Kleinberg [6] defined a graph as "a way of specifying relationships among a collection of items. A graph consists of a set of objects, called nodes, with certain pairs of these objects connected by links, called edges." Social networks can be represented as a set of nodes and edges forming a network or graph, where the nodes are the participants and the edges are the types of connections. More specifically, in social networks, people or groups of people are the nodes, and the types of social interaction they engage in are the edges. Attention should be directed toward the connectedness in these networks. Easley and Kleinberg [6] emphasized this through two fundamental observations: interconnecting links and interdependence in users' behavior. Kempe, Kleinberg, and Tardos [7] also described a social network as "the graph of relationship[s] and interactions within a group of individuals." Fig. 2 illustrates the graph structure, which demonstrates the general case (unweighted and undirected).
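The general case described above (unweighted and undirected) can be sketched in a few lines of Python. This is only an illustration: the participant names and the `build_graph` helper are hypothetical and do not come from the reviewed works.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected, unweighted graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for u, v in edges:
        # An undirected edge is stored in both directions.
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


# Hypothetical participants (nodes) and interactions (edges).
edges = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("cat", "dan")]
graph = build_graph(edges)

# The degree of a node is the number of its direct connections.
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
```

A directed or weighted variant would store only one direction per edge, or map each neighbor to a weight instead of keeping a plain set.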
There is much interweaving between static and dynamic social networks; both use the same representation of nodes and relationships. However, the nodes and their relationships continuously change over time in dynamic networks. The nodes and relationships in dynamic social networks are generated once there is direct communication or information flow between two or more nodes. Kempe et al. [7] showed that successful marketing is achieved by dealing with the information dynamics in the network rather than studying the structural properties of the graph. This has directed interest toward dynamic social networks, unlike most of the existing research that focuses on static networks [8]-[10]. This leads to a need to examine both the dynamics of information and the network structure to achieve the best marketing strategy.

B. Viral Marketing
Online social networks have become part of people's daily activities; therefore, companies and organizations tend to harness the rapidity provided by social networks. Viral marketing is a marketing method that uses social networks. It is based on people who have influence over their relatives and friends in social networks. Researchers can measure the influence of these people by nodes and ratios of propagation in online social networks. Companies target these influencers because marketing through them will have a great impact on their products. The most crucial part of viral marketing is choosing the right influential nodes, called seeds, and then choosing the best diffusion method.
Viral marketing is a way of advertising products to a specific group of interested people using online social network relationships. Long and Wong [9] defined it as "targeting a limited number of users (seeds) in the social network by providing incentives, and these targeted users would then initiate the process of awareness spread by propagating the information to their friends via their social relationships." The viral marketing phenomenon is described as influence spreading over social networks [11]. The process of viral marketing, influence marketing, or word of mouth (WOM) is divided into three stages: 1) initiating the advertising message, 2) locating the best seeding nodes, and 3) diffusing the marketing message to others [12].
The complexity of social network structures requires that marketers build and follow a procedure to spread information accurately. Social networks concentrate on human behavior such as opinions, recommendations, and reviews; understanding this behavior and its consequences is a key factor to success in viral marketing [8], [9]. Generally, the viral marketing process consists of two main stages: seeding and diffusion [9]. Kitsak et al. [13] found that the core spreaders are more effective than those that have more connections. Viral marketing targets a limited number of users to begin marketing; these users have a sufficient impact on another group of interested users. Choosing the optimal seeding is an NP-hard problem, as proven by [7]. Minimizing the number of seeds to reduce the overall cost is the objective of viral marketing. To satisfy this objective, Long and Wong [9] introduced and studied the J-MIN-Seed problem. Since this review concentrates on information diffusion in viral marketing, the following section provides a detailed discussion of information diffusion methods.

III. INFORMATION DIFFUSION
The Oxford Dictionary defines diffusion as "the spread of something." In the analysis of social networks, diffusion is the process by which information spreads through the network. The majority of current research focuses on information diffusion in online social networks [14]-[16]. Viral marketing and diffusion in online social networks are largely interwoven because they share the same underlying environment, diffuse the same content, and have the same objectives. The basic idea of information diffusion in online social networks originated from the concept of virus spread. Most epidemiological diffusion models are considered non-graphical approaches because they assume no stability in the network structure [15]. Newman [17] categorized epidemiological models into two types: the susceptible-infected-removed (SIR) model [18] and the susceptible-infected-susceptible (SIS) model [19]. Another diffusion approach is based on rumor spreading. Boccaletti, Latora, Moreno, Chavez, and Hwang [20] classified rumor receivers into ignorant (does not care about it), spreader (willing to spread it), and stifler (ceases to spread it). This section divides the models according to the technique used: synchronous, asynchronous, influence maximization, and social network mining.
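To make the epidemiological view concrete, the following is a minimal discrete-time sketch of an SIR-style process on a well-mixed population, that is, without any graph structure. The function name, the update rule, and the parameters `beta` (infection rate) and `gamma` (removal rate) are assumptions chosen for illustration and are not taken from [17]-[19].

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps):
    """Discrete-time SIR on a well-mixed population (no graph structure).

    beta is an assumed per-step infection rate and gamma an assumed
    per-step removal rate (0 <= gamma <= 1).
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        # New infections grow with contacts between susceptible and infected.
        new_infections = min(s, beta * s * i / n)
        # A fixed fraction of the infected is removed at each step.
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


history = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, steps=100)
```

In a marketing reading, "infected" users are those currently spreading a message and "removed" users are those who have stopped. An SIS variant would return removed users to the susceptible pool instead.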

A. Synchronous Diffusion Models
This section discusses the two dominant diffusion models: linear threshold (LT) and independent cascade (IC). These models assume discrete time steps and a predefined set of initially active nodes. The diffusion process is mainly based on the probability defined over each edge and how one node influences others to diffuse information. Kempe et al. [7] proposed a generalized version of each model and showed that they are mostly equivalent. Because LT and IC have limited ability in large-scale networks, Lu, Wen, and Cao [21] developed a community-based algorithm and a distributed set-cover algorithm based on the probabilistic diffusion model.
1) Linear Threshold Model: This model relies on randomly choosing a threshold $\theta_i$ between 0 and 1 for each node $i$ [22], [23]. Each node $i$ has a weighted edge to each of its neighbors $j$, with weight $W_{i,j}$. If the total weight of the active adjacent nodes satisfies the following condition, then the node becomes active:
$$\sum_{j=1}^{n} W_{i,j} \geq \theta_i$$
where $n$ is the total number of edge weights of active adjacent nodes. In particular, at time $t$, all the nodes that were active at time $t - 1$ remain active. All the nodes that satisfy the above condition also become active [7], [11], [15].
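The activation rule of the LT model, in which a node becomes active once the total weight of its active neighbors reaches its threshold, can be sketched as follows. The data layout (`in_weights[i][j]` holding the weight of neighbor j's influence on node i), the function name, and the example values are assumptions made for illustration.

```python
import random


def linear_threshold(in_weights, seeds, thresholds=None, rng=None):
    """Run a synchronous LT process until no new node becomes active.

    in_weights[i][j] is the (assumed) weight of neighbor j's influence on i.
    Thresholds are drawn uniformly from [0, 1) when not supplied.
    """
    rng = rng or random.Random(0)
    if thresholds is None:
        thresholds = {i: rng.random() for i in in_weights}
    active = set(seeds)
    while True:
        # Evaluate every inactive node against the active set of the
        # previous step, then activate all qualifying nodes at once.
        newly_active = {
            i
            for i, neighbors in in_weights.items()
            if i not in active
            and sum(w for j, w in neighbors.items() if j in active)
            >= thresholds[i]
        }
        if not newly_active:
            return active  # active nodes never deactivate
        active |= newly_active


# Hypothetical four-node network with fixed thresholds.
in_weights = {
    "a": {},
    "b": {"a": 0.6},
    "c": {"a": 0.3, "b": 0.3},
    "d": {"c": 0.2},
}
thresholds = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}
result = linear_threshold(in_weights, {"a"}, thresholds)
```

Here node b activates in the first step, node c only in the second step once both a and b are active, and node d never activates because its single incoming weight stays below its threshold.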
A number of studies have worked under the LT model to solve problems related to diffusion [24]-[27]. The work done by Galuba, Aberer, Chakraborty, Despotovic, and Kellerer [24] aimed to predict an information cascade graph. Chen, Yuan, and Zhang [25] conducted a study of influence maximization in an LT model. They introduced the first scalable heuristic algorithm aimed at maximizing influence within the LT model, called the local directed acyclic graphs (LDAG) algorithm. Khalil, Dilkina, and Song [26] proved that the LT model has an enormous effect on the problem of network modification. However, Guisheng, Jijie, and Hongbin [27] claimed that the LT model was inefficient in real-life social networks. Therefore, they introduced the cellular automaton-based network diffusion (CAND) model.

2) Independent Cascade Model:
The IC model is a mathematical model for the diffusion process in a directed graph G = <V, E>, where V represents the nodes and E represents the links connecting them [11], [28]. The model uses the probability defined over each edge that connects the nodes [29]. The activated nodes try to activate inactive nodes using the probability value of each edge [15]. The model initially defines a random set of active nodes to activate the inactive nodes connected to them. Similar to the LT model, the nodes that are active at time t − 1 remain active at time t [7], [23]. The process starts with an active node i, which attempts to activate a connected inactive node j.
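The cascade process can be sketched as below. The sketch follows the commonly used formulation in which each newly activated node gets a single chance to activate each inactive out-neighbor; that single-attempt rule, the `out_probs` layout, and the example graph are assumptions for illustration rather than details stated in the reviewed works.

```python
import random


def independent_cascade(out_probs, seeds, rng=None):
    """Run one IC cascade and return the final set of active nodes.

    out_probs[i][j] is the (assumed) probability that active node i
    activates its inactive out-neighbor j.
    """
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for i in frontier:
            for j, p in out_probs.get(i, {}).items():
                # One activation attempt per (i, j) pair.
                if j not in active and rng.random() < p:
                    active.add(j)
                    next_frontier.append(j)
        frontier = next_frontier
    return active


# Hypothetical directed graph with deterministic edge probabilities.
out_probs = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}}
result = independent_cascade(out_probs, {"a"})
```

With probabilities strictly between 0 and 1 the outcome is random, so the expected spread is usually estimated by averaging many runs.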
Studying the node strength is a challenge because there is no unique measurement for it. Arnaboldi et al. [16] claimed that knowing the node strength will assist in inferring the information diffusion. Zhu, Wang, Wu, and Zhu [30] proposed SpreadRank, which measures the spread ability of each node. It is a generalized version of the diffusion model CTMC-ICM, which introduces the theory of continuous-time Markov chain (CTMC) into the IC model. A number of studies have enhanced the IC model, such as that of Nazemian and Taghiyareh [31], who proposed the IC with positive and negative WOM (ICPM).
Table I summarizes the main differences and similarities between the two diffusion models. Both models are applied on static and directed graphs. The main difference between the two is the diffusion method. The LT model uses influence degree, while the IC model uses probabilities. IC has greater coverage than LT because most of the graph nodes are covered. On the other hand, LT is more accurate than IC because only the nodes that satisfy the condition/rule are involved in the diffusion process. Although both models work effectively, many enhancements have been made to increase their performance, especially regarding time.

B. Asynchronous Diffusion Models
The methods presented in this section have been used in viral marketing diffusion and information diffusion in online social networks. This section discusses asynchronous methods that could have a significant impact on dynamic viral marketing diffusion. Saito, Ohara, Yamagishi, Kimura, and Motoda [32] proposed a parameter estimated from the information diffusion results at a specific time; this ensures an asynchronous pattern in the same diffusion models (AsIC and AsLT) [33], [34]. Opinion propagation was one of the problems solved by Kimura et al. [34] via an extension of AsIC and AsLT. They created the extension using a value-weighted voter model with multiple opinions. Similarly, using the same learning performance model, Kimura, Saito, Ohara, and Motoda [35] used the value-weighted voter model to detect anti-majority opinions in social networks.
The T-BaSIC model presented by Guille and Hacid [36] predicts the temporal dynamics of diffusion in social networks. The approach is based on machine learning techniques and the inference of time-dependent diffusion probabilities from a multidimensional analysis of individual behaviors. In addition, Wonyeol et al. [37] generalized the IC model by including two important aspects, continuous trials and time restriction, in a new model called the continuously activated and time-restricted IC (CT-IC) model. This is an influence diffusion model dedicated to viral marketing. They also proposed the continuously activated and time-restricted independent path algorithm (CT-IPA) to compute the exact influence spread path. Zhu et al. [30] presented one of the applications of the IC model. They introduced a new method that addresses the influence maximization problem by concentrating on a small group of influencers.
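The difference between synchronous and asynchronous diffusion can be illustrated with a continuous-time variant of the cascade process. The sketch below is written in the spirit of asynchronous IC models and is not a reproduction of AsIC, T-BaSIC, or CT-IC: the exponentially distributed delay, the `(target, probability, rate)` edge layout, and the `horizon` cut-off are all assumptions made for illustration.

```python
import heapq
import random


def async_independent_cascade(out_edges, seeds, horizon, rng=None):
    """Continuous-time cascade; returns {node: activation_time}.

    out_edges[u] is a list of (v, prob, rate) tuples: u succeeds in
    activating v with probability prob, after an exponentially
    distributed delay with the given rate (both assumed parameters).
    """
    rng = rng or random.Random(0)
    activation_time = {}
    queue = [(0.0, s) for s in seeds]  # (time, node) events
    heapq.heapify(queue)
    while queue:
        t, u = heapq.heappop(queue)
        # Skip nodes already activated earlier and events past the horizon.
        if u in activation_time or t > horizon:
            continue
        activation_time[u] = t
        for v, prob, rate in out_edges.get(u, []):
            if v not in activation_time and rng.random() < prob:
                heapq.heappush(queue, (t + rng.expovariate(rate), v))
    return activation_time


# Hypothetical chain a -> b -> c, with a dead edge c -> d.
out_edges = {
    "a": [("b", 1.0, 2.0)],
    "b": [("c", 1.0, 2.0)],
    "c": [("d", 0.0, 1.0)],
}
times = async_independent_cascade(out_edges, ["a"], horizon=1e9)
```

Because events are processed in time order, a node reached by several neighbors is activated by whichever attempt arrives first. Lowering `horizon` mimics a time-restricted campaign.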

C. Influence Maximization
The majority of current research considers a static network topology, which assumes that the structure remains the same over time. The literature has begun to examine the effect of choosing the best initial influential nodes. Kempe et al. [7] identified the challenge of optimizing the selection of the most influential nodes in the network and introduced an approximation algorithm for influence maximization. The problem with their approach is the time required for simulation processing. A further limitation of the approximation algorithm [7] was its ineffective performance in large-scale mobile social networks, especially in dealing with the diffusion minimization problem. The challenge is choosing the best set of users who can maximize spread in the network. Chaudhury et al. [14] found that their method of spreading information was faster than the greedy k-center method. They proposed the degree-based scaling method applied in the graph. The method aimed to increase the active set of nodes that have the topmost degree in the least possible amount of time. Both algorithms ensured optimal seeding regardless of the amount of time consumed. Abadi and Khayyambashi [38] discussed the problem of influence maximization with viral marketing and introduced a new algorithm. Their proposed algorithm aimed to select the experts and leaders in social networks based on spatiality and knowledge. Although they claimed that the algorithm was quick, it required long processing steps to produce reasonable results. Zhou, Zhang, and Cheng [39] addressed the influence maximization problem by integrating greedy algorithms and mining the top influencers. They proposed GAUP to mine the most influential nodes in the network. In addition, Kim, Beznosov, and Yoneki [40] introduced a decentralized influence maximization problem by influencing k-neighbors rather than randomly selected users in the network.
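The simulation cost mentioned above becomes visible in a small sketch of greedy seed selection. The code below is a simplified illustration, assuming an IC-style cascade and Monte Carlo estimation of the spread; it is not the exact algorithm of [7], and the function names, the `runs` parameter, and the example graph are hypothetical.

```python
import random


def simulate_ic(out_probs, seeds, rng):
    """Run one IC-style cascade and return the number of active nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        next_frontier = []
        for i in frontier:
            for j, p in out_probs.get(i, {}).items():
                if j not in active and rng.random() < p:
                    active.add(j)
                    next_frontier.append(j)
        frontier = next_frontier
    return len(active)


def expected_spread(out_probs, seeds, runs, rng):
    """Estimate the expected spread by averaging repeated simulations."""
    return sum(simulate_ic(out_probs, seeds, rng) for _ in range(runs)) / runs


def greedy_seeds(out_probs, nodes, k, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [n for n in nodes if n not in chosen]
        best = max(
            candidates,
            key=lambda n: expected_spread(out_probs, chosen + [n], runs, rng),
        )
        chosen.append(best)
    return chosen


# Hypothetical graph: "a" reaches three nodes, "e" reaches one.
out_probs = {"a": {"b": 1.0, "c": 1.0, "d": 1.0}, "e": {"f": 1.0}}
nodes = ["a", "b", "c", "d", "e", "f"]
seeds = greedy_seeds(out_probs, nodes, k=2, runs=5)
```

Each of the k rounds simulates `runs` cascades for every remaining candidate, which is why this style of selection scales poorly to large networks.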
All the aforementioned studies focused on maximizing influence by concentrating on the social network structure. Research on social network dynamics has been increasing. For instance, Zhuang, Sun, Tang, Zhang, and Sun [41] concentrated on maximizing influence in dynamic social networks. In response to dynamic challenges, they proposed an influence maximization algorithm called Maximum Gap Probing (MaxG).

D. Social Network Mining For Viral Marketing
Researchers have only recently begun to examine systematically the effect of applying mining methods such as clustering or classification to viral marketing. This section explores mining methods for viral marketing.
1) Clustering: In general, clustering groups objects that have similar attributes. Han, Kamber, and Pei [42] defined clustering as "the process of partitioning a set of data objects into subsets." Each subset represents a cluster; each cluster contains objects that are similar to each other and different from those in other clusters. Among the current trends in mining social networks for viral marketing, clustering analysis methods have gained the most interest because of their ability to deal with a social network's structure, correlations, and groups. The idea of clustering as a machine learning model is to identify homogeneous groups of people [43].
Few studies have considered clustering for diffusion in viral marketing. Banerjee, Al-Qaheri, and Hassanien [44] combined fuzzy logic, game theory, and clustering within social network mining for viral marketing. Since marketing in social networks depends on the dynamics of social influence interaction, Banerjee et al. [44] applied game theory. Clustering was used to determine the representative group of customers. Sharma and Shrivastava [45] used clustering to identify the highly influential nodes for viral marketing. Two clusters were used: a strong-tie cluster and a weak-tie cluster. They also introduced two algorithms: cluster mining and influence mining. AlSuwaidan, Ykhlef, and Alnuem [46] introduced a novel spreading framework for viral marketing that ensures optimization of cost and time and predicts the required cost and time to reach sufficient coverage. The framework is based on incremental clustering and activity networks and is directed toward the most active user in the social networks. Among the works presented in the literature that have discussed information diffusion in online social networks, Niu, Long, and Li [47] used a k-means algorithm to cluster users into different groups. They also proposed a continuous-time diffusion model by incorporating users' heterogeneous temporal diffusion patterns. This model is also applicable to viral marketing diffusion because it is applied on the same constrained environment and has the same objective. Some other methods, such as influencer selection, use clustering for identification problems. Lou and Tang [48] developed a model for mining top-k structural hole spanners in large-scale social networks. The general idea behind this was to measure how a node bridges different communities. This was the first attempt to prove the NP-hardness of maximizing the decrease of minimal cut in an unweighted graph. Shrihari, Hudli, and Hudli [49] proposed an approach for identifying influencers that uses a k-means clustering algorithm. The identification process of opinion leaders (influencers) was based on many factors such as the amount of time they spend in social networks, their positive comments, and their responses to others.
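Several of the works above rely on k-means to group users. The following is a minimal pure-Python sketch of the algorithm on hypothetical two-dimensional user features (for example, weekly hours on the network and number of positive comments). The feature choice, the function names, and the data are illustrative assumptions and do not reproduce the setup of [47] or [49].

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Basic k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return labels, centroids


# Hypothetical users: (weekly hours online, positive comments).
users = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(users, k=2)
```

In an influencer-identification setting, the cluster whose centroid has the highest activity values would be the natural candidate group for seeding.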
2) Classification: Classification is a supervised modeling technique. Most classification methods attempt to predict or estimate future events based on historical data. What is important to note in modeling using classification is that the target groups or classes are known from the beginning; therefore, classification techniques are not useful in mining social networks [50]. However, most classification studies on mining social networks for diffusion consider classification as a method for making classes or organizing user behavior or properties before the diffusion process occurs. Surma and Furmanek [51] used classification in mining social networks for viral marketing. They used a classification and regression tree (C&RT) model to identify users of online social networks who will respond to marketing campaigns.
Classification has also been used for predicting user behaviors in online social networks. This simplifies the diffusion process because it determines the best possible node to start from based on past behavior patterns. Ortigosa, Carro, and Quiroga [52] used classification as a mining method for predicting user personality based on interaction behavior patterns. The prediction process was based on parameters such as number of friends and wall posts. Ortigosa et al. [52] developed an application on Facebook called TP2010 to gather information from approximately 20,000 users to determine their interaction personality patterns. They used the classifier as a machine learning technique to observe interaction patterns among users.
3) Community Detection: Community detection is one of the most widely used methods in the literature on analyzing social influence. It relies on the idea of targeting the communities that are most closely related to a certain domain through social networks. Viral marketing has the same objective, since locating the right community is the first step toward effective viral marketing. However, most existing methods perform community detection on a random basis [53]. This section reviews existing community detection methods and models that have been applied to online social networks to emphasize the importance and current lack of applying community detection to viral marketing.
The focus of community detection is on the community structure because of its usefulness in knowing how the network is structured [54]. Many researchers have raised questions about the definition of community and its structure. Shen [55] highlighted the need for a clear definition of community; he claimed that this definition is dependent on its context and applications. Tang and Liu [56] and Shen [55] defined community as a group of nodes that are connected with each other more densely than with those outside the group.
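The density-based definition of a community can be checked directly on a candidate group of nodes by comparing the edges inside the group with the edges leaving it. The helper below is a hypothetical illustration on an undirected graph stored as neighbor sets; it is not one of the reviewed detection methods.

```python
def edge_counts(graph, community):
    """Return (internal, external) edge counts for a candidate community.

    graph maps each node to the set of its neighbors (undirected).
    """
    community = set(community)
    # Each internal edge is seen from both endpoints, hence the halving.
    internal = sum(
        1 for u in community for v in graph.get(u, ()) if v in community
    ) // 2
    external = sum(
        1 for u in community for v in graph.get(u, ()) if v not in community
    )
    return internal, external


# Hypothetical network: two triangles joined by the single edge 3-4.
graph = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
triangle = edge_counts(graph, {1, 2, 3})
bridge = edge_counts(graph, {3, 4})
```

The triangle {1, 2, 3} has more internal than external edges and fits the definition, whereas the pair {3, 4} is dominated by external edges and does not.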
Bhat and Abulaish [12] introduced a community detection approach to viral marketing. Their work concentrated on identifying overlapping communities in online social networks by examining the weighted online social network of email communications from Enron [57]. Two measures, node betweenness and probability, were used to measure the overlapping influence for nodes. If the probability lay between 1% and 5%, then it became a top influencer node. This supported the hypothesis proposed by Bhat and Abulaish [12]: the more a community overlaps, the greater the individual's influence in the whole network. In addition, the outlier nodes were considered noisy nodes, which proved to be non-influencing nodes when the probability reached 60%. Bhat and Abulaish [12] attributed this to the lack of applying the concept of community detection and its attributes to viral marketing. Meng, Zhang, Zhu, Xing, Wang, and Shi [58] proposed an incremental density-based link clustering algorithm for community detection in dynamic networks called iDBLINK. This algorithm is designed to solve the community detection problem in dynamic social networks. It ensured accuracy and efficiency.
4) Evolutionary Methods: Evolutionary or intelligence methods have been widely integrated into several domains because of their effective and optimized consequences. Applying intelligence methods to online social network analysis has had a large impact in terms of social influence. Swarm intelligence methods have been used to analyze social networks; a genetic algorithm (GA), differential evolution (DE), and particle swarm optimization (PSO) were applied to maximize the spread [11]. The study of Gui-sheng et al. [11] selected optimal seeds. GA and DE were used to increase the number of active nodes, while PSO focused on recording the best node position and speed within the population. In the chain of evolutionary computation and swarm algorithms, the ant colony optimization (ACO) algorithm as a swarm intelligence method has been applied to the competitive maximization problem [59]. ACO aims to select a set of nodes and links that maximizes influence by increasing the spread within the network. The algorithm starts by selecting random nodes with an initial pheromone value and then creates solutions corresponding to each node. The process continues iteratively until all the ants probabilistically create solutions. The limitation of this algorithm is that it depends on a random generation of node probability and pheromones. An ant algorithm, a type of intelligence method, was used to analyze social networks to stimulate the spread of opinions [60]. Enhancing the marketing strategy was another purpose of this approach. The approach concentrated on three steps: understanding the network structure, mining entities' opinions, and applying the ant-mining algorithm to formulate opinion rules.

IV. SOCIAL MEDIA MARKETING IN BUSINESS
A high proportion of studies in the field of social marketing and especially viral marketing are concerned with business-to-consumer (B2C) business. Constantinides [61] claimed that the development of information technology and communication has kept the customers in control of marketing communication because they are involved in the product development lifecycle. As recent marketing research has focused on the effects and potential of marketing through social networks, Constantinides [61] introduced a classification model for social media as a marketing tool. The model consists of passive and active approaches, where the former depends on the customers' public voice and the latter depends on direct communication with the customers. Social media platforms have become an effective way to reach customers from different levels, but they need enhancements in targeting suitable individual customers and businesses. Jussila, Kärkkäinen, and Aramo-Immonen [62] mentioned the misconception that social media marketing is directed to the B2C market. They found limited social media use among customers and partners in business-to-business (B2B) companies; therefore, their article focused on B2B marketing. They referred to two surveys directed to chief executive officer (CIO) and strategic marketers to measure the adoption of social media in B2B, and they found that the respondents had contrasting perspectives regarding social media in B2B. This is because of the unclear perception of the status of social media. The main difference between social media marketing in B2C and B2B is the customer type. In B2C, customers are ordinary people seeking to know other customer reviews and to have special treatment. In B2B, on the other hand, organizations and companies are looking for direct contact and special offers.
Recent research has shown a lack of models and frameworks for B2B social media marketing [63]-[65]. In particular, [64] listed the most influential factors affecting social media marketing for B2B businesses. Although marketing executives are apprehensive about adopting social media marketing for B2B businesses [63], computerized or automated systems will have a significant impact on the overall process [65].

V. VIRAL MARKETING INFORMATION DIFFUSION MODEL
The previous sections presented a general background of two fundamental concepts related to information diffusion: social networks and viral marketing. This article addresses the state of social media marketing in business and sheds light on recent work on diffusion in online social networks for viral marketing. These works are classified according to the method used: synchronous, asynchronous, influence maximization, and social network mining. There is a huge gap in developing viral marketing methods directed to business. This article also addresses the limitations of some information diffusion models for viral marketing. This section focuses on the challenges and problems related to diffusion in online social networks for viral marketing; these serve as input to build the VMID model. The taxonomy shown in Fig. 3 presents the four main research dimensions of information diffusion in viral marketing, the related limitations, and areas for improvement.

A. Challenges In Information Diffusion For Viral Marketing
There are some general challenges related to online social networks that affect the process of diffusion for viral marketing. Noisy and incomplete data are the most common challenges in the online social network structure [66], [67]. Gatti et al. [68] listed major challenges to diffusion in online social networks, such as collecting real-world samples, recognizing user behavior patterns, and large-scale simulation. Most of the discussed methods and algorithms were applied to small-scale networks (see Table II); therefore, large-scale networks still pose a challenge and can lead to new research trends in the coming years. Kleinberg [69] outlined a set of challenges related to large-scale social networks, including the inference of social processes from data and the problem of maintaining individual privacy in studies of social networks.
Table II shows that the majority of methods and algorithms are applied to static social networks. Static social networks are easier to use in experiments and testing than dynamic social networks. Dynamic, continuous, and time-variant testing needs more iterations of simulation. Guille et al. [15] discussed technical and crawling API limitations. There are also some challenges related to social network mining. Aggarwal [70] stated that the challenge was mining the linkage behavior of the social network. Jones and Liu [71] presented a mixture of challenges in mining social networks, such as sentiment analysis, trust prediction, privacy and vulnerability, and user migration.
Fig. 3: Taxonomy for information diffusion in viral marketing, related limitations, and new insights.
The problems have also been extracted from the observations gathered in Table II. The most notable point is the dominance of the LT and IC models in the majority of the reviewed work on information diffusion. Attempts have been made to adapt the LT and IC models to time-variant applications, but only to a limited extent; these preliminary attempts need further improvement. Influence maximization has been studied in a variety of directions. A few suggestions for its enhancement could be made, such as pursuing optimality or faster identification and selection of influential nodes. Influence maximization was applied in static, dynamic, large-scale, and small-scale networks and achieved optimal or at least good results in experiments and simulations.
Diffusion has not been addressed in most of the existing studies that use mining methods. Much of the research on social network mining focused on static social networks. Community detection methods have attracted widespread interest in covering overlapping and hierarchical communities. Clustering was typically used in seeding and identifying the greatest number of influencer nodes; however, a current major focus is on basic clustering methods such as k-means and cliques. Classification, on the other hand, has only been used in mining social networks for diffusion as a method for organizing user behavior or properties before the diffusion process occurs. Although data mining is considered a powerful method, there remains a need to test and apply the rest of the mining methods. For decades, data mining methods have effectively analyzed noisy, incomplete, and unstructured data, especially in large data collections, which is the main property of social networks. Time series, association rule mining, and prediction are among the data mining methods that can produce a good analysis. Each has its own aims and methods that will fit into the different steps of diffusion for viral marketing and will ensure the optimality of the overall process.
TABLE II (excerpt)
Asynchronous
  [32], [34], [35]   AsLT, AsIC   Yes   Yes   Yes
  [36]   T-BaSIC   Yes   Yes   Yes
  [37]   CT-IC   Yes   Yes   Yes
  [30]   CTMC-ICM   Yes   Yes   Yes
Influence Maximization
  [7]   Approximation Algorithm   Yes   Yes   Yes
  [14]   The Degree-Based Scaling Method   Yes   Yes   Yes
  [38]   Expert and Influential Leader Discovery Approach   Yes   Yes   Yes
  [41]   MaxG   Yes   Yes   Yes
  [40]   Hybrid
Evolutionary methods have been more than adequately studied in relation to the diffusion optimization problem. Swarm intelligence, genetic, and heuristic-based algorithms have produced valuable results for the diffusion process. Integrating them with other methods leads to good results in terms of time. Most of the evolutionary research has dealt with huge data collections and has found that evolutionary methods are the best choice.

B. The VMID Model Structure
Addressing all the aforementioned issues and challenges requires an information systems model to maintain the effectiveness of viral marketing diffusion. Findings from the literature indicate the benefit of social media marketing in businesses. Viral marketing, as a method built on social media marketing, is expected to yield better outcomes than social media marketing alone because of its unique procedure. The purpose of the VMID model (Fig. 4) is to address the problem of information diffusion for viral marketing directed to industrial B2B businesses. This model attempts to fill the gap between businesses and social media marketing by adopting viral marketing. It consists of three main parts: business producer, marketing message diffusion, and consumer. The majority of core processing within this model resides in marketing message diffusion because it consists of computational processing that aims to increase the effectiveness of marketing. Normally, the message starts from the business producer, who initiates the marketing campaign. Marketing executives are responsible for formulating the message content and ensuring its structure [70], [71]. The model requires a suitable keyword describing the campaign to match it with the appropriate cluster in the next step. The following subsections will describe each part.

1) Social Analysis:
A pre-processing step is required to maintain the targeted social media. This stage consists of two correlated steps. The first step is collecting the dataset from the selected social media. It is important to note the problem related to real-time processing [15], [72]; social media owners are still placing restrictions on collecting real-time datasets. The second step is analyzing the collected dataset. This step requires the classification process to match the suitable cluster for further processing in the next step. The matching process consists of knowing the marketing message and the objective of diffusion. After these steps have been taken and the determined cluster is ready, it will be sent to incremental clustering, which is included in structure modeling in the next module.
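The matching of a campaign keyword set to a cluster can be illustrated with a minimal sketch. The article does not specify the classification method, so the Jaccard overlap used here is a swapped-in stand-in, and the function names, cluster ids, and keywords are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_cluster(message_keywords, cluster_keywords):
    """Return the id of the cluster whose keywords best overlap the message.

    cluster_keywords maps a cluster id to an iterable of keywords.
    Returns None when no cluster shares any keyword with the message.
    """
    best_id, best_score = None, 0.0
    for cluster_id, keywords in cluster_keywords.items():
        score = jaccard(message_keywords, keywords)
        if score > best_score:
            best_id, best_score = cluster_id, score
    return best_id


# Hypothetical B2B clusters and campaign keywords (assumptions for illustration)
clusters = {
    "steel": {"steel", "alloy", "supplier"},
    "logistics": {"freight", "shipping", "supplier"},
}
best = match_cluster({"steel", "supplier", "quote"}, clusters)
```

In this toy input the campaign shares two keywords with the "steel" cluster and one with "logistics", so the former is selected. Any similarity measure or trained classifier could replace `jaccard` without changing the surrounding flow.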
2) Structure Modeling: Utilizing incremental clustering for structure modeling deals with the dynamicity of the social network. As discussed in Section V, most problems related to information diffusion are relevant to dynamic and large-scale networks. Introducing incremental clustering addresses these two issues because data mining concepts are effective in large-scale data and adopting incremental clustering will handle the dynamicity problem.
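As a rough illustration of how incremental clustering copes with a changing network, the sketch below updates clusters one sample at a time without revisiting earlier data. It is a generic leader-style scheme, not the specific algorithm of the VMID model; the numeric feature vectors, the `radius` threshold, and the class name are assumptions made for the example.

```python
import math


class IncrementalClusterer:
    """Leader-style online clustering: each sample is seen exactly once."""

    def __init__(self, radius):
        self.radius = radius   # hypothetical distance threshold for joining a cluster
        self.centroids = []    # running mean of each cluster
        self.counts = []       # number of samples absorbed by each cluster

    def add(self, point):
        """Assign one sample to a cluster and return that cluster's index."""
        best, best_dist = None, float("inf")
        for i, centroid in enumerate(self.centroids):
            d = math.dist(point, centroid)
            if d < best_dist:
                best, best_dist = i, d
        # No cluster is close enough: open a new one seeded with this sample
        if best is None or best_dist > self.radius:
            self.centroids.append(list(point))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Otherwise update the running mean of the nearest cluster in place
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [
            c + (x - c) / n for c, x in zip(self.centroids[best], point)
        ]
        return best


clusterer = IncrementalClusterer(radius=1.0)
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.3)]
labels = [clusterer.add(p) for p in stream]
```

Because each update touches only the nearest centroid, the cost per arriving sample depends on the number of clusters rather than on the number of samples already processed.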
The processing in this step receives the collected and analyzed data from social analysis. Then, incremental clustering is performed over it in real time. Incremental clustering has proven its value in terms of cost, space, and time [73]-[75]. The basic idea behind incremental clustering is processing one sample at a time. Takaffoli et al. [76] introduced a framework for incremental local communities in dynamic networks. However, their work only considered updates from historical sampling, which is ineffective and time consuming in processing. The added value of the VMID model is combining the activity network with incremental clustering; therefore, every newly updated network is passed to activity network processing.
In particular, the activity network can be considered as analyzing user interaction in social networks and microblogging websites. Chun et al. [77] examined the interaction between users in a guestbook by tracking users' comments. They proposed an activity network where users are the nodes and comments represent the links. The link is constructed only if two users exchange comments. Similarly, Wilson et al. [78] studied user interaction by proposing an interaction graph that models the users' interaction and communication instead of focusing only on social link relationships. The proposed interaction graph consists of all the nodes existing in the examined social graph. However, the key point of the interaction graph is the link formation, which only contains the links between nodes that interact through communication or an application. The interaction graph was examined using data derived from the Facebook network. The aim of producing the activity network is to optimize the spreading process by excluding the least important users who are inactive in the social network. This reduces the time and cost required to spread a certain message. The resulting network will be saved in the system storage for a matching process between message keywords and cluster keywords.
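The link rule of the activity network (a link exists only if two users exchange comments) can be sketched as follows. The input format of (author, recipient) pairs, the function name, and the user names are assumptions for illustration; the cited works do not prescribe this representation.

```python
from collections import defaultdict


def build_activity_network(comments):
    """Build an undirected activity network from (author, recipient) pairs.

    An edge u-v is kept only when u commented to v AND v commented to u.
    Users with no reciprocated exchange do not appear in the result,
    which is how inactive users are excluded in this sketch.
    """
    directed = set()
    for author, recipient in comments:
        if author != recipient:            # ignore self-comments
            directed.add((author, recipient))

    network = defaultdict(set)
    for author, recipient in directed:
        if (recipient, author) in directed:
            network[author].add(recipient)
            network[recipient].add(author)
    return dict(network)


comments = [
    ("ann", "bob"), ("bob", "ann"),   # reciprocated: edge ann-bob
    ("ann", "cid"),                   # one-way only: no edge
    ("dee", "bob"), ("bob", "dee"),   # reciprocated: edge bob-dee
    ("eve", "eve"),                   # self-comment: ignored
]
activity = build_activity_network(comments)
```

Here "cid" and "eve" are dropped because neither takes part in a reciprocated exchange, so a message would be spread over the three remaining users only.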
3) Diffusion Modeling: The concept of diffusion was inspired by natural phenomena such as diseases, waves, fluid, and water. For information diffusion models, this article proposes the adoption of the statistical concept "stochastic differential equation" (SDE), which has been systematically examined for its effectiveness in different domains including engineering, applied mathematics, and computers [79]. SDE was chosen because the platform for marketing diffusion evolves over time, and SDE handles problems with similar objectives. The general mathematical definition is given in (2):
$$dX_t = b(t; X_t)\,dt + \sigma(t; X_t)\,dB_t \qquad (2)$$
where $B_t$ is the standard Brownian motion and $b$ and $\sigma$ are given functions of time $t$ and the current state $x$.
A stochastic real-valued process $(X_t)_{t \ge 0}$ is said to be a diffusion process if it satisfies the following conditions:
1) $(X_t)_{t \ge 0}$ is a Markov process.
2) The following limits exist:
a) $\lim_{h \to 0^{+}} \frac{1}{h}\, E\left[X_{t+h} - X_t \mid X_t = x\right] = b(t; x)$
b) $\lim_{h \to 0^{+}} \frac{1}{h}\, E\left[(X_{t+h} - X_t)^2 \mid X_t = x\right] = \sigma^2(t; x)$
3) $(X_t)_{t \ge 0}$ is a continuous process.
$b(t; X_t)$ is called the drift (coefficient, parameter) and $\sigma^2(t; X_t)$ is the diffusion (coefficient, parameter). The definition of the diffusion process suggests a relationship between drift and diffusion, as shown in (2).
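To make (2) concrete, the sketch below simulates one sample path with the Euler-Maruyama scheme, a standard numerical method for SDEs that the article itself does not prescribe. The drift and diffusion functions, the saturation level of 1.0, and all parameter values are hypothetical choices for illustration only.

```python
import math
import random


def euler_maruyama(b, sigma, x0, t_end, n_steps, seed=None):
    """Simulate one path of dX_t = b(t, X_t) dt + sigma(t, X_t) dB_t."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    t, x = 0.0, x0
    path = [x0]
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x = x + b(t, x) * dt + sigma(t, x) * dB
        t += dt
        path.append(x)
    return path


# Hypothetical coefficients: the adopted fraction reverts toward 1.0
def drift(t, x):
    return 2.0 * (1.0 - x)


def diffusion(t, x):
    return 0.1


path = euler_maruyama(drift, diffusion, x0=0.0, t_end=5.0, n_steps=1000, seed=42)
```

With the noise term set to zero the scheme reduces to the ordinary Euler method for dx/dt = 2(1 - x), which gives a simple sanity check against the closed-form solution 1 - exp(-2t).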

VI. CONCLUSION
Research in the field of information diffusion for viral marketing has increased enormously in recent years. This article, to the best of our knowledge, is the first to examine studies related to information diffusion in viral marketing. This article discusses some of the challenges and issues that can guide future research. It also addresses the gap in social media marketing by introducing the VMID model. The proposed model is based on incremental clustering and SDE. It attempts to adopt information system modeling for marketing in B2B business. In industrial markets, it is a challenge to align with a systematic method that facilitates the connection between partners. The VMID model requires more experiments and testing to ensure its validity in the real world.

TABLE I: Differences and similarities between LT and IC models

TABLE II: Summary of Existing Diffusion Models for Viral Marketing