Combining Multiple Classifiers using Ensemble Method for Anomaly Detection in Blockchain Networks: A Comprehensive Review

Abstract: Blockchain is one of the most anticipated technology revolutions, with immense promise in various applications. It is a distributed and encrypted database that can address a range of challenges connected to online security and trust. While many people identify Blockchain with cryptocurrencies such as Bitcoin, it has a wide range of applications in supply chain management, health, Internet of Things (IoT), education, identity theft prevention, logistics, and the execution of digital smart contracts. Although Blockchain Technology (BT) has numerous advantages for Decentralized Applications (DApps), it is nevertheless vulnerable to abuse, smart contract failures, security breaches, theft, trespassing, and other concerns. Using Machine Learning (ML) models to detect anomalies is therefore a promising way to safeguard blockchain networks from criminal activity. Adapting ensemble learning methods in ML to create better prediction outcomes is a viable approach for anomaly identification. Ensemble learning, as the name implies, refers to creating a stronger and more accurate classifier by combining the prediction results of numerous weak models. Accordingly, this paper presents an in-depth review of ensemble learning methodologies for anomaly detection in the blockchain network ecosystem, covering numerous ensemble methods (e.g., averaging, voting, stacking, boosting, bagging). The review collects data from three established databases: Scopus, Web of Science (WoS), and Google Scholar. Specific keywords, such as Blockchain, Ethereum, Bitcoin, Anomaly Detection, and Ensemble Learning, are employed using advanced search techniques. The search found 60 primary articles from 2017 to 2022 (30 from Scopus, 20 from WoS, and 10 from Google Scholar). Based on these findings, we divide our discussion into three primary themes: (1) the fundamentals of Blockchain Technology (BT), (2) an overview of ensemble learning, and (3) the integration and analysis of ensemble learning in blockchain networks for anomaly detection. The results are also discussed in terms of what they mean for awareness and knowledge and where future research should go.


I. INTRODUCTION
Nowadays, most agencies have started evaluating Blockchain Technology (BT) in various sectors such as pharmaceuticals, automotive, agri-food, livestock, supply chain, health, and government digital initiatives [1]. This scenario has an impact in the context of traceability, transparency, and trustworthiness values in distributed and decentralized ecosystem environments [2]. A Blockchain operates based on a data structure storage method consisting of blocks that are interconnected with each other using a cryptography hash mechanism. Technically, each block stores information such as timestamp, Merkle root, nonce, previous hash and difficulty in the block header [3]. From the point of view of decentralized Blockchain applications, the world of cryptocurrency has become popular and dominant. Thus, Bitcoin BT has forged success by producing the first cryptocurrency application. It is different from Ethereum, which introduced smart contracts, and Ether has been declared the second largest cryptocurrency after Bitcoin [4]. Additionally, Ethereum was created to address the Bitcoin protocol's functional insufficiency [5]. Technically, the Ethereum network hosts smart contracts, which are collections of code that run on the Blockchain and carry out a set of instructions. These contracts are what power Decentralized Applications (DApps), which are akin to smartphone apps that operate on Google (Android) or Apple (iOS) operating systems.
In a public blockchain network, all transactions are transparent and are publicly available. Hence, anyone in the network can examine these transactions and may cross-verify any fraudulent behavior. Along with its rapid development, BT has encountered several security issues and shortcomings, including majority attacks, forking, and bugs in smart contracts. Wallet attacks, Ponzi Schemes, Proof of Work (PoW) vulnerabilities, and crypto-jacking are all challenges that need to be addressed. For instance, the Ethereum Blockchain has increased in prominence. Nevertheless, it has been beset by security vulnerabilities such as phishing scam, which has accounted for nearly half of all criminality on the platform since 2017 [6]. Therefore, for an efficient functioning of a blockchain network, it is vital to detect these vulnerabilities in the most precise and timely manner. To enable the successful identification and prediction of such attacks over Blockchain, the field of anomaly detection models in the Machine Learning (ML) method for Blockchain comes into play.
In general, anomaly detection is the attempt to detect a pattern or item that differs from the norm [7]. Combining ML and BT has had a positive impact and is widely employed in industries such as automotive, health, decentralized finance (DeFi), supply chain, agriculture, and the Internet of Things (IoT). The two technologies are combined for goals such as detecting suspicious activity, cybercrime, and fraud. Besides, a Blockchain system that can handle massive data sets is compatible with ML approaches to data analysis and can increase data security [8]. Therefore, a wide variety of anomaly detection models are being designed and deployed by researchers for various Blockchains. However, one of the most difficult aspects of detecting fraud on the Blockchain is that its users are anonymous [9].
Overall, anomaly detection is one of the important areas for protecting future blockchain networks, and a considerable amount of work is being undertaken on this subject from many perspectives, which are described in this paper. Ensemble approaches are prominent ways of increasing the prediction capacity of an ML model for anomaly detection. In theory, ensemble learning techniques use multiple classifiers to improve experimental outcomes. Conventional methods that rely on a single classifier for predictive analysis are often less effective, whereas combining individual classifiers in an ensemble can produce higher accuracy [112]. Example strategies include stacking, averaging, bagging, and boosting [10].
This research focuses on the fundamentals of BT, ML classification, and the combined contribution of ML and Blockchain to detect irregularities utilizing ensemble techniques. To aid comprehension, the study is divided into three sections: (2) Blockchain principles, (3) an overview of ensemble learning classification, and (4) developing the ensemble learning method for anomaly detection in blockchain networks.

A. Overview
Blockchain is presently one of the most promising technology trends, with great possibilities across many useful applications. It is basically a distributed and encrypted variation of a database, which can solve several difficulties connected to online security and trust. Its ability to manage data securely and in a decentralized manner has made Blockchain well known in the world of cryptocurrencies such as Bitcoin and Ether (Ethereum). Historically, the goal of producing tamper-proof documents led to the development of a cryptographic hashing scheme for storing documents in a chain of blocks [11]. In this scheme, hash-based cryptographic algorithms are used to store a collection of verified documents in Merkle tree format in each block [12]. The technology became well known after it was applied to cryptocurrencies such as Bitcoin, which was presented by Nakamoto [1] in 2008. This helped popularize Bitcoin as the first digital electronic payment mechanism that operates on a peer-to-peer (P2P) basis and in a decentralized ecosystem. Blockchains are divided into four categories: Private, Public, Hybrid, and Consortium. Fig. 1 depicts the categorization of Blockchain. A Public Blockchain is non-restrictive and permissionless [13]. This means anyone can take part in mining, and new blocks of transactions are added once they are settled through a consensus mechanism. This concept has been fundamental to the existence of Bitcoin and other cryptocurrencies in the Distributed Ledger Technology (DLT) ecosystem [1]. By contrast, centralized systems face challenges in terms of low security and a lack of transparency (dependence on third parties). Regarding data storage, the DLT ecosystem distributes data across nodes linked by a blockchain network, as opposed to a centralized system that stores data in a single location.
Technically, the consensus mechanism is an important algorithm in Blockchain operations that ensures the members of a blockchain network agree on certain conditions before the ledger is updated. Proof of Work (PoW) is a common consensus algorithm used in Public Blockchain environments. One of the benefits of this consensus is that as the number of miners grows, the risk of a 51 percent attack is reduced [16].
In contrast to the Public Blockchain, the Private Blockchain is operated by an organization, and only participants who have been granted access may enter the network. Private Blockchains are therefore also called "permissioned blockchains" or "business blockchains" [17]. A Private Blockchain has the same properties as a Public Blockchain: it is distributed, decentralized, and operates in a P2P environment. Typically, a Private Blockchain is used within a small organizational network, whereas anyone has the right to enter a public network. A consensus algorithm commonly used in the Private Blockchain (permissioned) is Practical Byzantine Fault Tolerance (PBFT).
In real-world Blockchain development, it is often necessary to use features of both Public and Private Blockchains. As a result, a Blockchain ecosystem known as Hybrid Blockchain [18] has emerged. Elements from the Private Blockchain (permissioned) are employed in the enterprise context, while Public Blockchain (permissionless) elements are used where the data must be open or public. Extending a Private Blockchain from a single organization to several participating organizations, so that the value of collaboration is higher, yields a "Consortium Blockchain" [18]. It combines features of a Public and Private Blockchain and is very similar to a Hybrid Blockchain. An important goal is to remove the restriction of access to a single organization that applies in a Private Blockchain environment.

B. Blockchain Architecture
In a decentralized ledger, all transactions in a Blockchain are stored in interconnected blocks. Each block contains a block header that stores critical information, including the timestamp, nonce, difficulty, block hash, and Merkle tree root, to keep these blocks related. This method guarantees the security of the data within the blocks, and the size of the witness determines the size of each block. One Bitcoin block, for example, is 1 MB in size [1]. Meanwhile, the Merkle tree applies the hash technique to each block transaction, as shown in Fig. 2. From an operational point of view, each block stores the address of the parent (previous) block in the form of a hash value. This mechanism helps to identify the chain sequence between these blocks. The first block generated when a blockchain network is constructed is termed the "genesis block." To ensure the uniqueness of each block, the timestamp records when each block was generated. For example, the current block has a more recent timestamp value than the previous block. This mechanism helps prevent double-spending.
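To make the parent-hash linkage concrete, the following is a minimal sketch, assuming a simplified header with only four fields and a JSON encoding chosen purely for illustration. The helper names (`block_hash`, `make_block`, `chain_is_valid`) are hypothetical, and real block headers (for example in Bitcoin) use a different binary layout and timestamp rule.

```python
import hashlib
import json


def block_hash(header: dict) -> str:
    """Hash a header deterministically with SHA-256 (illustrative encoding)."""
    encoded = json.dumps(header, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash: str, timestamp: int, merkle_root: str, nonce: int = 0) -> dict:
    """Build a simplified block header (hypothetical field names)."""
    return {
        "prev_hash": prev_hash,
        "timestamp": timestamp,
        "merkle_root": merkle_root,
        "nonce": nonce,
    }


def chain_is_valid(chain: list) -> bool:
    """Each block must store its parent's hash and carry a later timestamp."""
    for parent, child in zip(chain, chain[1:]):
        if child["prev_hash"] != block_hash(parent):
            return False
        if child["timestamp"] <= parent["timestamp"]:
            return False
    return True


genesis = make_block("0" * 64, timestamp=1, merkle_root="aa")
block1 = make_block(block_hash(genesis), timestamp=2, merkle_root="bb")
block2 = make_block(block_hash(block1), timestamp=3, merkle_root="cc")
chain = [genesis, block1, block2]

# Changing any field of block1 breaks the hash stored in block2.
tampered = [genesis, dict(block1, merkle_root="xx"), block2]

print(chain_is_valid(chain))
print(chain_is_valid(tampered))
```

Because every header embeds the hash of its parent, altering one block invalidates the link held by its successor, which is the property the chain sequence relies on.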
Blockchain environments, especially Bitcoin, are known for mining processes that use a pseudo-random number (nonce), which is used only once throughout the mining process. Note that the difficulty level is hard to hold at a specific target threshold. For example, the difficulty level rises when the number of transactions increases. As a result, block formation (the mining process) becomes increasingly complex and slower. It also affects cyber attackers and greedy miners who want to take advantage of many transactions and slow the processing. The Merkle tree cryptographically manages the hashes of the transactions in a block. It is described as a tree consisting of leaves and branches. Conceptually, each parent hash in the Merkle tree is constructed by combining its left and right child hashes. The generation of interconnected hashes forms a chain called a Blockchain. Therefore, an abnormality in the Merkle tree indicates that something is happening in the chain, and appropriate action can be taken immediately [19].
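The pairwise construction of parent hashes can be sketched as follows. This is an illustrative assumption-laden example: it uses a single SHA-256 over hex strings and duplicates the last hash when a level has an odd number of entries, whereas production chains define their own byte-level rules. The names `sha256_hex` and `merkle_root` are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: list) -> str:
    """Combine left and right hashes level by level until one root remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256_hex(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last hash when the level is odd.
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = merkle_root(["tx1", "tx2", "tx3"])
tampered_root = merkle_root(["tx1", "tx2", "tx3-tampered"])
print(root != tampered_root)
```

A change to any single transaction propagates up to the root, so comparing roots is enough to notice that a block's transaction set has been altered.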

C. Blockchain Layers
From the Blockchain Technology (BT) layer perspective, there are six layers in the blockchain network, as depicted in Fig. 3, each executing specialized activities [20,21]. The data layer provides cryptographic techniques that store data in hash, Merkle tree, and timestamp forms in both on-chain (Blockchain) and off-chain (database) settings. The network layer manages all of the nodes in the blockchain network, and its decentralized P2P environment maintains security and privacy. Transaction consistency is managed by the consensus mechanisms located at the consensus layer. The incentive layer manages the rewards paid to successful miners. Smart contracts must be secure, bug-free, and free from vulnerabilities, and the smart contract programmes are implemented at the contract layer. The application layer, which connects the end-user to the blockchain network, is the final layer. This layer comprises Blockchain applications (Decentralized Applications (DApps)) designed and constructed based on business cases in various sectors.

D. Consensus Algorithm
The blockchain network must verify each node's copy of the ledger for consistency and clarity. This is performed in a few steps that follow certain rules during the transaction process. Verification is carried out in a decentralized manner, with transactions completed in a distributed environment managed by P2P-connected nodes in the network. The approaches or algorithms utilized to reach a consensus are called consensus algorithms. Fig. 4 shows various widely used consensus algorithms, including Proof of Authority (PoA), PoW, Proof of Stake (PoS), and PBFT. Each node seeking to participate (mine) in the PoW consensus process must contribute resources by solving mathematical puzzles [14], which have varying levels of difficulty. PoW is the consensus technique used in Bitcoin [1] and Ethereum [22]. In PoW, only one miner among all participating nodes can generate each new block, while the other miners waste incentives and energy resources on the blockchain network [15].
As a result, PoS works better, since only nodes that can verify their stake are permitted to participate. It avoids the circumstance where one node owns the network, since no single node may hold 51 percent of the network's coins [23]. PoS can therefore cut energy consumption, reduce the number of miners, and increase transaction speed compared to PoW. In the PoA consensus, it is critical to obtain mutual agreement to ensure a transaction is valid. A node's blocks must be certified by a verified node, and the process continues through successive rounds as planned [24]. The PBFT consensus takes its name from the Byzantine generals analogy, in which consensus is difficult to reach while some nodes have not agreed. In PBFT, agreement is reached based on the leaders with the most weight [25].
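The PoW puzzle mentioned above can be illustrated with a toy nonce search. This is only a sketch under simplifying assumptions: the difficulty is expressed as a number of leading zero hex digits, the header is an arbitrary string, and the function name `mine` and its `max_nonce` cap are hypothetical. Real networks compare a double hash against a numeric target.

```python
import hashlib


def mine(header: str, difficulty: int, max_nonce: int = 1_000_000):
    """Try nonces in order until the digest starts with `difficulty` zeros."""
    target_prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{header}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
    return None  # no solution found within the nonce budget


result = mine("block-header-data", difficulty=3)
if result is not None:
    nonce, digest = result
    print(nonce, digest)
```

Each extra zero digit multiplies the expected number of attempts by 16, which shows why a higher difficulty makes block formation slower, while verifying a claimed nonce still takes a single hash.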

E. Bitcoin
Cryptocurrency is one of the most extensively used Blockchain applications today and is used worldwide. Bitcoin and Ether (Ethereum) are two digital currencies commonly used in the crypto realm. Satoshi Nakamoto was the first to introduce Bitcoin, successfully solving the double-spending problem while introducing digital currency use [1]. The Blockchain controls each transaction using a cryptographic process based on hash values on input and output sets from an operational standpoint. Only one input transaction from the whole blockchain network is used to generate the output [26].
Aside from that, Blockchain is linked to a P2P ecosystem for transaction management and network ownership. Decentralization is the clear distinction between Blockchain and traditional databases: each network node is accountable for storing a copy of the ledger [19]. In the Bitcoin ecosystem, anyone can participate in the network. Bitcoin is associated with the term "incentive" or "reward" because the PoW consensus pays miners who successfully perform the mining process a few Bitcoins after completing the operation. The Blockchain therefore operates in a decentralized manner and does not require a central body, unlike traditional financial systems, which are centralized in nature. The mining process is secure because it involves transactions hashed with the SHA-256 cryptographic technique. The popularity of Bitcoin as a Blockchain application for managing cryptocurrencies has prompted the development of several other cryptocurrencies and DApps.

F. Ethereum
Buterin's paper [27] launched Ethereum and addressed various problems with Bitcoin's scripting language. Compared with earlier designs, whose block headers contained only information such as the nonce, difficulty, and block number, Ethereum added the transaction list and state information to the block header. A new state is formed by applying the transaction list to the previous state. A notable difference between Bitcoin and Ethereum is the cryptographic hash function used: Ethereum uses Keccak-256 while Bitcoin uses SHA-256. The block header in Ethereum consists of hashes for gas fee information, timestamp, parent block header, state root, and additional hashes for verification purposes [28]. Ethereum provides a decentralized ecosystem for developers to develop products using the Solidity language and the Ethereum Virtual Machine (EVM). The Solidity language is used to write smart contract programmes based on business cases, which are compiled to bytecode and executed in the EVM [26].

G. Smart Contract
Historically, traditional contract management inspired the introduction of digital smart contracts by Szabo [29]. The main purpose of digital smart contracts is to automate traditional contract management. A smart contract is programme code written to automate the contract process. For operational purposes, smart contracts are integrated with Ethereum to be executed and stored in a decentralized ledger. Recently, smart contracts have been widely used in conjunction with BT in various fields [61,62]. Furthermore, the EVM environment and the Solidity programming language facilitate the development of smart contracts within Ethereum. This development has also attracted researchers to explore smart contracts on the Blockchain.

III. ENSEMBLE METHOD
Machine Learning (ML) algorithms have been widely applied in both supervised and unsupervised learning situations to construct systems capable of making realistic decisions in light of past data. Numerous classification-based ensemble methods have been developed to boost the accuracy of supervised Learning Algorithms (LAs), making ensemble methods prominent solutions for boosting the prediction capacity of an ML model. Ensemble approaches have also succeeded in several ML competitions. For instance, the winner of the popular Netflix Competition employed an ensemble method to create a robust collaborative filtering algorithm [30]. Another example is Knowledge Discovery in Databases (KDD) 2009, whose winner also used ensemble methods [31].
Conceptually, the ensemble approach combines several trained individual classifiers to produce a new classifier. Typically, these individual classifiers are termed weak learners, and combining them aims to make the model stronger in terms of accuracy. A challenge of using the original models individually is their exposure to high variance and bias. The ensemble strategy can reduce bias and variance to produce new combinations with better performance, as illustrated in Fig. 5. With reference to Fig. 6, ensemble generation takes place in the first phase, producing weak learners from the input data. Next, the weak learners are pruned and cleaned. Finally, the weak learners are combined in the last phase using the selected model. Past research has shown that the ensemble approach produces more accurate results and fewer false positives (FP) than individual classifiers, and that popular ensemble strategies are stacking, bagging, and boosting. The authors of [10] describe the ensemble as a variety of combined approaches consisting of the voting, averaging, stacking, bagging, and boosting methods. According to [32], the ensemble approach can address the shortcomings of traditional ML, such as mathematical, computational, and representation problems. Fig. 7 depicts the ensemble learning methodology and methods. Moreover, an ensemble has been explained as a model that incorporates the results from numerous other models to remedy the flaws of each one, and most options for this strategy can be classified as bagging or boosting [33]. For the averaging approach, the authors of [34] test different combinations of anomaly detection models and conclude that taking a simple average of the scores from different algorithms is a simple and successful solution.
Apart from that, the authors of [34] argue that combining multiple models is needed because the models address the problem from diverse aspects. Using ensemble learning, a combination of Random Forest (RF), Extra Trees, and Bagging classifiers demonstrated promising performance by averaging the probabilities predicted by these methods [35]. The authors of [36] describe how the ensemble method enhanced the capabilities of the individual classifiers and improved performance in the study results. Meanwhile, the study in [113] used a Deep Learning (DL) approach to produce predictions on medical datasets with an ensemble combination of single classifiers. The results show that the ensemble technique produces higher accuracy than the individual classifiers. Nowadays, more studies propose new methods or techniques for model optimization, whereas earlier work focused more on developing new models. Among them is a study by the authors of [114], who used ensemble techniques to develop a new model optimization method for the prediction of taxological applications. The experimental results of that study show that the ensemble technique produces better results than a single classifier.

A. Voting
Voting is the easiest ensemble procedure. Its main technique is the majority voting ensemble, sometimes called the max voting ensemble, which combines multiple individual classifiers of different types in order to improve on the performance of each individual model. Ensemble voting can be used for both classification and regression. For regression, the ensemble output is the mean of the individual predictions. For classification, the label that receives the majority of the votes among the individual predictions is assigned. In practice, ensemble voting is appropriate when all individual models show good performance. Fig. 8 illustrates voting ensemble learning. In the study in [37], a majority voting-based ensemble model was used to detect attacks in network traffic for an Intrusion Detection System (IDS). The authors [37] mentioned that many classifiers were employed for training and testing, and final findings were attained using the voting approach. Aside from the majority vote approach, the researchers also performed the investigation using the weighted voting method, in which each classifier's vote is multiplied by a weight before the votes are tallied. In that work, weighted majority voting was used to categorize the data, with Particle Swarm Optimization (PSO) employed to allocate weights to the classifiers [37].
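The two voting rules can be sketched in a few lines. This is an illustrative example rather than the procedure of [37]: the labels, the function names `majority_vote` and `weighted_vote`, and the hand-picked weights are assumptions, and no PSO weight search is performed.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Predictions of three hypothetical classifiers for one transaction.
predictions = ["attack", "normal", "attack"]

print(majority_vote(predictions))
# Hand-picked weights (assumption); the second classifier is trusted most.
print(weighted_vote(predictions, [0.2, 0.7, 0.1]))
```

With equal weights the two rules agree. Weighting lets one reliable classifier outvote several weaker ones, which is why the choice of weights matters.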

B. Averaging
The averaging method is the simplest strategy: the ensemble prediction is the average of the predictions made by the individual models. In general, this method generates a better regression model and reduces overfitting. A slightly modified variant is the weighted average model, in which each model's prediction is multiplied by that model's weight before the average is computed. Rank averaging is the process of allocating ranks to individual models, which determine the weight assigned to each model. Averaging and taking the maximum score are among the combination methods that can be used. The findings of a pilot experiment reveal that weighted averaging has been utilized to normalize the anomaly scores before they are combined, which balances the differing score ranges produced by different algorithms on different datasets [38]. According to [39], the final output is a weighted average obtained by grouping the lists of scores and assigning each list a weight that is inversely proportional to the size of its group. Fig. 9 illustrates averaging ensemble learning.
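As a small illustration of score averaging, the sketch below first rescales each detector's anomaly scores to a common range and then combines them with a simple or weighted average. Min-max scaling, the example scores, and the function names are assumptions made for this example and are not taken from [38] or [39].

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so detectors with different ranges are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def average_scores(score_lists, weights=None):
    """Average per-sample scores across detectors, optionally with weights."""
    if weights is None:
        weights = [1.0] * len(score_lists)
    total_weight = sum(weights)
    n_samples = len(score_lists[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists)) / total_weight
        for i in range(n_samples)
    ]


# Two hypothetical detectors scoring the same three transactions on different scales.
detector_a = min_max_normalize([0.1, 0.2, 0.9])
detector_b = min_max_normalize([10.0, 12.0, 50.0])

combined = average_scores([detector_a, detector_b])
weighted = average_scores([detector_a, detector_b], weights=[3.0, 1.0])
print(combined)
print(weighted)
```

Without the normalization step the second detector's larger raw values would dominate the average, which is the imbalance that normalizing before combining is meant to remove.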

C. Stacking
Stacking, or stacked generalization, is an alternative way of integrating numerous models. In the stacking technique, various individual models are integrated, among them Logistic Regression (LR), Naïve Bayes (NB), and Decision Tree (DT). Stacking learns to merge the predictions of several classification models through a single meta-classifier [31]. Meanwhile, the authors of [40] explained that stacking techniques in the ML approach could produce a more powerful model by training the individual models on datasets to improve accuracy. Basically, the stacking method uses the predictions made by one set of models as the input to another model.
From an operational point of view, the stacking technique is carried out sequentially. The process begins by training several selected individual models using a dataset sample. Subsequently, the probability outputs from each individual model go through a fine-tuning process before being combined into a final model. This procedure is repeated for as many stacking layers as are used. Finally, the final output is formed from the output generated by the models in the last layer, which are known as meta-classifiers. According to [41], the learning output at the base layer determines the final output produced by the stacking method. Fig. 10 depicts the usual two-layer stacking modelling approach.
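A two-layer stacking flow can be sketched with deliberately tiny models. Everything here is an assumption for illustration: the base learners are one-feature threshold rules rather than LR, NB, or DT, they output hard labels instead of probabilities, the meta-classifier is a lookup table of majority labels, and the toy rows treat a sample as anomalous (label 1) only when both features are high.

```python
from collections import Counter, defaultdict


def fit_stump(X, y, feature):
    """Base learner: predict 1 when row[feature] exceeds a learned threshold."""
    best_thr, best_err = None, None
    for thr in sorted({row[feature] for row in X}):
        err = sum(int(row[feature] > thr) != label for row, label in zip(X, y))
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return lambda row: int(row[feature] > best_thr)


def fit_meta(base_outputs, y):
    """Meta-classifier: majority label seen for each pattern of base predictions."""
    votes = defaultdict(Counter)
    for outputs, label in zip(base_outputs, y):
        votes[tuple(outputs)][label] += 1
    table = {pattern: counts.most_common(1)[0][0] for pattern, counts in votes.items()}
    default = Counter(y).most_common(1)[0][0]
    return lambda outputs: table.get(tuple(outputs), default)


# Layer 1: one base learner per feature, trained on the first part of the data.
X_base = [(1, 1), (2, 8), (8, 2), (9, 9), (3, 3), (7, 7), (6, 1), (1, 6)]
y_base = [0, 0, 0, 1, 0, 1, 0, 0]
base_models = [fit_stump(X_base, y_base, feature) for feature in (0, 1)]

# Layer 2: the meta-classifier is trained on base predictions for held-out rows.
X_meta = [(8, 8), (2, 9), (9, 1), (1, 2), (7, 9), (8, 3)]
y_meta = [1, 0, 0, 0, 1, 0]
meta_inputs = [[model(row) for model in base_models] for row in X_meta]
meta_model = fit_meta(meta_inputs, y_meta)


def stacked_predict(row):
    """Feed the base predictions for `row` into the meta-classifier."""
    return meta_model([model(row) for model in base_models])


print(stacked_predict((9, 8)), stacked_predict((9, 2)), stacked_predict((3, 9)))
```

Each base rule alone would flag a row such as (9, 2), but the meta-classifier learns from the held-out rows that both rules must fire, which is the sense in which the second layer corrects the first.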

D. Bagging
Bootstrap aggregating (bagging) was first described in [43]. It is one of the simplest ensemble approaches and is best suited for problems involving small training datasets. Sequential and parallel ensemble methods are the two predominant paradigms for constructing ensemble models, and bagging belongs to the parallel paradigm because its models are trained independently of one another. Technically, several datasets are formed by random sampling with replacement from the original dataset, and these datasets are used to train different models. Then, the results of the models are aggregated to form a single output. Bagging is used in regression and classification to improve the precision of ML algorithms, and it uses the most prevalent techniques for combining the outputs of base learners, namely averaging for regression problems and voting for classification tasks. Among the algorithms commonly used in the bagging technique is the DT. According to [44], this algorithm is well suited to bagging because it is a weak model with high variance. Other classifiers, such as K-Nearest Neighbour (KNN) and NB, are also used in the bagging technique. Furthermore, it is difficult to create a model from large and complex data using a single simple method. Consequently, bagging approaches are well suited to managing both high-dimensional and large-capacity data. Fig. 11 depicts an illustration of the Bagging algorithm procedure.
1) Random Forest (RF): The Random Forest is, as its name suggests, a forest comprised of numerous trees. In general, RF (tree-based) uses a DT as the individual model, and each tree depends on a set of randomly generated parameters. Similar to other ensemble algorithms, RF produces predictions by combining numerous separate models. Basically, the RF procedure consists of multiple steps. First, bootstrap samples are randomly generated from the dataset. Then, a DT is constructed on each data sample, and the prediction of each tree is obtained. Lastly, a voting phase produces the final output: the prediction that receives the most votes is selected [45].
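The bagging steps described above (bootstrap sampling, one model per sample, then voting) can be sketched as follows. The example is a simplification under stated assumptions: the data are one-dimensional, the base model is a single threshold rule rather than a full DT, and the random feature selection that distinguishes RF is omitted. The function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a threshold on (value, label) pairs: predict 1 when value > threshold."""
    best_thr, best_err = None, None
    for thr in sorted({value for value, _ in sample}):
        err = sum(int(value > thr) != label for value, label in sample)
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return best_thr


def bagging_fit(data, n_models, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        sample = rng.choices(data, k=len(data))
        thresholds.append(fit_stump(sample))
    return thresholds


def bagging_predict(thresholds, value):
    """Aggregate the stump predictions by majority vote."""
    votes = Counter(int(value > thr) for thr in thresholds)
    return votes.most_common(1)[0][0]


# Toy data: values above 5 are labelled 1.
data = [(v, int(v > 5)) for v in range(1, 11)]
thresholds = bagging_fit(data, n_models=25)
print(bagging_predict(thresholds, 9), bagging_predict(thresholds, 2))
```

Because each stump sees a slightly different resample, the learned thresholds vary, and voting over them smooths out that variance.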
2) Isolation Forest (IF): The Isolation Forest (IF) algorithm was first proposed in 2008 [46]. Like other tree ensemble methods, this approach is based on DTs. It operates on the premise that an instance that is easier to separate from the others in a random subsample of the dataset is likely to be an outlier. It begins by drawing a random sample from the dataset and selecting a random dimension. A random value within the range of that dimension is then selected to divide the sample into two pieces. Next, the root node of a tree is built using the selected dimension and splitting point. Further nodes are produced recursively for the subsamples until no further subdivision is possible or a preset tree depth is reached. In this tree, a point that is isolated closer to the root node is more likely to be an anomaly. Nevertheless, this could be due to random chance. Therefore, the entire tree generation technique is repeated for additional samples until the necessary number of trees is achieved, and the anomaly score is computed using the mean path length traversed across the trees. The authors of [46] claim that their algorithm is superior to other alternatives for addressing masking (clusters of anomalies that conceal one another) and swamping (mistakenly identifying normal instances as anomalies).
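The isolation idea can be sketched by following a single point through random splits and averaging the number of splits over many random subsamples. This is a simplified illustration, not the algorithm of [46] in full: it follows only the branch containing the query point instead of storing trees, and it omits the path-length adjustment and score normalization of the original method. The function names, the grid-shaped toy data, and the parameter defaults are assumptions.

```python
import random


def path_length(point, sample, rng, depth=0, max_depth=10):
    """Count the random splits needed before `point` is separated from `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    # Only split on dimensions where the remaining rows still differ.
    dims = [d for d in range(len(point)) if len({row[d] for row in sample}) > 1]
    if not dims:
        return depth
    dim = rng.choice(dims)
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    split = rng.uniform(lo, hi)
    # Keep only the rows that fall on the same side of the split as the point.
    same_side = [row for row in sample if (row[dim] < split) == (point[dim] < split)]
    return path_length(point, same_side, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees=200, sample_size=16, seed=0):
    """Average the path length of `point` over many random subsamples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += path_length(point, sample, rng)
    return total / n_trees


# Toy data: a tight grid of normal points plus one distant point.
cluster = [(x / 10, y / 10) for x in range(3, 8) for y in range(3, 7)]
outlier = (10.0, 10.0)
inlier = (0.5, 0.5)
data = cluster + [outlier]

print(mean_path_length(outlier, data), mean_path_length(inlier, data))
```

The distant point is usually cut off by the first split, so its mean path length is short, while a point inside the cluster needs several splits. A shorter mean path therefore indicates a more anomalous point.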

E. Boosting
Boosting is a strategy for enhancing the performance and accuracy of the ML approach by transforming weak base learners into strong ones [47], as shown in Fig. 12. The fundamental premise of the boosting strategy is to add new models to the ensemble sequentially. In general, the boosting technique draws training samples from the main dataset randomly with replacement, one round after another, and learns a sequence of models. The process begins by training a weak model on a training dataset and then producing a second model that fixes the weaknesses of the first. Subsequently, a third model is produced that overcomes the weaknesses of the previous two models. This process continues until the errors are fixed or a preset number of models is reached, and the final model is made. Last, weighted majority voting is used to build the final model from the weak models [48,49]. Boosting techniques have been shown to increase accuracy and reduce bias and variance. Among the algorithms widely used in boosting techniques are Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosted Machine (LightGBM).

1) Adaptive Boosting (AdaBoost):
AdaBoost was the first truly successful boosting method for binary classification. It was originally referred to by its inventors as AdaBoost.M1 [50]. More recently, it has been referred to as "discrete AdaBoost" because it is used for classification rather than regression. AdaBoost can be used to increase the performance of any ML model and is best suited to weak learners. The strategy turns weak learners into a strong one by iteratively correcting their mistakes. Weak learners are trained in succession on a weighted training dataset, in which the weights of misclassified instances are increased so that later learners concentrate on them. Subsequently, the weak learners are combined into a single strong learner through weighted voting [50]. The one-level DT is the best-suited and, thus, the most popular base algorithm employed with AdaBoost. Since these trees are so short and contain only one classification decision, they are often referred to as decision stumps.
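As an illustration of the reweighting and weighted voting described above, the following is a minimal sketch of discrete AdaBoost with decision stumps in plain Python. The function names and the ten-point toy dataset are assumptions chosen for this example, and labels are assumed to be coded as -1 and +1. No single stump can separate the toy data, but three boosted stumps can.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best one-level tree."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity


def adaboost_fit(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform instance weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase weights of misclassified instances, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): positives at both ends, negatives in the middle.
X = [[x] for x in range(1, 11)]
y = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

single = adaboost_fit(X, y, rounds=1)
boosted = adaboost_fit(X, y, rounds=3)
single_errors = sum(adaboost_predict(single, xi) != yi for xi, yi in zip(X, y))
boosted_errors = sum(adaboost_predict(boosted, xi) != yi for xi, yi in zip(X, y))
```

On this data a single stump misclassifies three points, whereas the weighted vote of three stumps classifies every training point correctly.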
2) Extreme Gradient Boosting (XGBoost): Extreme Gradient Boosting, or XGBoost, is a scalable ML approach for tree boosting that was presented by Chen and Guestrin [51]. XGBoost is a gradient boosting-based model that uses additional boosting strategies to produce more accurate predictions than other gradient boosting models [52]. The advantages of this technique have been acknowledged in various fields of ML and data science. For example, of 29 challenge-winning solutions featured on the Kaggle blog, 17 used XGBoost [53]. XGBoost builds on the strengths of boosted tree algorithms to deliver accurate and scalable gradient boosting, and it has been designed with fast computation and improved model performance in mind. In general, XGBoost parallelizes the construction of trees, which are grown level by level, and the weak learner added at each iteration corrects the errors of its predecessors. As with other ensemble approaches, the final prediction is a combination of the individual models.
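The idea that each iteration corrects the errors of its predecessors can be sketched with plain gradient boosting on a squared-error loss. This is deliberately not XGBoost itself: it omits the regularization, second-order information, and parallel tree construction that XGBoost adds. The function names, the learning rate of 0.5, and the toy data are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Least-squares one-split regressor: returns (threshold, left_value, right_value)."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, rounds, learning_rate=0.5):
    """Fit each new stump to the residuals (errors) left by the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((thr, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= thr else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps


def predict(model, x, learning_rate=0.5):
    """Sum of the base value and the shrunken contribution of every stump."""
    base, stumps = model
    return base + sum(learning_rate * (lv if x <= thr else rv) for thr, lv, rv in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy regression data (hypothetical): y = x^2 on ten points.
xs = list(range(10))
ys = [x * x for x in xs]
mse_1 = mse(boost(xs, ys, rounds=1), xs, ys)
mse_20 = mse(boost(xs, ys, rounds=20), xs, ys)
```

Because every stump is fitted to what the current ensemble still gets wrong, the training error after 20 rounds is lower than after one round.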
3) Light Gradient Boosted Machine (LightGBM): LightGBM, or Light Gradient Boosted Machine, was described by Guolin Ke et al. in 2017 [54]. LightGBM is a gradient boosting implementation designed to be more efficient, and possibly more effective, than previous gradient boosting implementations. According to the authors [54], the solution includes two main concepts: 1) Gradient-based One-Side Sampling (GOSS); and 2) Exclusive Feature Bundling (EFB). GOSS is a variation on the gradient boosting approach that prioritizes training samples with larger gradients, accelerating learning and reducing the method's computational complexity. EFB, in contrast, is a method for bundling sparse (mostly zero) mutually exclusive features, such as one-hot encoded categorical inputs, and can therefore be regarded as a form of automatic feature selection. Through these concepts, LightGBM provides a tree-based algorithm capable of high performance in classification, ranking, and various other ML tasks. LightGBM is also reported to be a fast, memory-efficient, and accurate gradient boosting framework that is compatible with large datasets. Normally, a DT in a boosting method is grown level by level (depth-wise). LightGBM differs in that it grows the tree leaf-wise, splitting the leaf that yields the largest reduction in loss. This approach can therefore achieve a high level of accuracy by minimizing the loss, an outcome rarely achieved by other existing boosting algorithms.
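The sampling step of GOSS can be sketched as follows. This is a simplified illustration rather than LightGBM's internal code: the function name `goss_sample`, the rates of 0.2 and 0.1, and the toy gradients are assumptions. The sketch keeps every instance with a large gradient, draws a random subset of the remaining instances, and up-weights that subset by (1 - a) / b so that the small-gradient instances are still represented in proportion.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return {instance_index: weight} for the instances kept by one-side sampling."""
    rng = random.Random(seed)
    n = len(gradients)
    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top_idx = order[:n_top]
    rest = order[n_top:]
    sampled = rng.sample(rest, min(n_other, len(rest)))
    amplify = (1.0 - top_rate) / other_rate  # compensates for the dropped instances
    weights = {i: 1.0 for i in top_idx}
    weights.update({i: amplify for i in sampled})
    return weights


# Toy gradients (hypothetical): instance i has gradient i / 100.
gradients = [i / 100 for i in range(100)]
weights = goss_sample(gradients)
```

With 100 instances, the 20 largest-gradient instances are kept with weight 1, and 10 of the remaining 80 are kept with weight 8, so the sampled small-gradient instances stand in for all 80.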

IV. ENSEMBLE ANOMALY DETECTION IN BLOCKCHAIN
Nowadays, the development of Blockchain Technology (BT) is no longer focused solely on cryptocurrency but has expanded to Decentralized Applications (DApps) in various fields. The features available in BT provide advantages in terms of transparency, immutability, enhanced security, fast transactions, and high privacy. As a result, many applications use BT in sectors such as finance, supply chain, halal products, pharmaceuticals, education, and government. In cryptocurrency, Bitcoin and Ethereum are the most popular and widely used applications due to their high market capitalization and trading volume. Bitcoin constitutes about 39.53 percent of the market's entire value [55], while Ether is the second-biggest cryptocurrency [3]. Meanwhile, Ethereum is the largest and most widely used decentralized Blockchain platform for smart contracts. The widespread use of Bitcoin and Ethereum has given rise to critical issues of cybercrime and security. Many users have become victims of frauds such as phishing and Ponzi schemes, and fraud has reportedly been detected in more than 10 percent of Initial Coin Offerings (ICOs) on Ethereum. The Ethereum blockchain network is a public distributed ledger with around 1.158 million daily transactions [56] and can be categorized as big data. Therefore, manually combing through all of these transactions to find those exhibiting unusual characteristics would be impracticable and interminable. In this scenario, Machine Learning (ML) algorithms can help differentiate between normal and abnormal transaction behavior among user accounts by learning the attributes that correspond to each. Therefore, an approach to detecting transactions that show abnormalities, known as anomaly detection, was introduced.
Nowadays, this method is increasingly used in various fields to detect patterns of abnormality, and it plays a particular role in the Blockchain ecosystem. A detection model developed using ML helps detect and predict attacks on the blockchain network at an early stage. Fig. 13 visualizes normal and anomalous transactions to give a better understanding of anomalous transactions. Anomalies are also referred to as oddities, deviations, noise, novelties, exceptions, and outliers [7]. The combination of Blockchain and ML technology benefits both sides, as shown in Fig. 14. The Blockchain ecosystem is known for its very large data storage and can be regarded as big data. There is also data from external sources such as smart devices, the Internet of Things (IoT), and external applications that store data in an off-chain database. Data from these various sources is analyzed using ML techniques to produce analytical dashboards, predictions, visualizations, and other outputs that can help with planning, monitoring, and decision-making. In earlier studies, numerous ML algorithms have been applied in supervised [57] and unsupervised learning [58] for anomaly detection in blockchain networks. Random Forest (RF) [59], Decision Tree (DT) (J48) [60], Extreme Gradient Boosting (XGBoost) [61], Adaptive Boosting (AdaBoost) [62], secureSVM [63], Light Gradient Boosted Machine (LightGBM) [64], K-Nearest Neighbour (KNN) [65], Support Vector Machines (SVM) [66], and Naïve Bayes (NB) [67] are examples of supervised learning models. Among the unsupervised learning models that have been utilized are Isolation Forest (IF) [68], One Class Support Vector Machine (OCSVM) [69], K-means [70], Density Based Spatial Clustering of Application with Noise (DBSCAN) [71], and Long Short Term Memory (LSTM) [72]. This article evaluates the ensemble learning method for detecting anomalous or criminal transactions in blockchain networks.
Ensemble learning gave good results and strong performance in experiments on recognizing malicious Ethereum entities [73]. Moreover, ensemble learning, a mixture of ML predictors, has been shown to outperform classical learning approaches at predicting licit and illicit transactions. In that experiment, ensemble learning is characterized as a classification method based on an average probability ensemble constructed from the best-performing supervised learning methods employed [35]. According to [74], individual classifiers have difficulty processing high-complexity data, and this issue has been handled by developing a classification model using the ensemble approach. In a Proof of Concept (PoC) development project for a decentralized unmanned aerial vehicle (UAV), the ensemble stacking method was applied to a variety of individual models to assess its predictive accuracy [75]. The completed literature evaluation led to the classification of prior research articles on these applications published from 2017 to 2022. Publications were divided into four aspects: anomaly detection in cybercrime (see Table I), security (see Table II), information processing (see Table III), and smart devices (see Table IV).

A. An Anomaly in the Aspect of Cybercrime
Cybercrime means using computers, tools, or materials with the intent to commit illegal acts [76]. BT's openness, transparency, and immutability have prompted malicious parties to commit criminal activities. Most cyberattacks are performed for financial benefit. In the cryptocurrency era, hackers prefer to receive their ransoms in cryptocurrencies, which offer anonymity and easy transfer across countries. Therefore, one effective countermeasure is to use ML techniques to detect abnormalities in blockchain network transactions, and many previous studies have reported detecting transaction abnormalities using anomaly detection methods. In this review, we identified 31 publications among the selected papers that address the cybercrime aspect, as shown in Table I. Referring to Table I, cybercrime aspects are categorized according to the type of application case, namely smart contracts, illicit transactions, scams (pump and dump), fraud detection, ransomware, Ponzi schemes, money laundering, High Yield Investment Programs (HYIP), and phishing, as shown in Fig. 15. As indicated in Table I, RF was the most commonly utilized ensemble model (base learner) in the chosen research publications. In addition, 20 research publications utilized bagging as an ensemble approach, and 11 adopted the boosting method. Furthermore, we found 24 research articles that were applied to Bitcoin and Ethereum. Since developing an ML model relies on the dataset, we analyzed the data sources of the ML models for anomaly detection in the selected publications. The analysis showed that 30 different datasets were used in the experiments. The earlier investigations also employ the ensemble learning method in numerous ways, among them combining ensemble approaches to produce a good result.
The reviewed papers describe numerous techniques for this hybrid scenario, including bagging with voting, bagging with averaging, bagging with boosting, and bagging with stacking.
The authors of [77] suggested a pre-encryption detection algorithm (PEDA) that identifies ransomware using an ML approach, assessing and categorizing it with a bagging and majority voting ensemble learning technique. The research was conducted in two phases using a dataset created by Resilient Information System Security (RISS) from Imperial College London. The focus here is on the Learning Algorithm (LA) implemented in Phase 1. In general, the LA works through an ensemble DT approach. First, the LA model was simulated using the Application Programme Interface (API) data generated by suspicious software under inspection. Then, its performance was compared with three other models, namely NB, RF, and an ensemble of RF and NB. Finally, the model was selected using the majority voting method. The results showed that the LA model performed better than the individual RF and NB models and the RF and NB ensemble. The measurement metrics were detection rate (DR), False Positive Rate (FPR), Area Under the ROC Curve (AUC), and test error. Ensemble techniques have also worked well in blockchain networks, where they have been used to predict both licit and illicit transactions [35]. In that experiment, bagging with averaging was applied to predict licit and illicit transactions in the blockchain network. The proposed ensemble (RF, Extra Trees, and Bagging classifiers) performed best in a comparison with RF, Multilayer Perceptron (MLP), and Logistic Regression (LR). In an average probability ensemble, classification is done by employing several pre-trained ML models, and the final predictions are formed by averaging the prediction probabilities received from the LAs.
The results demonstrate that ensemble learning classifies licit and illicit transactions with an accuracy of 98.13 percent and an F1 score of 83.36 percent.
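The averaging step of an average probability ensemble can be sketched in a few lines of Python. The probabilities below are invented for illustration and are not results from [35]. The model names, the 0.5 decision threshold, and the function names are likewise assumptions. Each inner list is assumed to hold one pre-trained model's predicted probability that each transaction is illicit.

```python
def average_probability_ensemble(prob_lists):
    """Average the per-sample probabilities predicted by several models."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    return [sum(model[i] for model in prob_lists) / n_models for i in range(n_samples)]


def classify(avg_probs, threshold=0.5):
    """Label a sample 1 (illicit) when its averaged probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in avg_probs]


# Hypothetical illicit-class probabilities for three transactions.
random_forest = [0.9, 0.2, 0.6]
extra_trees = [0.8, 0.1, 0.4]
bagging = [0.7, 0.3, 0.3]

avg_probs = average_probability_ensemble([random_forest, extra_trees, bagging])
labels = classify(avg_probs)
```

Here the third transaction is flagged by one model (0.6) but not by the other two, and the averaged probability of about 0.43 leaves it labelled licit.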
The authors of [78] give a comprehensive evaluation of different supervised ML algorithms, such as bagging models (RF), boosting models (AdaBoost), and others, for fraud prevention. This research concluded that the AdaBoost and RF classifiers produced the best performance, ahead of the other seven algorithms.
Feature selection plays an important role in producing better results with the ensemble approach. This was demonstrated by [74], who compared ensemble classifiers (boosting, stacking) with and without feature selection. The results showed that feature selection increased the F-score by 7 to 9 percent and accuracy by 2 to 3 percent.

B. An Anomaly in the Aspect of Security
BT does not guarantee freedom from security issues. Therefore, there is a need to establish risk management through a comprehensive cybersecurity framework and to undergo security assessment services to protect against attacks and abuse by hackers. This review found a total of 31 research papers involved in the study of the security aspect, as shown in Table II. This in-depth study examines the use of ensemble techniques to find anomalous transactions in a blockchain network. According to Table II, security elements are largely split into backdoor attacks, vulnerability identification, crypto-jacking, under-priced Denial of Service (DoS) attacks, intrusion detection, miner detection, malware, cybersecurity frameworks, protection of private information, botnets, malicious account detection, and so on, as shown in Fig. 16. As shown in Table II, RF was the most commonly utilized ensemble model (base learner) in the selected research publications. In addition, four research publications utilized bagging as an ensemble approach, two adopted the stacking method, and one applied boosting and voting. Moreover, we identified four research publications that were applied to Ethereum. The dataset is a crucial component of ML model construction, so this analysis also considers the datasets used in prior studies. The selected research used five distinct datasets. In reviewing studies on the security aspect, it was determined that two research publications combined several ensemble (hybrid) techniques to achieve better performance.
Of these, one research paper utilized the stacking with boosting strategy, and one used bagging with voting. The authors of [73] offered strategies for detecting malicious entities that employ versions of RF, SVM, LR, and ensemble methods with stacking and boosting (AdaBoost classifier). With an average F1 score of 0.996, the study's findings demonstrate that the ensemble technique yields effective outcomes. The study's strategy is to establish a framework for identifying entities that could potentially harm blockchain networks.
To achieve this objective, the conventional Exploratory Data Analysis (EDA) methodology is implemented via data collection, feature extraction, model training, model testing, and evaluation of the final outcomes. The study's results also demonstrated that feature extraction is an effective strategy for achieving positive outcomes. Research on under-priced DoS attacks was proposed by the authors of [99]. In that study, transactions are simulated using several input features, namely pending time, value, gas price, and gas. Several ML models were used, such as NB, SVM, KNN, RF, and DT, together with a voting technique based on two criteria, namely majority vote (hard) and average confidence (soft). The study concluded that the experimental results showed good performance in detecting under-priced DoS attacks. Conventional UAVs generally depend on a centralized server to execute data processing with complicated ML techniques, and classic cyberattacks on data transmission and storage therefore remain relevant to UAVs, unlike in a decentralized Blockchain. In this regard, [75] proposes to boost the performance of UAVs with a decentralized ML architecture based on Blockchain, using ML in Blockchain applications to generate predictive analysis. The study also examines resource utilization and overhead performance, and argues that decentralizing the ML model is a sound way to produce high-quality forecasting. Two experiments were conducted, with and without stacking, and the PoC found that stacking made the forecasting analysis more accurate.
Fig. 16. Classification of the Security Aspect.
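The two voting criteria mentioned for [99], majority vote (hard) and average confidence (soft), can give different answers for the same transaction. The following minimal sketch shows such a case. The three confidence values are invented, and the function names and the 0.5 threshold are assumptions for illustration. An odd number of models is assumed so that the hard vote cannot tie.

```python
from collections import Counter


def hard_vote(probs, threshold=0.5):
    """Each model casts one vote (1 = attack); the majority label wins."""
    votes = [1 if p >= threshold else 0 for p in probs]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probs, threshold=0.5):
    """Average the models' confidences, then apply the threshold once."""
    return 1 if sum(probs) / len(probs) >= threshold else 0


# Hypothetical attack confidences from three models for one transaction.
confidences = [0.55, 0.60, 0.05]
hard_label = hard_vote(confidences)
soft_label = soft_vote(confidences)
```

Two of the three models lean weakly toward "attack", so the hard vote returns 1, while the strongly confident third model pulls the average down to 0.4 and the soft vote returns 0.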

C. An Anomaly in the Aspect of Information Processing
Information processing is the capturing, recording, organizing, retrieving, displaying, and disseminating of information. In recent years, the term has often been applied to computer-based activities. In this part, we identified 31 papers among the selected publications that address information processing. These applications are listed in Table III. According to Table III and Fig. 17, information processing components are primarily categorized as Blockchain simulator, performance testing, network traffic, social media, data analysis, address identification, transaction clustering, and behavioural pattern. As indicated in Table III, there are three research articles, and the most commonly employed ensemble model (base learner) was RF. In addition, one research paper utilized bagging as an ensemble approach, one adopted the stacking method, and one applied the boosting method. Furthermore, we found three publications that were applied to Bitcoin. Since the development of an ML model depends on the dataset input, this analysis also looked at the data sources, identifying three different types in the selected studies. The authors of [102] employ cascading ML principles, a sort of ensemble learning that uses stacking techniques. The study's simulations used GB and RF as weak classifiers. The ensemble stacking method yielded effective classification outcomes based on F1-score, recall, and accuracy values.
The voting-based method developed by the authors of [103] aims to improve Bitcoin tracking performance by labeling addresses controlled by the same user. The study uses Bitcoin datasets taken from previous publications [104,81] and WalletExplorer. In simulations on 200K Bitcoin addresses, the voting method produced better results than the non-voting method in terms of F1 score, recall, and precision. Labeling with supervised learning methods was used to develop a classification model for detecting anomalies in Bitcoin addresses [64]. That experiment was conducted using eight classifiers, namely LightGBM, XGBoost, NN, AdaBoost, RF, SVM, Perceptron, and LR, and showed that the LightGBM classifier produced the best results, with micro/macro F1 scores of 97 percent/86 percent.

D. An Anomaly in the Aspect of Smart Devices
Smart devices are generally IoT gadgets with support for Internet connectivity. They can interact with other devices over the Internet and offer a user remote access for operating the device as needed. In this section, we selected three papers exploring smart device applications. These applications are listed in Table IV. According to Table IV, smart device characteristics are largely grouped as illustrated in Fig. 18. As indicated in Table IV, there are two research articles, and the most often employed ensemble models (base learners) in the selected research papers were XGBoost and AdaBoost. In addition, two research publications utilized boosting. Furthermore, we located one research paper applied to a Blockchain simulator. From the perspective of datasets, the study identified four distinct dataset categories used in the selected studies, which matters because the ML model to be constructed depends on the dataset used.
The authors of [61] describe the design and architecture of their Blockchain simulator, BlockEval, which simulates the behaviour of concurrent activities in a real-life Blockchain system. The research confirmed the correctness of the simulator by comparing it with an independent model constructed using genuine Bitcoin transaction data. XGBoost, a supervised LA used for classification and regression, predicts the target value by learning simple decision rules inferred from data attributes. Simulation results were obtained for up to 2000 nodes and checked against actual Bitcoin data. However, there is scope for enhancing both the simulator and the validation architecture. For instance, adding propagation latency data with a suitable variance would increase the accuracy of the simulation findings. IoT-related research concentrating on data integrity and security has been undertaken by [42]. An important task is to discover irregularities in data transactions using ML approaches. Hence, the IoTID20 dataset, consisting of 80 features (62578 records), was used to train the model. The study took 15 features designated as normal and abnormal. Different classification models were trained and compared on measurement parameters such as F1 score, recall, precision, and accuracy. The experimental results reveal that the AdaBoost and RF algorithms provide similar results and are among the best-performing classifiers.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 8, 2022
As Tables I to IV show, several studies have been conducted and published since the creation and application of Machine Learning (ML) algorithms in blockchain networks. In this investigation, the researchers' implementation of the ensemble method has demonstrated a pattern of improvement.
The ensemble strategy is based on combining multiple individual models to generate a model with superior performance compared to a single weak classifier. Researchers are continually on the lookout for procedures that provide better results than present approaches. Consequently, merging multiple ensemble algorithms can give superior results compared to using an individual ensemble algorithm. For instance, combining stacking and boosting can improve performance.
According to Fig. 19, 51  Analyzing ensemble learning research in cybercrime, security, smart devices, and information processing employing an ensemble approach with distinct techniques (e.g., voting, averaging, stacking, bagging and boosting) for anomaly detection in blockchain networks. Moreover, we found that the cybercrime aspect (16 research articles) was the most popular for anomaly identification in the blockchain network. In contrast, five research publications focused on security aspects, three on information processing, and one on the smart device aspect. Fig. 21 indicates a fast-increasing tendency to adopt bagging methods over four years (from 2017 to 2020), followed by a declining trend in 2021. The research publications utilizing the boosting method, on the other hand, show growth from 2017 to 2021. Apart from that, 31 distinct datasets were found to be used in the experiments of the related papers. As depicted in Fig. 22, most experiments utilize real-time datasets retrieved using the Etherscan Application Programme Interface (API).
There are also some prospective challenges in this domain. In addition to analyzing prior studies, several directions for upcoming studies can be highlighted. Among these are studies that do not employ feature selection, which several prior studies have shown to improve results. In addition, the majority of studies utilize obsolete datasets. It is therefore recommended that researchers regularly update their data, because recent scams and cyberattacks leave crucial data that must be analyzed to develop better experiments. This is supported by [106], who concluded that the use of outdated data contributed to a drop in the efficacy of attack detection. Furthermore, the authors of [107] concur that researchers should utilize current datasets for their studies.
Exploring new technologies such as ML Designer and AutoML offers researchers further options for conducting research. Applying feature selection has also yielded positive results: [74] compared anomaly detection with feature selection against detection without it. Using synthetic data sources is another approach that can aid in producing more precise research. For example, this strategy was utilized by [108], who employed synthetic credit card data to detect credit card fraud. Additionally, the authors of [99] utilized artificial data to imitate network attack activities. Researchers should also look into techniques to automate various preprocessing stages [109], as well as to expand and enlarge datasets [110]. In addition, more dedicated preprocessing steps should be adopted for more specific challenges to improve the results of Ssoft-TeC and to provide a more appropriate base learner for the co-training scheme [111].

VI. CONCLUSION
This paper examines Blockchain Technology (BT) and the integration of Blockchain and Machine Learning (ML). It reviews previous research on the use of ensemble approaches as a means of anomaly detection. This investigation demonstrates that ensemble strategies can enhance performance and results. Merging numerous weak models unifies them into stronger models. Moreover, a mix of ensemble techniques (such as stacking and bagging) can generate more accurate findings, as demonstrated by several earlier studies.
As demonstrated in Tables I to IV, bagging and boosting are the two approaches used most regularly in the studies over these five years (2017-2021). These two strategies deliver their greatest outcomes largely among research released in 2019 and 2020. In the past two years, we also observed a new trend toward the use of the boosting method. Moreover, among the models employed in the ensemble learning approach, Random Forest (RF) dominated from 2017 to 2020. In 2021, the use of this model declined, whereas Extreme Gradient Boosting (XGBoost) exhibited a growing tendency from 2017 to 2021.

ACKNOWLEDGMENT
This research was conducted to fulfil the requirements for a PhD and with the support of RMIC (UniSZA).