A Comparative Analysis of Scalability Issues within Blockchain-based Solutions in the Internet of Things

Recently, both academia and industry have shown enormous interest in concepts and techniques for connecting heterogeneous IoT devices. IoT is now considered a rapidly evolving technology, with billions of IoT devices expected to be deployed around the globe in the upcoming years. These devices must be maintained, managed, traced, and secured in a timely and flexible manner. Previously, centralized approaches constituted the mainstream solutions for handling the ever-increasing number of connected IoT devices. However, these approaches may be inadequate to handle devices at a massive scale. Blockchain, as a distributed approach, presents a promising solution to the connectivity concerns of IoT devices. However, current Blockchain platforms face several scalability issues in accommodating diverse IoT devices without losing efficiency. This paper performs a comprehensive analysis of recent blockchain-based scalability solutions applied to the Internet of Things domain. We propose an evaluation framework for scalability in IoT environments, encompassing critical criteria such as throughput, latency, and block size. Moreover, we assess the notable scalability solutions and conclude by highlighting six overarching scalability issues of blockchain-based solutions in IoT that ought to be resolved by the industry and research community.

Keywords: Blockchain; IoT; scalability; issues; distributed ledger; throughput; latency


I. INTRODUCTION
Internet of Things (IoT)-based solutions have evolved to cover every aspect of our daily lives. IoT technology has been deployed in various environments, including smart homes, healthcare, and industry [1] [2]. It is a collection of smart devices that are connected like a swarm of heterogeneous nodes. For decades, the centralized approach has been the widespread solution for such environments. However, the rapid increase in the number of these nodes has made them impractical to manage and maintain with the traditional centralized approach due to various scalability and speed challenges.
A decentralized approach seems to be a preferable candidate for addressing the challenges within such complex environments. It can help solve many challenges attached to IoT environments while reducing the significant costs of the previously adopted centralized approach [80]. Blockchain technology is one of the best-known decentralized approaches deployed to resolve concerns related to IoT devices [3]. It has demonstrated its efficiency and performance in the financial domain with applications such as Bitcoin and Ethereum [4] [5]. Blockchain is capable of keeping immutable records of all data generated and exchanged by IoT devices. Therefore, it can present a well-suited solution in the following aspects:
• IoT environments need a layer to facilitate the interoperability of heterogeneous IoT devices. Blockchain can provide a composite layer above the peer-to-peer network with standard access for every IoT device.
• IoT environments require a tier to support the traceability of data among these IoT devices. Blockchain works as an immutable distributed ledger with historical timestamps to ensure this feature for IoT devices.
• IoT environments are expected to provide security measures and improve trust. Blockchain supports both through smart contracts and digital signatures.
While the deployment of blockchain technology in IoT-based environments offers various advantages, it still poses overarching scalability concerns due to the vast amount of data generated and the enormous number of IoT devices. Traditional Blockchain platforms are limited in throughput by design. Throughput is determined by the number of transactions that can be appended and mined in the blockchain platform [77]. Well-known blockchain platforms offer different throughput levels, none of which is sufficient for IoT environments [76] [78]. For instance, Bitcoin can process only a limited number of transactions in a given period because its blocks are fixed in size and frequency, which causes a scalability issue. The Bitcoin platform has an even lower throughput than Ethereum, as well as confidentiality issues [8]. Ethereum, in turn, is regarded as having a low throughput when deployed in IoT environments [6] [7].
Researchers have identified the so-called scalability trilemma within the Blockchain environment [17], as depicted in Fig. 1. The concept, first coined by Vitalik Buterin, describes the difficulty of balancing three blockchain properties simultaneously: decentralization, security, and scalability [18]. The scalability trilemma means that only two of the three properties can be achieved at the same time. Furthermore, the scalability issue has implications for the cost of the blockchain database. Practically, all transactions must be stored within the chain, so the chain grows as more transactions are appended, and maintaining and managing it becomes more difficult over time. Currently, the blockchain sizes of Bitcoin and Ethereum are 354.419 GB and 870.37 GB, respectively [16] [17]. Other blockchain platforms have been designed for high throughput, such as IOTA, a commercial platform designed to be deployed in the IoT environment. However, it is regarded as having a long delay when handling a massive amount of data [9]. Hyperledger Fabric and Ripple are two blockchain platforms that achieve high throughput [10][11]. Nevertheless, they suffer from the same issue of limited scalability, especially in terms of validating the nodes [12]. The following sections explain many solutions that tackle blockchain scalability issues.
In summary, the contributions of our research are as follows:
• Contribution one (theoretical): establish a fundamental understanding of the major scalability solutions using Blockchain in the IoT domain.
• Contribution two (theoretical): devise an evaluation framework for assessing the effectiveness of the current scalability solutions.
• Contribution three (empirical): evaluate existing scalability solutions with a focus on their strengths.
The remainder of this paper is organized as follows. Section two reviews the Blockchain and IoT technologies. Section three compares various research scalability solutions that operate in different IoT layers. Section four proposes an evaluation framework and compares the Blockchain-based scalability solutions. Section five summarizes the key findings of our research.

A. Blockchain Technology
Blockchain, a distributed public ledger technology, was initially developed for cryptocurrencies such as Bitcoin. The concept was introduced by Nakamoto [4] in 2008 but did not receive much attention at first. With the emergence of IoT in the past few years, Blockchain has started gaining the attention of researchers as a P2P technology for distributed and decentralized computation and data sharing. By adopting cryptographic techniques, Blockchain can avert intrusions without a centralized control environment. Its security features, such as transactional privacy, data immutability, authorization and integrity, fault tolerance, and transparency, allow Blockchain to be utilized in areas beyond cryptocurrency.
Blockchain technology revolves around the idea that a single block, the fundamental component of Blockchain, stores certain types of information. Each block is linked to the previous block through a hash to form a chain, as depicted in Fig. 2. The integrity of each block is assured by a hash function, which is used to create a hash value of the block. The hash value is a digital fingerprint: even a minimal change to the block, such as flipping a single bit, produces a different fingerprint [52]. The hash value is what connects every block to its predecessor, since each block stores the hash value of the block before it. The system can easily validate integrity by running the hash function on every block and comparing the result with the block's recorded digital fingerprint.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 9, 2021, www.ijacsa.thesai.org

Blockchain technology is a decentralized ledger where each block is created and broadcast to the connected peers. Therefore, each peer is guaranteed to have an identical, most recent copy of the ledger, and forging a blockchain becomes very difficult in practice. A blockchain environment has various characteristics, including decentralization, tamper-proofing, trustlessness, and anonymity.
• Decentralization: Blockchain is built around the idea of a distributed ledger with no central entity that controls the network. The system is therefore robust against a single point of failure: if one node goes down, the system still functions properly.
• Tamper-proof: The only way to take over the network is by launching a theoretical 51% attack [51]. To change block content and complete validation faster than all other peers within the network, an attacker requires more than half of the computational power on the Blockchain.
• Trustless: The blockchain environment depends on complete transparency. Thus, parties on the chain can transact without having to trust one another.
• Anonymity: As mentioned above, there is no need for trust in the blockchain environment. Thus, parties on the chain can remain anonymous with no need to reveal their identity.
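The hash linking and integrity validation described above can be illustrated with a minimal Python sketch. The field names (`index`, `data`, `prev_hash`), the all-zero placeholder for the first block, and the use of SHA-256 over a JSON encoding are assumptions made for illustration; they are not taken from any particular platform.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Return the SHA-256 fingerprint of a block's contents."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to its predecessor through the predecessor's hash."""
    chain = []
    prev_hash = "0" * 64  # assumed placeholder predecessor for the first block
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every fingerprint and check every link to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if recomputed != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True
```

Changing the data of any block after the chain is built makes `is_valid` return `False`, because the recomputed fingerprint no longer matches the recorded one.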
Furthermore, Blockchain can be arranged into three categories based on the participants' respective environment [27]. The categories can be summarized as follows:
1) Public blockchains: A permissionless blockchain that runs on a public network in a decentralized and distributed fashion. The environment is open, and any node can participate without any authorization [50].
2) Private blockchains: A permissioned blockchain that runs on a private network within an organization that governs and regulates all transactions.
3) Consortium blockchains: Also a permissioned blockchain, but one that is initiated and controlled by a group of related entities. Nodes must register before participating and must then adhere to the rules and regulations.
Table I summarizes the key differences between the three blockchain categories.

B. Internet of Things (IoTs) Technology
Recently, the Internet of Things has delivered services across various domains, from small businesses and social media to smart houses, smart cities, and industries. IoT connects resource-constrained heterogeneous devices with a broad range of functionalities in human- and machine-centric communication networks. IoT has met the ever-evolving requirements of the above-mentioned sectors. However, the significant escalation in the number of such resource-constrained IoT devices and the massive amount of information they generate has become a hurdle to meeting efficiency and security requirements. The Internet of Things (IoT) is a network that connects different devices to receive and transmit data over the Internet. The data is generated by various smart applications running on smart devices and sensors known as IoT devices. An estimated 50 billion IoT devices will be connected to the Internet worldwide in 2023 [49]. In Information Technology, IoT is undoubtedly a significant development, connecting almost everything to the World Wide Web. Over the last few years, increasing data rates and advancements in IoT have given rise to various concerns, with scalability at the top of the list.
The IoT network consists of three layers, namely the perception, communication, and industrial layers, as shown in Fig. 3. These layers can be briefly described as follows:
1) Perception layer: This layer contains various IoT devices that differ in function, including sensors, controllers, smart meters, etc. The primary function of these devices is sensing and collecting data from the physical environment. However, they might also react to actions in the physical environment.
2) Communication layer: This layer contains several wireless/wired devices, such as IoT gateways, Wi-Fi access points, or small base stations. These devices deploy various communication protocols, including Bluetooth, Near Field Communication, etc. Their primary function is to transfer data from the perception layer to the industrial layer.
3) Industrial layer: The industrial layer incorporates manufacturing, airports, banks, supply chains, etc. The decisions in these industrial organizations are built on the data gathered from the perception layer.
Previously, the centralized approach was the mainstream solution for handling complex structures of connected heterogeneous IoT devices. It was based on a traditional client-server approach over the Internet. However, it suffered from various challenges and is judged inadequate to handle data at this massive scale [80].

III. ANALYTICAL COMPARISON OF SCALABILITY SOLUTIONS
Due to the unrivaled interest in deploying blockchain platforms in IoT systems, different approaches have been adopted to improve blockchain scalability. As mentioned, the challenge of enhancing blockchain scalability intensifies when more IoT devices/nodes are connected to each other and produce transactions at a higher rate. We identify and analyze the approaches published in recent literature to tackle the scalability issues. These approaches have been deployed in different layers of Blockchain and can thereby be classified as follows:
• Layer Zero "Approaches with the dissemination of Information": These proposed solutions focus on customizing the propagation protocol of information.
• Layer One "Approaches within the Blockchain": These proposed solutions focus on tackling the problem by changing the structure of blocks and consensus algorithms.
• Layer Two "Approaches off the Blockchain": These proposed solutions tackle the problem by executing some complex computational tasks off the Blockchain platform.

A. Layer Zero: Approaches with Propagation Protocol
Approaches dealing with the propagation protocol have recently been classified by some researchers as a possible solution for scalability issues within Blockchain. Parties exchange and broadcast blocks of data/transactions inefficiently within the blockchain network, causing a high confirmation time. Enhancing and optimizing data transmission can therefore improve throughput. Many studies have been published on layer zero; they are summarized in Table II.

B. Layer One: Approaches within the Blockchain
The solutions in this approach tackle the problem of scalability through different strategies, which can be listed as follows:
• Redesigning the structure of blocks.
• Implementing the DAG (Directed Acyclic Graph).
• Deploying sharding techniques.
• Applying different consensus algorithms.

Approach Name | How it Works | Advantages
BloXroute [70] | The design of the network is based on increasing the block size while decreasing the interval between blocks. |
Velocity [71] | The structure of the protocol ensures enhanced block propagation by deploying erasure codes. | (+) Increases throughput
Kadcast [72] | It is based on the Kademlia architecture, where it works similarly to the mechanism deployed for enhanced broadcasting with adjustable redundancy and overhead. | (+) Enables fast propagation (+) Enables secure transmission
Erlay [73] | The protocol improves the network connectivity while keeping the cost at a minimum level. | (+) Affordable cost (+) Private transmission

1) Redesigning the structure of blocks: The simplest approach to tackling the scalability concerns of Blockchain is redesigning the structure of blocks by increasing the block size. Practically, all transactions are appended within blocks in any blockchain platform. When more transactions are recorded within a block, the number of transactions per block interval, and hence the throughput, increases [13]. However, such a simple approach comes with other direct and indirect challenges. One of these challenges is the increased probability of hard forks in the blockchain platform, which would split the nodes within the Blockchain, as happened in Bitcoin [14].
Traditionally, the Blockchain platform requires each node to record the complete history of all transactions to become part of the network. An increase in block size means that each node must increase its storage capacity, making nodes more expensive to operate. Nodes that cannot meet such storage requirements would eventually be ruled out of the blockchain platform. As a consequence, a smaller number of nodes would take control of the Blockchain. This leads Blockchain to lose its decentralized nature, so end users must place more trust in the protocol [15].
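The trade-off between block size, throughput, and storage can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the figures used in the usage note (a 1,000,000-byte block, 250-byte transactions, a 600-second block interval) are assumed round numbers, not measurements from the platforms discussed here.

```python
def transactions_per_second(block_size_bytes, avg_tx_size_bytes, block_interval_seconds):
    """Upper bound on throughput when every block is completely full."""
    txs_per_block = block_size_bytes // avg_tx_size_bytes
    return txs_per_block / block_interval_seconds


def yearly_chain_growth_bytes(block_size_bytes, block_interval_seconds):
    """Storage each full node must add per year if every block is full."""
    seconds_per_year = 365 * 24 * 60 * 60
    return block_size_bytes * seconds_per_year / block_interval_seconds
```

Under these assumed figures, `transactions_per_second(1_000_000, 250, 600)` is roughly 6.7 transactions per second, and doubling the block size doubles both that bound and the yearly storage growth that each node has to absorb.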
Redesigning the block structure also includes other techniques such as block compression, which can enhance the throughput of the Blockchain platform by removing unessential and redundant data from a block [22]. Compact block relay was designed and deployed according to the block compression approach [22]. It is based on changing the data structure of the original Bitcoin blocks along with shortening the transaction header data. Txilm is a technique based on the same concept of block compression [22]. However, these kinds of techniques are prone to hash collisions.

2) Implementing Directed Acyclic Graph (DAG):
The blockchain structure records transactions in blocks that are arranged in a single chain. Due to this linear formation, blocks are created one at a time with no concurrent operations. Consequently, Blockchain has limited throughput and high latency. Allowing concurrent operations would enhance throughput, so a new blockchain structure built on a DAG (Directed Acyclic Graph) has been proposed [23]. The directed acyclic graph is a finite directed graph with no cycles, commonly used in computer science. The DAG-based Blockchain considers a block as a vertex in the DAG attached to other previous vertices. Moreover, the DAG-based Blockchain permits many vertices to be attached to a preceding vertex, which allows blocks to be created simultaneously. The IOTA foundation has designed its IoT-based Blockchain using this technique to address the scalability issues of Blockchain [28].
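As a rough illustration of the DAG structure described above, the following Python sketch stores each block as a vertex that references one or more earlier vertices. The class name `DagLedger`, the `genesis` vertex, and the truncated SHA-256 identifiers are hypothetical choices for this example; the sketch does not reproduce IOTA's actual data structures or tip-selection rules.

```python
import hashlib


class DagLedger:
    """Toy DAG ledger: every vertex may reference several earlier vertices."""

    def __init__(self):
        # vertex id -> list of parent ids; "genesis" is an assumed root vertex
        self.parents = {"genesis": []}

    def add_vertex(self, payload, parent_ids):
        """Attach a new vertex to existing vertices and return its id."""
        for parent in parent_ids:
            if parent not in self.parents:
                raise ValueError(f"unknown parent: {parent}")
        if not parent_ids:
            raise ValueError("a new vertex must reference at least one parent")
        material = payload + "|" + "|".join(sorted(parent_ids))
        vertex_id = hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
        self.parents[vertex_id] = list(parent_ids)
        return vertex_id

    def tips(self):
        """Vertices that no other vertex references yet."""
        referenced = {p for ps in self.parents.values() for p in ps}
        return sorted(v for v in self.parents if v not in referenced)
```

Because a vertex can only reference vertices that already exist, no cycle can form, and two vertices attached to the same parent coexist as parallel tips instead of competing for a single position in a chain.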

3) Deploying sharding techniques:
The sharding technique was first developed within the database management field as an attempt to optimize large databases. It is based on partitioning a database into several physical fragments, where each fragment stores its distinct subset of the data. Dividing a large data set across multiple servers permits the distributed management of operations on a single database, thus improving scalability [31]. Applied to Blockchain, it follows the concept of divide-and-conquer: the platform is divided into several smaller units called shards. Fig. 4 shows the concept of the sharding technique on the blockchain platform. A pool of transactions is processed in multiple shards, which reduces the load on each node and makes it possible for nodes to process a small number of transactions. Recently, several studies have been published that tackle the scalability issues using the sharding technique to improve transaction throughput.
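The divide-and-conquer idea behind sharding can be sketched as a deterministic assignment of transactions to shards. The example below is a simplified illustration: the `sender` field and the rule of hashing the sender identifier are assumptions, and real sharded platforms must additionally handle cross-shard transactions and validator assignment, which are omitted here.

```python
import hashlib
from collections import defaultdict


def shard_for(account_id, num_shards):
    """Map an account identifier to a shard index in a deterministic way."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards):
    """Split a pool of transactions into per-shard lists keyed by shard index."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return dict(shards)
```

Each shard then processes only its own list, so the load per node shrinks roughly in proportion to the number of shards when senders are spread evenly.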

4) Applying different consensus algorithms:
Various consensus algorithms have been used in different types of Blockchains. These consensus algorithms make Blockchain more resilient to malicious participants and message delays. Several algorithms have been proposed in the research literature to solve security issues. However, each of them has an overhead that affects blockchain throughput and scalability. Therefore, some optimizations are required to enhance the scalability of Blockchain. The essential consensus algorithms are as follows.
Proof of Work (PoW): To add blocks to the Blockchain, each node must perform some exclusive work known as Proof-of-Work (PoW) [36]. In Bitcoin, each node must compute a hash value less than a specific number determined by the difficulty level set by the Blockchain. The difficulty level is adjusted periodically by the Bitcoin protocol so that producing a single block takes about ten minutes on average [36]. The procedure of finding a solution to the PoW puzzle (i.e., finding a winning hash value) is also called mining. Speed is critical in this operation, since the mining reward is given to the first node that computes a winning hash. Furthermore, that node gets to include its proposed block in the Blockchain. Once a node finds a winning hash, it broadcasts it to the others, and the other nodes then confirm that the proposed hash value is correct and valid [37]. Since several nodes are searching for the winning hash simultaneously, several nodes may find one at the same time. Each winning node then includes its block in the Blockchain and announces it over the peer-to-peer network. In such a scenario, temporary forks arise in the Blockchain because some nodes add their block to the first branch, others to the second branch, and so on. To fix this problem, the protocol chooses the longest branch and discards the other branches [36]. Due to these challenges in the original PoW algorithm, many optimization techniques have been proposed to enhance its scalability [38][39] [40].
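The mining procedure can be sketched as a search for a nonce whose hash falls below a target. This is a simplified illustration and not Bitcoin's actual procedure: it applies a single SHA-256 to a text string, whereas Bitcoin hashes a binary block header twice, and the parameter `difficulty_bits` (number of leading zero bits required) and the `max_nonce` cut-off are assumptions chosen for the example.

```python
import hashlib


def _digest_value(block_data, nonce):
    """Interpret the SHA-256 digest of the data and nonce as an integer."""
    digest = hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def mine(block_data, difficulty_bits, max_nonce=2_000_000):
    """Try nonces in order until the hash value is below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        if _digest_value(block_data, nonce) < target:
            return nonce
    return None  # no winning hash found within the assumed search limit


def verify(block_data, nonce, difficulty_bits):
    """Other nodes confirm a proposed solution with a single hash computation."""
    target = 1 << (256 - difficulty_bits)
    return _digest_value(block_data, nonce) < target
```

The asymmetry is the point of the design: `mine` needs on the order of 2^difficulty_bits attempts on average, while `verify` needs one hash, which is why confirming a winning block is cheap for the rest of the network.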
Proof of Stake (PoS): It is deployed to avoid the weaknesses of the PoW algorithm. It replaces the mining process with an alternative idea in which users own a virtual currency in the blockchain platform. Practically, users can buy any amount of cryptocurrency and then use it as a stake to obtain proportional block creation chances in the blockchain platform by working as a validator. A validator cannot predict its turn ahead of time since the algorithm randomly chooses the validator node that creates the block. In its original form, the algorithm has a problem called Nothing-at-Stake, where the algorithm does not provide incentives for nodes to vote for the correct block. Nodes might vote for blocks supporting several forks and branches to maximize their chances of winning a reward, as doing so does not consume any of their resources [36]. Another problem is that PoS assumes that nodes holding a larger amount of currency are unlikely to attack the blockchain [37]. Therefore, several alternative solutions have been proposed. The approach in [41] deploys randomization techniques to determine the next validator, using a mechanism that looks for the lowest hash value in combination with the size of the stake. Peercoin [42] bases its selection on coin age, where older coins have a greater chance of mining the next block. Meanwhile, Ethereum is trying to switch from Ethash [43] to Casper [44].
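The stake-proportional random selection at the core of PoS can be sketched in a few lines. This is a generic illustration under assumed names (`stakes`, `select_validator`); it does not model Peercoin's coin-age rule, the scheme in [41], or Casper, and a real protocol would derive its randomness from the chain state and not from a local generator.

```python
import random


def select_validator(stakes, rng):
    """Pick one validator at random, with probability proportional to its stake."""
    validators = sorted(stakes)  # fixed order so results are reproducible
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]
```

For example, with `stakes = {"a": 90, "b": 10}` validator `a` is expected to be chosen about nine times as often as `b`, yet neither can tell in advance which round it will win.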

C. Layer Two: Approaches off the Blockchain
The proposed solutions in this approach focus on tackling scalability by executing some complicated computational work outside the Blockchain platform. These solutions apply different strategies, including payment channel, sidechain, off-chain computation, and cross-chain techniques. Below, we provide an analysis of each approach.

1) Payment channels:
The payment channel strategy is based on creating a temporary off-chain channel where some transactions can be executed off-chain, so as to reduce the volume on the main network and increase the transaction throughput of the whole Blockchain. Example approaches that employ payment channel techniques are described in Table III. The use of a payment channel includes the following steps:
• Trading between two parties (recorded off the chain).
• Closing the channel, where the number of tokens of both parties is recorded on the main chain.
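The channel steps listed above can be sketched as a small state object. The class `PaymentChannel` and its methods are hypothetical names for this illustration. The opening step, in which both parties lock deposits on the main chain, is an assumption added so that the example is self-contained, and signatures and dispute handling are left out entirely.

```python
class PaymentChannel:
    """Toy two-party payment channel (illustrative only, no signatures)."""

    def __init__(self, deposit_a, deposit_b):
        # Assumed opening step: both deposits are locked on the main chain.
        self.balances = {"A": deposit_a, "B": deposit_b}
        self.off_chain_updates = 0
        self.closed = False

    def pay(self, sender, receiver, amount):
        """Trade off the chain: only the local balance sheet changes."""
        if self.closed:
            raise RuntimeError("channel is already closed")
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid or unaffordable payment")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.off_chain_updates += 1

    def close(self):
        """Close the channel: the final balances are what the main chain records."""
        self.closed = True
        return dict(self.balances)
```

However many payments are exchanged through `pay`, only the result of `close` reaches the main chain, which is where the reduction in main-network volume comes from.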
2) Sidechain techniques:
The sidechain technique was first proposed in Pegged Sidechains [61]. Generally, it allows the assets in a specific blockchain to be moved between various sub-blockchains while guaranteeing that the assets remain secure. Several key sidechain algorithms are described in Table IV.

TABLE III. COMPARISON OF SCALABILITY SOLUTIONS USING PAYMENT CHANNEL TECHNIQUES

Approach Name | How it Works | Advantages
Lightning Network [45] | Two parties of the Blockchain establish their own off-chain private trading channel. The channel is dedicated to several low-latency transactions. | (+) provides private communication
Raiden Network | The technique is payment-based. The Raiden Network is deployed on the Ethereum network with support for all ERC20 tokens [47]. | (+) enables secure communication
 | The protocol permits a parent chain to create smaller copies as child chains. The created copy of a parent chain is designed and developed according to a specified use case. The parent chain delegates its work to child chains. | (+) improves the transaction throughput (+) delegates work to child chains
Pegged Side Chain [48] | The approach is based on a two-way peg, transferring the assets from the main chain to a child chain. It ensures that these assets are securely sent from the parent to a child by locking them until the pegged side chain obtains a Simplified Payment Verification (SPV) proof. A confirmation period is enforced for security reasons. The newly transferred assets are halted on the sidechain to avoid double-spending issues. The same logic is applied when transferring the assets back to the main chain. | (+) provides secure communication
 | The network is based on a data architecture named Merkleized Interval Tree. It is formed of a multi-layered tree, which is deployed on NOCUST. It allows the parties' balances to be saved in private non-crossing interval spaces. Practically, all balances are verified against the amount registered in the smart contract on the main network. | (+) ensures the correctness of computations

3) Off-Chain computation:
In Ethereum, miners must execute all contracts to validate their states. The operation is known to be costly and time-consuming. Therefore, several techniques move such computation off the chain to help build a scalable platform. Table V lists example off-chain computation techniques.

TABLE V. COMPARISON OF SCALABILITY SOLUTIONS USING OFF-CHAIN TECHNIQUES

Approach Name | How it Works | Advantages
Truebit [63] | It is designed around outsourcing computations to trusted third parties known as solvers and challengers. Tokens are deposited to the smart contract by the solver. The challenger verifies the work done by the solver and gets compensation for its work. |
Arbitrum [64] | Enables nodes to deploy smart contracts as virtual machines that include all rules of a contract. It has four types of roles. Verifier: it acts as a global entity to validate transactions and publish accepted transactions. Key: it is a participant entity that can own currency and propose transactions. Virtual Machine: it is a virtual participant in the protocol that can own currency and exchange it. Manager: it manages the virtual machine and ensures its correctness. | (+) enhances blockchain scalability

4) Cross-Chain techniques:
Cross-chain techniques are considered potential solutions for improving scalability in Blockchain. Generally, these techniques are based on the interoperability among several separate chains, and the interconnection between these chains can enhance scalability. Fig. 6 depicts an example of the main cross-chain techniques. There are two main cross-chain algorithms, which are listed in Table VI.

[79]. Blockchain is considered a network that can be measured by standard performance metrics like throughput and latency [77].
For a Blockchain, throughput is the number of valid transactions committed to the Blockchain per second [77]. Therefore, we can represent the throughput as follows:

$$\text{Transaction Throughput} = \frac{\text{Number of Committed Transactions}}{\text{Time in Seconds}} \quad (1)$$

Latency refers to transaction latency, which is defined as the time the Blockchain takes to commit a transaction [77]. Therefore, we can represent the latency as follows:

$$\text{Transaction Latency} = \text{Commit Time} - \text{Submission Time} \quad (2)$$

Both performance metrics, throughput and latency, are closely related to the block size. Various blockchain networks suffer from transactions having to wait to be committed within a block due to the fixed size of blocks [78]. Therefore, block size is a critical parameter that must be included in blockchain network evaluations. Furthermore, consensus algorithms and applied techniques are closely related to our scalability analysis, so we added them to our evaluation, as depicted in Fig. 7.
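Both metrics are straightforward to compute from a transaction log. The sketch below is illustrative: the record fields `submitted_at` and `committed_at` (timestamps in seconds) are hypothetical, and latency is taken to be the time between submitting a transaction and its commit, which is an assumption consistent with the definition given above.

```python
def transaction_throughput(committed_count, elapsed_seconds):
    """Committed transactions per second over an observation window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return committed_count / elapsed_seconds


def transaction_latencies(records):
    """Per-transaction latency for every transaction that was committed."""
    return [
        r["committed_at"] - r["submitted_at"]
        for r in records
        if r["committed_at"] is not None
    ]
```

Transactions that were never committed are excluded from both figures, so a platform that silently drops transactions under load should be evaluated with the drop rate alongside these two metrics.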

A. Comparison of the Scalability Blockchain-based Architectures
As mentioned above, this paper's main contribution is to analyze each Blockchain architecture and the main characteristics affecting IoT scalability. Our scalability evaluation framework incorporates five dimensions: 1) throughput, 2) storage (block size), 3) latency, 4) deployed techniques, and 5) consensus algorithm. We base our comparative evaluation on these criteria. Table VII details the findings of the comparison.

B. Summary of Scalability Issues
Our detailed analysis examined state-of-the-art architectures aimed at resolving the scalability challenges of blockchain solutions that could enhance the IoT domain. The significant challenges are summarized below.
• Challenge One: scalability is closely related to the block size. If the block size exceeds the network capacity, the block will not be attached to the chain. As a consequence, some solutions strive to increase the block size.
• Challenge Two: although increasing the block size enhances performance, it may increase the probability of blockchain forks. Therefore, other solutions enforce mechanisms to prevent the occurrence of forks.
• Challenge Six: some scalability solutions are deployed outside the blockchain environment by outsourcing computationally intensive operations to a third party so that the main chain can execute other light operations simultaneously. Accomplishing parallel execution of transactions enhances the prospect of scalability. However, the appeal of blockchain comes from the fact that we do not have to rely on third parties. By outsourcing the operations, we surrender this advantage and restrict the environment. Furthermore, concerns about the third party's trustworthiness, security, and privacy need to be resolved.

Enormous efforts have been made towards solving the scalability issues within Blockchain in order to adapt this promising solution to connecting heterogeneous IoT devices. In this paper, various scalability solutions were presented and compared according to their layer within the blockchain network. Next, the paper evaluated these solutions according to standard performance indicators such as throughput, latency, and storage. The paper attempted to summarize the existing blockchain solutions at different layers so as to serve as a roadmap for further improvements by other researchers. In the future, we plan to extend our comparative analysis to investigate other issues impacting blockchain-based networks, particularly those associated with security aspects.