An In-depth Analysis of Uneven Clustering Techniques in Wireless Sensor Networks

Abstract: The low cost and convenience of Wireless Sensor Networks (WSNs) have made them popular in many sectors over the last decade. WSNs are now widely used as a result of recent advances in low-power, energy-efficient communication. Sensor nodes in WSNs are typically battery-powered. The finite energy stored in batteries and the difficulty of replacing them have made energy efficiency a critical concern for WSNs. Clustering and data aggregation are the most efficient methods for addressing this concern. This paper comprehensively reviews and compares several uneven clustering algorithms. The methods are described in terms of their goals, attributes, categories, advantages, and disadvantages. Probabilistic clustering is used when simplicity and speed are needed. The study compares all these types of protocols based on their clustering properties, cluster head (CH) properties, and the type of clustering process; current research gaps and effective techniques are also addressed.


I. INTRODUCTION
Wireless Sensor Networks (WSNs) can be distinguished from other wireless ad-hoc networks by their unique characteristics. WSN nodes are low-power devices with limited memory and processing power [1,2]. Recent advances in CMOS and nanotechnology have made sensor nodes smaller, cheaper, and more computationally efficient [3]. Each sensor node integrates a microprocessor, a battery, a transceiver, and sensors. Numerous sensor nodes can be connected in a network without any predefined infrastructure, and the nodes can configure themselves [4]. Sensor nodes in a WSN detect physical variables such as humidity, temperature, and air pressure. They can also detect acoustic signals, infrared radiation, and vehicle movements [5]. The micro-sized processing unit processes the values sensed by these sensors, and the communication unit forwards them to the base station through one-hop or multi-hop transmission [6].
Many applications use WSNs for monitoring and tracking, including military surveillance, healthcare, industrial automation, agriculture, inventory control, and disaster management. WSNs provide connectivity for remote regions, such as deserts, forests, deep seas, battlefields, and complex industrial and research facilities [7]. Sensor nodes are deployed in remote regions with limited access to electricity for recharging and few options for replacement. Transmitting data from the source nodes to the base station costs more energy than sensing the data [8]. Therefore, an energy-efficient communication strategy for transmitting data from the sensor nodes to the base station is crucial to prolonging the lifetime of the WSN. A well-known method for addressing this problem is clustering [9].
Using the clustering strategy, nodes are grouped into homogeneous or heterogeneous clusters. In general, clustering involves segmenting a given area into discrete sectors and designating a single node in each sector as the Cluster Head (CH). The CH is essential to an energy-efficient WSN model. To enhance network performance, the CH role may in practice be reassigned over successive iterations [62]. Furthermore, CHs act as specialized nodes that make all significant decisions on behalf of the sensor nodes (SNs) in their clusters. Clustering can be categorized as either single-hop or multi-hop, and each approach has drawbacks and benefits. Single-hop transmission incurs energy losses because of its greater transmission range, whereas the multi-hop technique can improve energy efficiency and address the scalability issue of WSNs [61,63]. There are two types of sensors in these clusters: normal nodes and CH nodes [10]. Fig. 1 depicts the clustering architecture of WSNs. Communication in WSNs is divided into two types: intra-cluster and inter-cluster. Intra-cluster communication occurs within a cluster, whereas inter-cluster communication occurs between clusters. The cluster member nodes sense real-world parameters and transmit the measured values to the CH. The CH combines the received values to eliminate the redundancy that arises when several nodes sense the same value. The final data is transmitted to the base station either directly or through intermediary CHs [11]. Cluster members can send data only to CHs, and only CHs can send data to the base station, which eliminates the need to transmit redundant data. Clustered WSNs have many advantages over conventional WSNs, including better utilization of bandwidth, less overhead, enhanced link connectivity, efficient topology balancing, stability in network topology, and reduced routing table size [12].
In the last few decades, a variety of even and uneven clustering methods have been presented for enhancing energy efficiency in WSNs. The motivation behind this work is the lack of a thorough review of clustering protocols. The main contribution of this study is a critical review of uneven clustering, with a detailed analysis of each method. Furthermore, this study compares the methods based on cluster properties, CH properties, and clustering processes. This paper is organized as follows. Section II discusses the characteristics and objectives of uneven clustering. The classification and explanation of clustering algorithms are presented in Section III. Section IV categorizes and tabulates algorithms. The paper concludes in Section V.
www.ijacsa.thesai.org

II. BACKGROUND
In this section, an overview of uneven clustering is presented, as well as a detailed discussion of its objective and characteristics.

A. WSN Applications
WSNs are used in a rapidly expanding range of applications, some of which are still in their infancy. WSN applications fall into four general categories according to their use: urban, industrial, environmental, and health. Each category contains various subgroups. The categories are described in more detail in this section, with examples of their characteristics, advantages, and disadvantages.
 Urban applications: WSNs offer a wide range of sensing capabilities, allowing them to gather extensive information about any target area, whether outdoors or inside buildings. In an urban environment, WSNs can measure the spatial and temporal characteristics of any phenomenon. Monitoring various parameters is crucial for ensuring citizens' well-being in large cities [13]. WSNs provide authorities with the real-time data needed for optimal city operation [17].
 Environmental applications: WSNs provide environmental information to decision-makers. Environmental monitoring encompasses both natural and man-made environments, indoors and outdoors. There are various methods of monitoring, such as sensor-based monitoring on the ground, field sampling analysis in laboratories, and aerial and satellite remote sensing [18]. Diverse challenges remain for environmental monitoring nodes. Environment-specific WSNs are typically located in remote areas disconnected from the power grid. The primary challenge lies in selecting proper topologies and operating strategies aiming to ensure the nodes are as energy efficient as possible [19].
 Health applications: WSNs are widely used in the healthcare sector because of advantages such as short delay, good performance, and low cost. A healthcare WSN transmits patient health information to health professionals over the Internet in real time, so patient confidentiality and data integrity are crucial. Physiological parameters are detected by placing sensor nodes on patients. This allows the monitoring center to track the patient's vital signs remotely in real time [20]. The monitoring center receives the information and processes it promptly. WSNs allow the collection of human health data, which is helpful when researching human diseases and conditions. WSNs also have a wide range of applications in drug management, drug development, blood management, etc. Future telemedicine monitoring systems can take advantage of WSNs more conveniently and cost-effectively [21].

B. Uneven Clustering
Clustering can be even or uneven. In even clustering, all clusters have the same size, whereas in uneven clustering the sizes differ. With even clusters, CHs near the base station relay more traffic and die prematurely, which partitions the network. This problem is commonly known as the energy hot-spot issue [22]. A disparity in power consumption results in hot spots in WSNs, decreasing the network's life expectancy [23]. As a solution to the hot-spot problem, uneven clustering techniques are used to balance the load among the CHs. In uneven clustering, clusters near the base station are smaller than those far from it. In this way, uneven clustering solves the load-balancing or hot-spot problem in clustering [24]. Fig. 2 illustrates how uneven clusters are formed based on competition radius calculations. Eq. (1) gives the formula for the competition radius of nodes that are homogeneous in type and size, where $c$ is a weighting factor between 0 and 1, $R_c^0$ is the maximum competition radius, $d(s_i, D_s)$ denotes the distance between node $s_i$ and the base station, and $d_{min}$ is the minimum distance between a node and the base station.
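Eq. (1) itself is not reproduced in this copy of the text. A form consistent with the symbols defined above, and with the competition radius commonly used in the unequal clustering literature, is sketched below. The symbol $d_{max}$ (the maximum distance between a node and the base station) is an assumption, since it is not defined in the surrounding text.

```latex
R_c(s_i) = \left( 1 - c \, \frac{d_{max} - d(s_i, D_s)}{d_{max} - d_{min}} \right) R_c^0 \tag{1}
```

Under this assumed form, the node closest to the base station ($d(s_i, D_s) = d_{min}$) receives the smallest radius, $(1 - c) R_c^0$, and the farthest node receives the full radius $R_c^0$, which matches the description of smaller clusters near the base station.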
In a network whose nodes differ in energy level, i.e., heterogeneous nodes, the competition radius is given by Eq. (2), where $\alpha$ and $\beta$ are weighting factors between 0 and 1, $E_{max}$ is the maximum residual energy, $R_{max}$ is the maximum competition radius, and $E_r$ is the remaining energy of node $s_i$. Intra-cluster communication does not require any aggregation of data, but inter-cluster communication requires the CHs to aggregate data before transmitting it to the next node or directly to the base station. The parameter $E_{relay}$ represents the energy consumed in transferring data from node $s_i$ to node $s_j$, and its calculation is given in Eq. (3).
Accordingly, node $s_i$ selects as its next hop the node $s_j$ in the candidate route set that has the highest remaining energy and the minimum $E_{relay}$.
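Eqs. (2) and (3) are not reproduced in this copy, so the following Python sketch rests on assumed forms that agree with the symbol definitions above: a radius $(1 - \alpha \frac{d_{max} - d}{d_{max} - d_{min}} - \beta (1 - E_r / E_{max})) R_{max}$ and a relay cost $d^2(s_i, s_j) + d^2(s_j, D_s)$. The function names, the default weights, and the rule of keeping the `k` cheapest candidates before choosing the one with the most energy are illustrative assumptions, not the exact procedure of any reviewed protocol.

```python
import math


def competition_radius(d, e_r, d_min, d_max, e_max, r_max, alpha=0.3, beta=0.3):
    """Assumed form of Eq. (2): the radius shrinks near the base station
    and for nodes with little residual energy."""
    dist_term = alpha * (d_max - d) / (d_max - d_min)
    energy_term = beta * (1.0 - e_r / e_max)
    return (1.0 - dist_term - energy_term) * r_max


def e_relay(s_i, s_j, bs):
    """Assumed form of Eq. (3): squared-distance cost of s_i -> s_j -> base station."""
    return math.dist(s_i, s_j) ** 2 + math.dist(s_j, bs) ** 2


def pick_relay(s_i, candidates, bs, k=2):
    """candidates: list of (position, residual_energy) pairs.
    Keep the k candidates with the lowest relay cost, then take the one
    with the most residual energy (an illustrative tie-breaking rule)."""
    cheapest = sorted(candidates, key=lambda c: e_relay(s_i, c[0], bs))[:k]
    return max(cheapest, key=lambda c: c[1])
```

A node at the maximum distance with full energy keeps the full radius `r_max`, while a node at `d_min` with full energy gets `(1 - alpha) * r_max`.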

C. Objective
In uneven clustering, the main objective is to avoid hotspots, which arise from unequal energy consumption by the nodes of WSNs, thereby reducing their lifespan. The remaining objectives of uneven clustering are the same as those of even clustering. Additionally, uneven clustering aims to achieve the following.
 Load balancing: Uneven clustering provides load balancing by dividing networks into clusters of variable sizes, with smaller clusters closer to the base station and larger clusters located farther from the base station. Energy can be saved in inter-cluster data transmission, extending the network life [25,26].
 Network lifetime: The primary purpose of uneven clustering is to enhance the network's lifetime, which is essential for real-time applications that must conserve energy because of the limited amount available to the nodes of a WSN. A good routing algorithm and the rotation of CHs can be used to conserve energy with uneven clustering [27,28].
 Data aggregation: To reduce the redundancy caused by several member nodes sensing the same value, CHs aggregate the received values. The final data is sent to the base station directly or through intermediate CHs, which minimizes the number of messages sent and the load on the network as a whole. Data fusion, or data aggregation, refers to the process of combining all incoming packets. Data fusion enhances data and reduces unnecessary noise [29].
 Fault-tolerant: Sensor nodes are positioned in remote regions, such as deserts, forests, deep seas, battlegrounds, and complex industrial and research setups, where they may be damaged or may malfunction. Applications in which data loss may cause casualties require fault-tolerant nodes. Self-organized WSNs solve this problem by re-clustering, but re-clustering not only interrupts the current process but also increases overhead [24,30].
 Scalability: Hundreds to thousands of sensor nodes may be deployed, depending on application requirements. The routing method should be designed to handle these vast numbers of sensor nodes. When cluster data is exchanged through cluster heads, the transmitting node should know the receiving CH. Hierarchical clustering allows networks to be partitioned into layers, and each layer can be partitioned into clusters, which facilitates scalability and minimizes routing table size [31].
 Stable network topology: In any type of clustering, sensor nodes are elected as CHs, and CHs are subject to changes in topology at the cluster level. CHs hold all the information about their member nodes, including their location, energy level, and ID. Hierarchical clustering is effective for managing the topology of the network. Changes in cluster membership should be reported to the base station, as re-clustering is required to maintain the effectiveness of the topology [32].
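The data aggregation objective listed above can be illustrated with a minimal Python sketch. It assumes that each cluster member reports one numeric reading per round and that the CH forwards a single summary packet; the function name `aggregate` and the packet fields are hypothetical and are not taken from any reviewed protocol.

```python
from statistics import mean


def aggregate(readings):
    """readings: list of (node_id, value) pairs received by a CH in one round.
    Returns one summary packet instead of forwarding every reading."""
    values = [value for _, value in readings]
    distinct = set(values)  # repeated values carry no new information
    return {
        "members": len(values),
        "distinct_values": len(distinct),
        "mean": mean(values),
    }
```

With three members reporting 20.0, 20.0, and 22.0, the CH sends one packet rather than three, which is the message reduction the text describes.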

D. Uneven Clustering Characteristics
1) Clusters properties: Cluster properties include cluster size, cluster count, and types of communication within and between clusters.
 Size: Clustering can be even or uneven. Even clustering results in clusters of the same size. In uneven clustering, however, the clusters surrounding the base station are smaller and the others are larger, in order to balance the load. Hence, energy can be conserved in inter-cluster data transfer, thereby extending network life [33].
 Count: Depending on the application, the number of clusters may be fixed or variable. In some applications, CHs make up 5% of the nodes, while in others, CHs are selected by random elections [34].
 Inter- and intra-cluster communication: Within WSNs, there are two types of communication: inter-cluster and intra-cluster. Intra-cluster communication occurs within a cluster, between members and the CH. Inter-cluster communication takes place between CHs [35].
2) Cluster heads properties: By single or multi-hop communication, CHs collect data from members, aggregate the collected data, and send the aggregated data to the base station.
3) Method of clustering properties: The clustering method has four basic characteristics, which are as follows:
 Objective: Clustering serves several objectives, such as energy conservation, fault tolerance, load balancing, and enhancing the lifespan of WSNs [36].
 Methods: Clustering or grouping of nodes can be centralized or distributed. A centralized approach controls cluster formation, selection of CHs, etc., whereas a distributed approach does not have central control [37].
 Nature: Clustering can take a reactive, proactive, or hybrid approach. In the proactive approach, data is transmitted continuously. In the reactive approach, data is dispatched when the sensed value reaches a predefined threshold. The hybrid approach combines the proactive and reactive approaches according to the requirements of specific applications [38].
 The base of selecting CH: CHs are selected according to one of three strategies: attribute-based, probabilistic, and preset. Probabilistic strategies are those in which CHs are chosen randomly without predetermined protocols. Attribute-based strategies determine CHs depending on various indicators such as remaining energy, distance from the base station, the density of nodes, etc. In the preset method, the CHs are elected before the nodes or clusters are deployed [39].
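The three reporting natures described above (proactive, reactive, and hybrid) can be expressed as a small decision function. This is a sketch under stated assumptions: the hybrid rule shown here, which sends on a threshold crossing or on a periodic round, is one possible combination, and the parameters `round_no` and `period` are hypothetical.

```python
def should_transmit(mode, value, threshold, round_no=0, period=10):
    """Decide whether a node reports its sensed value in this round."""
    if mode == "proactive":
        return True  # continuous transmission
    if mode == "reactive":
        return value >= threshold  # send only when the threshold is reached
    if mode == "hybrid":
        # Assumed combination: threshold-triggered plus periodic reports.
        return value >= threshold or round_no % period == 0
    raise ValueError(f"unknown mode: {mode}")
```

For example, a reactive node sensing 30 against a threshold of 50 stays silent, while a proactive node reports the same value.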

III. REVIEW OF UNEVEN CLUSTERING ALGORITHMS
Uneven clustering is intended to prevent the hot-spot problem, which arises from unequal energy consumption among the nodes of a WSN and reduces the network's lifespan. Uneven clustering otherwise accomplishes the same objectives as even clustering. Uneven clustering algorithms fall into three categories: probabilistic, deterministic, and preset. Additionally, clustering protocols are classified as static, dynamic, or hybrid. In static clustering, clusters remain the same for the lifetime of the network once they are formed. The benefit of this approach is a decrease in clustering overhead. However, the static method might not work properly, since some nodes might run out of energy and shut down before others do. Dynamic clustering, in contrast, carries out fresh clustering at each round. Its drawback is expensive overhead, since re-clustering generates extra traffic that puts strain on the nodes. Lastly, hybrid clustering decreases communication overhead while increasing energy efficiency and network longevity [64]. The following sections review the existing uneven clustering methods.

A. Preset Uneven Clustering Approaches
Preset clustering algorithms are used to design WSNs in which the locations of CHs are known in advance. Although WSN topologies change due to node or link failures, preset clustering algorithms ignore these conditions. Therefore, real-time applications cannot be implemented with these algorithms. The approach proposed by Soro and Heinzelman [40] organizes the network hierarchically with Unequal Clustering Sizes (UCS), distributing the energy uniformly among the CHs and extending the lifespan of the network. The positions of the CHs are determined in advance and are arranged in circles around the base station. Each cluster consists of the sensor nodes in the Voronoi region around a CH, and the clusters form layers around the base station, each with a particular number of clusters. The Voronoi regions are pie-shaped areas. First-layer clusters are equal in size, forming a symmetrical circular arrangement. However, the clusters in the second layer differ in shape and size from those in the first layer. In every layer, the radius can be adjusted to vary the region that the clusters cover, thereby changing the number of nodes within each cluster. According to experimental results, UCS is approximately 10%-30% more efficient than the even-sized cluster scheme, depending on the aggregation efficiency of the CHs. UCS can work better for networks that collect a lot of data. UCS achieves a long lifespan for both heterogeneous and homogeneous networks with static clusters. UCS also achieved more even energy dissipation in uniform-type networks. A comparison of preset-type uneven clustering methods is provided in Table I.

B. Deterministic Uneven Clustering Approaches
Deterministic clustering algorithms control clusters and select CHs efficiently. In these algorithms, the CHs are selected by considering several factors, including available energy, the proximity of the base station, the density of nodes, etc. These criteria change as neighboring nodes exchange messages. Deterministic clustering algorithms can further be divided into three groups, compound-based, heuristic-based, and fuzzy-based.

1) Compound-based approaches:
In compound-based unequal clustering, various constructs, such as connected graphs and the Sierpinski triangle, are used to form the clusters. Numerous deterministic compound-based approaches have been suggested to minimize the energy consumption of clustering. As a solution to the hot-spot issue in WSNs, Guiloufi and Nasri [41] proposed the Energy Degree Distance Unequal Clustering Algorithm (EDDUCA). The CHs in EDDUCA are chosen based on reserved energy, node density, and the distance between nodes and cluster centers. In addition, the size of clusters made by EDDUCA is the same regardless of their distance from each other and the base station. EDDUCA applies the Sierpinski triangle procedure to separate networks into uneven clusters based on their size. In the Sierpinski triangle procedure, n nodes are homogeneously distributed in a square zone. The two diagonals of the zone are drawn, dividing it into four triangles. Smaller triangles are then formed by linking the midpoints of the sides of each triangle. The experimental results showed that EDDUCA significantly reduced energy consumption and balanced it among clusters and among sensor nodes, thereby extending the longevity of wireless sensor networks.
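The geometric step of the Sierpinski triangle procedure described above can be sketched in a few lines of Python. The sketch only covers the partition itself (two diagonals, then midpoint linking); how EDDUCA maps the resulting triangles to uneven clusters is not detailed in the text, and the function names are hypothetical.

```python
def square_triangles(side):
    """Split a side x side square into four triangles using its two diagonals."""
    a, b, c, d = (0.0, 0.0), (side, 0.0), (side, side), (0.0, side)
    o = (side / 2, side / 2)  # the diagonals cross at the center
    return [(a, b, o), (b, c, o), (c, d, o), (d, a, o)]


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def subdivide(tri):
    """Link the midpoints of the three sides, giving four smaller triangles."""
    p, q, r = tri
    pq, qr, rp = midpoint(p, q), midpoint(q, r), midpoint(r, p)
    return [(p, pq, rp), (pq, q, qr), (rp, qr, r), (pq, qr, rp)]


def area(tri):
    """Triangle area from the cross product of two edge vectors."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2
```

Applying `subdivide` once to each of the four triangles of a 100 m square yields 16 triangles of equal area; repeating the step on selected triangles only would produce cells of different sizes.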
Xia and Zhang [42] developed the Unequal Clustering Connected Graph Routing Algorithm (UCCGRA) to balance the load among CHs and reduce the energy consumed in inter-cluster communication. UCCGRA is based on uneven clustering and connected graph theory. First, CHs are chosen by a vote to balance energy consumption. Because clusters near the base station are smaller, their CHs, which carry a heavier relay load than CHs located far from the base station, can preserve some energy for inter-cluster communication. A multi-hop inter-cluster routing scheme is then introduced based on the connected graph of CHs and the base station, which allows multiple paths from the CHs to the base station. The results show that UCCGRA enhances the reliability and flexibility of data communication. It also prolongs the lifespan of a network by evenly distributing the CHs, balancing the scale of clusters, and reducing and balancing energy consumption among CHs.
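Multi-hop routing over a connected graph of CHs can be illustrated with a generic shortest-path search. The sketch below is not UCCGRA's actual routing rule, which is not given in the text; it assumes that two nodes are linked when they lie within a hypothetical `radio_range`, that each link costs the squared distance, and it uses Dijkstra's algorithm to find the cheapest path.

```python
import heapq
import math


def shortest_route(positions, source, target, radio_range):
    """positions: {node_id: (x, y)}, including the base station.
    Returns the cheapest hop sequence from source to target, or None."""
    cost_to = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, pos_v in positions.items():
            if v in done:
                continue
            d = math.dist(positions[u], pos_v)
            if d > radio_range:
                continue  # no direct link between u and v
            new_cost = cost + d * d  # assumed squared-distance link cost
            if new_cost < cost_to.get(v, math.inf):
                cost_to[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in cost_to:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]
```

With a squared-distance cost, several short hops are cheaper than one long hop, which is why the search prefers relaying through intermediate CHs even when the base station is directly reachable.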
Guo, Zhang [43] proposed PEG-ant, which is based on the PEGASIS protocol and uses an improved ant colony optimization (ACO) method instead of a greedy algorithm. Each neighbor of the current node is considered a candidate, and residual energy, energy consumed, and pheromone quantity are the factors considered when selecting the next node. In comparison to the original PEGASIS, PEG-ant achieved global optimization. In the chain built by PEG-ant, the path is more evenly distributed, and the total communication distance is also reduced. The chain nodes are balanced based on residual energy. Compared to existing protocols, PEG-ant significantly prolonged the lifespan of WSNs and effectively managed energy consumption.
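The next-node choice described above can be sketched with the standard ACO transition rule, in which the selection probability is proportional to the pheromone level raised to one exponent times a heuristic value raised to another. The heuristic used here (residual energy divided by transmission energy) and the exponents `a` and `b` are assumptions for illustration; the text does not give PEG-ant's exact formula.

```python
import random


def transition_probabilities(candidates, a=1.0, b=2.0):
    """candidates: {node_id: (pheromone, residual_energy, tx_energy)}.
    Returns the probability of choosing each neighbor as the next chain node."""
    weights = {
        node: (tau ** a) * ((e_res / e_tx) ** b)
        for node, (tau, e_res, e_tx) in candidates.items()
    }
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


def choose_next(candidates, rng=random):
    """Draw the next node at random according to the transition probabilities."""
    probs = transition_probabilities(candidates)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[n] for n in nodes], k=1)[0]
```

A neighbor with more residual energy or a cheaper link receives a higher probability, while the pheromone term lets good chains found in earlier iterations reinforce themselves.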
2) Heuristic-based approaches: Heuristic uneven clustering in WSNs uses optimization algorithms to search for the optimal solution. Each algorithm uses a different fitness function to achieve the best performance or solution. These methods are generally centralized but may be distributed in exceptional cases. Various deterministic heuristic-based clustering methods have been proposed to make this process energy-efficient.
Data aggregation and clustering based on tree structures can grant WSNs a longer lifespan. Kaur and Mahajan [44] offered a hybrid approach based on the Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) algorithms, called ACOPSO. The first step is identifying CHs, followed by a shortest-distance communication process to gather sensed information from the CHs. CHs are connected to sinks by the shortest path calculated by ACOPSO. ACOPSO used compressive sensing to reduce the packet size in WSNs. The simulation results demonstrate that the proposed protocol prolongs the network lifespan compared to GSTEB based on stability measurements. In the simulated system, all nodes are randomly distributed over a 100 m × 100 m area, with the base station located 50 meters apart.
Sabor and Abo-Zahhad [45] developed the Unequal Multihop Balanced Immune Clustering (UMBIC) approach to enhance the lifespan of heterogeneous and homogeneous WSNs with unequal densities at small and large scales. To manage both inter- and intra-cluster energy consumption, UMBIC uses an Uneven Clustering Method (UCM) and a Multi-objective Immune Algorithm (MOIA). In UMBIC, the UCM divides the network into clusters of different sizes based on the distance to the base station and remaining energy. With MOIA, the most advantageous clusters are identified, and a routing tree is created between them, ensuring connectivity between nodes and reducing communication costs. The UMBIC protocol enhances the lifespan of the network, eliminates hot spots, and optimizes energy consumption. Additionally, UMBIC reduces overhead and simplifies computation since the base station gathers information only once.
Srinivasa Rao and Banka [46] used a novel chemical reaction optimization (nCRO) paradigm to propose unequal clustering and routing algorithms, together called nCRO-UCRA, which were developed to resolve the hot-spot problem. To accomplish this, nCRO-UCRA formulates linear programming problems for two main issues: the choice of CHs and multi-hop routing. The network is divided into uneven clusters, and CHs are selected on the basis of a derived cost function. Finally, nCRO-based routing is applied to select a next-hop CH based on the degree of each node and distance. The encoding of molecular ...
The Fuzzy and Ant Colony Optimization Based Combined MAC, Routing, and Unequal Clustering Cross-Layer protocol (FAMACROW), developed by Gajjar, Sarkar [47], combines hierarchical energy-efficient clustering with media access to reduce energy consumption and prolong network life. In the fuzzy logic system, three factors are used to determine CHs: remaining energy, node density, and communication link quality. As a solution to the hot-spot issue, FAMACROW employs an uneven clustering method. In the inter-cluster routing protocol, the relay node is chosen based on the distance between the CH and the base station, the remaining energy, congestion control length, and delivery probability. Compared to IFUC, FAMACROW provided a 75%-88% longer network lifespan, 82% more packets, and 41% better energy efficiency.
Xunli and Feiefi [48] proposed Sink Mobility based Energy Balancing Unequal Clustering (SMEBUC). In energy-restricted WSNs, coverage and connectivity are two main QoS factors that affect the efficiency of WSNs by decreasing the energy consumed by nodes and extending the coverage lifetime. The distribution of nodes in cluster-based WSNs leads to increased energy inequality. SMEBUC addresses these problems by selecting the CHs based on energy and by making the clusters unequal in size using an improved Shuffled Frog Leaping Algorithm (SFLA). The CHs are used continuously to determine node weights and CH replacement times. The objective is to determine which relay node between the CH and the base station would be most appropriate. Additionally, SMEBUC employs a mobile sink location algorithm to prevent hot spots from appearing. The experimental results indicate that SMEBUC outperforms LEACH and EBUCP, is more energy-efficient, and solves the hot-spot problem in a multi-hop routing protocol by replacing CHs after a period of time.
The compact Bat Algorithm (cBA) was used by Nguyen, Pan [49] to present a novel optimization algorithm for the uneven clustering problem in WSNs. The cBA generated new candidate solutions for space search based on a probabilistic model, which has proven to be promising so far. An explicit probabilistic model was built, and samples were taken from the built models to find the candidate solution. In order to construct the operations of selecting and optimizing behaviors, cumulative distribution and probability density functions are used extensively. Probabilistic modeling is a valid alternative optimization strategy for devices with restricted hardware capabilities. The presented algorithm achieves the efficient use of limited memory devices and provides competitive results.
Zhu and Wang [50] proposed an energy-balanced unequal clustering routing protocol based on the improved sine cosine algorithm (DUCISCA). As a first step, DUCISCA uses a time-based algorithm for cluster head competition. The number of neighbors, the residual energy of the candidate CHs, and the distance to the base station determine the broadcast time in this algorithm. Secondly, a competition radius based on node distance and residual energy is proposed, so that energy consumption can be balanced between nodes at different locations to prevent hot spots. Finally, the improved sine cosine algorithm (ISCA), based on Latin hypercube sampling and adaptive mutation, is introduced to improve convergence and escape local optima. The experimental results indicate that DUCISCA balances energy consumption across the entire network and extends the lifetime of the network more effectively than previous approaches.
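A time-based CH competition of the kind described above can be sketched as a delay function: each candidate waits for a computed time before broadcasting, and the candidate with the shortest delay announces itself first. The text names the three inputs but not the formula, so the linear weighting, the weights in `w`, and the direction of each term (more energy, more neighbors, and a shorter distance all shorten the delay) are assumptions for illustration only.

```python
def broadcast_delay(e_res, e_max, neighbors, max_neighbors, d_bs, d_max,
                    t_max=1.0, w=(0.5, 0.3, 0.2)):
    """Return the waiting time before a candidate broadcasts its CH claim.
    A higher suitability score gives a shorter delay (hypothetical weighting)."""
    score = (
        w[0] * (e_res / e_max)
        + w[1] * (neighbors / max_neighbors)
        + w[2] * (1.0 - d_bs / d_max)
    )
    return t_max * (1.0 - score)
```

Under these assumptions, a candidate with more residual energy times out earlier than an otherwise identical candidate and therefore wins the competition in its neighborhood.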

3) Fuzzy-based approaches:
The presence of uncertainties in WSNs calls for a fuzzy logic system to make effective and efficient decisions. Various clustering algorithms use fuzzy logic to make effective decisions about clustering. Compared with classical approaches, fuzzy logic offers lower execution complexity, lower development costs, lower memory requirements, greater flexibility, and automatic fault tolerance. The fuzzy inputs for selecting the cluster heads are remaining energy, the density of nodes, distance to the base station, and distance among the neighbor nodes. The fuzzy outputs determine the choice of CH and the size of the cluster. Several deterministic fuzzy-based clustering strategies have been proposed to minimize clustering energy consumption. Baranidharan and Santhi [51] offered Distributed load balancing Unequal Clustering (DUCF), in which fuzzy logic is used to select the CHs. DUCF forms clusters of unequal sizes in order to keep energy usage balanced among them. The CHs are chosen using a fuzzy inference procedure incorporating residual energy, node degree, and the distance between the base station and the nodes. DUCF limits cluster sizes based on the fuzzy input parameters and balances the load between clusters by varying cluster sizes. Multi-hop data transmission in DUCF conserves energy by balancing energy consumption among nodes and reducing energy usage in CHs. DUCF produced better results in various network scenarios than LEACH, CHEF, and EAUCF.
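The idea of fuzzy CH selection can be illustrated with a deliberately small inference sketch. It is not the rule base of DUCF or of any other reviewed protocol: it uses only two normalized inputs (residual energy and distance to the base station), linear membership functions, three hypothetical rules, and a weighted-average (zero-order Sugeno-style) output instead of a full Mamdani system.

```python
def ch_chance(energy, distance):
    """energy and distance are normalized to [0, 1].
    Returns a crisp chance of becoming CH in [0, 1]."""
    e_high, e_low = energy, 1.0 - energy
    d_near, d_far = 1.0 - distance, distance
    # (firing strength, rule output); AND is modeled with min().
    rules = [
        (min(e_high, d_near), 1.0),  # high energy and near the BS -> high chance
        (min(e_high, d_far), 0.6),   # high energy and far away    -> medium chance
        (e_low, 0.1),                # low energy                  -> low chance
    ]
    total = sum(strength for strength, _ in rules)
    return sum(strength * out for strength, out in rules) / total
```

Each node could compute this value locally and compare it with its neighbors' values, so that the node with the highest chance becomes the CH; that comparison step is likewise an assumption here.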
Sert and Bagci [52] offered the Multi-objective fuzzy clustering algorithm (MOFCA). The hot-spot problem is currently addressed by unequal-sized clustering methods, which generate smaller clusters near the base station to reduce the intra-cluster workload of the CHs that relay data toward it. Changing the location of deployed nodes may also introduce the energy-hole problem. Prior studies have not considered both problems (hot spot and energy hole) jointly in uniform or non-uniform distributions. MOFCA was therefore proposed as a solution for both static and dynamic networks. MOFCA uses the remaining energy, the distance to the base station, and the density of nodes as input variables for the fuzzy inference system to overcome the uncertainty inherent in WSNs. Results show that MOFCA is a more energy-efficient protocol in terms of First Node Dies (FND), Half of the Nodes Die (HND), and total residual energy. MOFCA considers both dynamic and static nodes, and mobility is implemented by changing the location of nodes without consuming any energy.
The Secure Unequal Clustering Protocol with Intrusion Detection (SUCID), proposed by Maheswari and Karthika [53], addressed QoS variables such as security, lifespan, and power. It comprises several procedures, including initialization of the nodes, election of the tentative CHs (TCHs), election of the final CHs (FCHs), cluster maintenance, and identification of intruders. Using a neuro-fuzzy grouping method, the TCHs were initially selected based on three input variables: distance to the base station, residual energy, and distance between nodes in close proximity. Using deer hunting optimization (DHO) on the TCHs, the FCHs were then selected based on a fitness value derived from five variables: distance to the base station, remaining energy, degree of nodes, centrality of nodes, and link quality. Cluster maintenance was used to balance the load, and a deep belief network (DBN) was used on the cluster heads to detect intruders in the WSN. Experimental results showed that SUCID improved energy efficiency, network lifespan, average delay, and intruder detection rate by combining the DBN with intrusion detection.
Siqing, Yang [33] introduced Fuzzy Logic for Multi-hop Networks (FLCMN), which partitions time into rounds of equal length. Each round comprises two stages: clustering and data transmission. The FLCMN algorithm uses fuzzy logic inference to determine the CH. The fuzzy inference system has three input parameters: density of nodes, residual energy of neighbors, and the average remaining energy of neighbors. Cluster head formation is the subsequent phase, in which nodes broadcast a message and, after comparing the degree, a node is selected as the CH. Once a CH has sent a join message, nodes select their nearest CH. FLCMN supports multi-hop data transmission. FLCMN can prevent hot-spot problems and regulate the distribution of nodes within clusters. Clusters are also connected in a multi-hop manner following a Fibonacci series sequence, which prevents unnecessary data transmissions by the CHs and thereby extends the life of the WSN. The FLCMN algorithm performs better than the existing DFLC, LEACH, and EAMMH algorithms in terms of network lifetime and energy consumption.

C. Probabilistic Uneven Clustering Approaches
In probabilistic clustering algorithms, CHs are selected at random rather than according to a predetermined scheme. These algorithms are simple, energy-efficient, and optimized. Probabilistic clustering is primarily used to extend the life of WSNs. Energy-efficient clustering requires low message and time complexity. Probabilistic clustering algorithms are of two types: random and hybrid.

1) Random probabilistic algorithms:
In random probabilistic algorithms, CHs are chosen at random; these algorithms are simple in nature and have the lowest overhead. Lee and Choe [54] presented a mathematical framework and developed a clustering algorithm called the Location-based Unequal Clustering Algorithm (LUCA). Cluster sizes are determined by the distance from the base station: the size of a LUCA cluster increases as its distance from the base station increases, in order to diminish the energy consumption of the whole network. According to simulation results, LUCA outperforms the conventional equal clustering algorithm in terms of energy efficiency.
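The relationship between cluster size and distance to the base station can be sketched as follows. The linear mapping and the names `cluster_radius`, `d_max`, `r_min`, and `r_max` are hypothetical choices made for illustration; they are not the mathematical framework derived for LUCA.

```python
import math

def cluster_radius(ch_xy, bs_xy, d_max, r_min=10.0, r_max=40.0):
    """Return a cluster radius that grows with the CH's distance to the base station.

    Assumption: a linear interpolation between r_min (at the base station)
    and r_max (at distance d_max or beyond). LUCA's own size function differs.
    """
    d = math.dist(ch_xy, bs_xy)          # Euclidean distance CH -> base station
    fraction = min(d / d_max, 1.0)       # clamp so far-away CHs never exceed r_max
    return r_min + fraction * (r_max - r_min)

# Usage: three candidate CHs at increasing distance from a base station at the origin.
radii = [cluster_radius((x, 0.0), (0.0, 0.0), d_max=100.0) for x in (10.0, 50.0, 100.0)]
```

CHs near the base station thus serve fewer members and keep more energy for relaying, while distant CHs serve larger clusters.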
Kim and Hussain [55] suggest a randomized and distributed clustering algorithm, PRODUCE, that forms clusters of unequal size. CHs located near the base station may concentrate on inter-cluster communication, whereas CHs farther away may focus on intra-cluster communication. This prevents excessively long-distance communications in the network from negatively affecting signal strength. Simulation results demonstrate that the proposed algorithm considerably boosts coverage time and network lifetime, especially at high node densities.
Huang and Hong [56] present an energy-efficient multi-hop routing algorithm based on grid clustering, called EEMRP. Several parameters, including network area levels, the position of nodes, and the energy of nodes, are considered by the algorithm to minimize energy consumption. Cluster heads are relieved of the burden of transferring data among clusters via multi-hop routing by introducing communication nodes. EEMRP extends the lifetime of the network by 17.5-25.2% compared to other algorithms, according to simulation results.
Handy, Haase [57] modified the LEACH protocol and extended its stochastic CH selection algorithm by a deterministic component. The network lifetime is increased by about 30%, depending on the configuration. Further, they developed a three-factor metric to define the lifespan of microsensor networks: first node dies, half of the nodes alive, and last node dies.
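As a rough illustration of adding a deterministic component to stochastic CH selection, the sketch below scales the standard LEACH threshold by a node's remaining energy fraction and derives the three lifetime metrics from per-node death rounds. The function names, the abbreviations FND/HNA/LND used as dictionary keys, and the rule used to locate the half-alive round are assumptions made here; the bookkeeping of which nodes are still eligible in the current LEACH epoch is omitted.

```python
import random

def leach_threshold(p, round_no):
    """Standard LEACH threshold for an eligible node; p is the desired CH fraction."""
    period = round(1 / p)                       # rounds per LEACH epoch
    return p / (1 - p * (round_no % period))

def energy_aware_threshold(p, round_no, e_current, e_max):
    """LEACH threshold scaled by the node's remaining energy fraction (deterministic part)."""
    return leach_threshold(p, round_no) * (e_current / e_max)

def becomes_cluster_head(threshold, rng):
    """Stochastic part: the node draws a number in [0, 1) and compares it to its threshold."""
    return rng.random() < threshold

def lifetime_metrics(death_rounds):
    """Return first-node-dies, half-of-nodes-alive, and last-node-dies rounds."""
    ordered = sorted(death_rounds)
    half_index = (len(ordered) + 1) // 2 - 1    # death that leaves at most half alive
    return {"FND": ordered[0], "HNA": ordered[half_index], "LND": ordered[-1]}

# Usage: a node with half of its initial energy is half as likely to become CH.
rng = random.Random(0)
t = energy_aware_threshold(p=0.1, round_no=0, e_current=0.5, e_max=1.0)
elected = becomes_cluster_head(t, rng)
```

With this scaling, nodes with low remaining energy are less likely to be elected, which is the intuition behind the reported lifetime gain.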
2) Hybrid probabilistic algorithms:
In hybrid probabilistic algorithms, CHs are selected using a hybrid method that combines randomness with parameters such as distance from the base station, residual energy, and node density, which balance the clusters. Due to their iterative and competition-based nature, hybrid probabilistic algorithms have a higher message and time complexity.
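The interplay of randomness, residual energy, and iteration can be sketched with a HEED-style rule, in which a node's initial CH probability is proportional to its residual energy and is doubled in each iteration until it reaches 1. The parameter names `c_prob` and `p_min`, their default values, and the helper names are assumptions for illustration; neighbor messaging and cost-based tie-breaking are left out.

```python
def initial_ch_probability(c_prob, e_residual, e_max, p_min=1e-4):
    """Initial CH probability: scaled by residual energy, bounded below by p_min."""
    return max(c_prob * e_residual / e_max, p_min)

def iterations_until_final(ch_prob):
    """Count iterations a node needs before its probability reaches 1.

    Each iteration implies at least one round of message exchange with
    neighbors, which is where the higher message and time complexity comes from.
    """
    iterations = 1
    while ch_prob < 1.0:
        ch_prob = min(2 * ch_prob, 1.0)   # probability doubles every iteration
        iterations += 1
    return iterations

# Usage: a node with full energy settles sooner than a nearly depleted one.
full = iterations_until_final(initial_ch_probability(0.05, 1.0, 1.0))
weak = iterations_until_final(initial_ch_probability(0.05, 0.1, 1.0))
```

The lower bound `p_min` caps the number of iterations, so the extra cost relative to purely random selection stays bounded.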
Li and Liu [58] examine unequal clustering in uniformly distributed WSNs from a theoretical point of view. They then develop an efficient clustering scheme to minimize energy consumption. Additionally, lightweight and optimal protocols are designed for routing and for the rotation of cluster heads to ensure even power consumption. Simulation results indicate that the proposed approach extends network lifetime over the best-so-far unequal clustering-based routing approach.
Bozorgi and Bidgoli [59] propose a hybrid unequal clustering method to improve on the energy efficiency of previous clustering methods and extend the lifetime of the network. The proposed protocol uses a novel clustering strategy in which nodes use their neighbors' information based on the arrangement of nodes. This strategy reduces overhead significantly. In a clustered network, nodes nearer to the base station spend more energy receiving and relaying data than nodes farther away. Because the scheme is a hybrid of static and dynamic clustering, control messages need not be transmitted in every round, which reduces overhead. Two new routing techniques are proposed. In the first, member nodes with sufficient energy and suitable distance support the CHs by sharing the cluster's load. Discretion licensing is the second technique, which prevents incomplete packets from being sent in real-time. According to simulation results, the proposed method reduces network overhead, improves network stability, balances energy, and extends the lifetime of networks.
Priyadarshi, Singh [60] propose a novel Hybrid Energy Efficient Distributed (HEED) algorithm for networks with non-uniformly distributed nodes, developed to extend the lifetime of clustered non-uniform sensor networks. HEED and its variants are compared based on the amount of energy dissipated and the number of live nodes. The proposed HEED protocol has a longer lifespan and is more energy efficient than the existing HEED protocol.

Tabular comparisons of uneven clustering are presented in Tables I, II, and III. As a starting point, clustering algorithms are evaluated according to the following cluster characteristics: size (even or uneven), cluster count (static or dynamic), and type of communication (multi-hop or single-hop). In the next step, we compared several clustering algorithms based on the characteristics of cluster heads: mobility (fixed or mobile), node type (heterogeneous or homogeneous), role (aggregation or relay), method (probabilistic, deterministic, or preset), and the objectives of each algorithm. In addition, we examined several clustering algorithms based on the selection of cluster heads, the nature of the cluster, and its location awareness.
In uneven clustering, three classes of protocols are distinguished: preset, probabilistic, and deterministic. There are two types of probabilistic clustering: random clustering and hybrid clustering. Although random-type clustering fails to conserve energy, it is simple and achieves very low overhead. A hybrid probabilistic algorithm selects CHs based on randomness and parameters such as residual energy, distance from the base station, and node density. The iterative and competition-based nature of hybrid probabilistic algorithms leads to higher message and time complexities. Deterministic clustering algorithms control clusters and elect cluster heads optimally. They select cluster heads based on parameters such as remaining energy, node density, and distance from the base station. These parameters can be obtained locally, and the neighbor nodes exchange messages to update them. There are three types of deterministic clustering algorithms: fuzzy-based, compound-based, and heuristic-based.
Clustering algorithms based on fuzzy logic consume more power during algorithm execution and message exchange. Heuristic-based clustering algorithms are impractical for various applications because they need global information and are fully managed by the base station. Depending on the application, however, they may still be the most appropriate choice. Clustering algorithms based on weights are iterative, which increases message complexity. In compound-based unequal clustering, different metrics are used, such as Sierpinski triangles and linked graphs. Deterministic compound-based methods are proposed to make clustering more energy-efficient.
Most current clustering methods are static and cannot adapt to network changes. Dynamic clustering processes may therefore prove more effective in the future. Dynamic here does not refer to the mobility of the network. A clustered WSN has three components: cluster members, cluster heads, and the base station. WSNs with mobility require frequent configuration changes, which increases overhead. The WSN is typically data-centric and requires data-centric methods, as discussed earlier. Most algorithms focus on proactive networks, and very few are reactive. Clustering methods may become more reactive in the future.

V. CONCLUSION
The purpose of WSNs is to provide valuable information over long periods with the least amount of energy consumption, exhibiting optimal performance with reduced delays. The low capacity of the battery may, however, pose a serious problem regarding energy consumption. Clustering is found to be an effective technique for power management in WSNs. However, clustering is susceptible to hot-spot problems. With uneven clustering, the load is distributed evenly, the hot-spot problem is resolved, and the WSN's lifetime is enhanced. We classified the various uneven clustering algorithms into three broad groups: preset, deterministic, and probabilistic clustering algorithms. The methods are described in terms of their goals, attributes, categories, advantages, and disadvantages. Probabilistic clustering is used when simplicity and speed are needed, for example for large-scale WSN surveillance. Deterministic clustering is needed to implement more robust and reliable applications. Heuristic clustering can be used when an optimized solution is needed for an application-specific situation. In this paper, we also compared all these types of protocols based on their clustering properties, CH properties, and the type of clustering process.