Moving Object Detection over Wireless Visual Sensor Networks using Spectral Dual Mode Background Subtraction

Wireless Visual Sensor Networks (WVSNs) play an essential role in tracking moving objects. Their key constraints are storage, power, and bandwidth. Background subtraction is used in the early stages of target tracking to extract moving targets from video frames. Many standard background subtraction methods are unsuitable for embedded devices because they rely on complex statistical models to handle small changes in lighting. This paper introduces a system based on the Partial Discrete Cosine Transform (PDCT), which reduces the dimensionality of the processed data while retaining most of the important information, thereby reducing processing and transmission energy. It also uses a dual-mode single Gaussian model (SGM) for accurate detection of moving objects. The performance of the proposed system is assessed on the standard CDnet 2014 benchmark dataset in terms of detection accuracy and time complexity, and the method is compared with previous WVSN background subtraction methods. Simulation results show that the proposed method consistently achieves 15% better accuracy and is up to 3 times faster than the state-of-the-art object detection methods for WVSN. Finally, we show the practicality of the method by simulating it in a sensor network environment using the Cooja simulator of Contiki OS and implementing it on a real testbed using the Cortex M3 open nodes of IOT-LAB.
Keywords: Background subtraction; discrete cosine transform; embedded camera networks; Gaussian mixture models; wireless visual sensor networks


I. INTRODUCTION
Wireless sensor networks (WSNs), which are made up of thousands of spatially distributed scalar sensor nodes that communicate wirelessly, have attracted researchers' interest [1]. Wireless Visual Sensor Networks (WVSNs) use small, low-power CMOS cameras and microphones to collect visual cues from the environment. The capabilities of WSNs are being expanded to include sophisticated environmental monitoring, advanced health care delivery, traffic avoidance, fire prevention and monitoring, object tracking, and modern surveillance systems [2]. WVSN research has focused on military, commercial traffic management, and precision agriculture surveillance applications [3]. Three major problems limit the vision processing capability of WVSNs: first, the limited visual processing capability of sensor nodes; second, the memory storage constraints of sensor nodes; and finally, the communication of large volumes of image data. Moreover, maximising network lifespan while processing huge volumes of multimedia data and meeting application-specific QoS requirements such as latency, packet loss, bandwidth, and throughput is a challenge. In addition to developing energy-aware multimedia processing algorithms and infrastructures, it is also necessary to establish efficient communication strategies [3].
Object detection is the first and most critical step in target tracking [4]. Robust object detection is typically the dominant consumer of processing and resources, where the moving targets are extracted from the video frames to perform further high-level processing. Lighting changes, shifting backgrounds, artificial or fast motion, and occlusion make accurate foreground object segmentation challenging [5]. The major methodologies for completing the object detection task include optical flow [6], frame differencing [7], and background subtraction [8].
Background subtraction is a standard and reliable method for detecting a moving foreground. It subtracts the background model from the current frame and updates the background model regularly to remove the effects of illumination changes and spurious events. This method is extensively used for motion detection in dynamic scenes. In practice, techniques such as the mixture of Gaussians (MoG) [9], KDE [10], codebook [11], and ViBe [12] are employed in real applications. Despite the accuracy and efficiency of MoG [9], the evaluation in [13] demonstrates that MoG can only handle three frames per second on Blackfin DSP camera nodes at a low frame resolution of 320 × 240. The need to update the MoG probability distribution parameters accounts for its long computation time.
This work investigates moving object detection over WVSN. The Discrete Cosine Transform (DCT) [14] is a frequently used image compression technique over WVSN [15,16]. The DCT converts signals from the spatial domain to a frequency-domain representation. We apply the DCT to reduce the dimensionality of the background subtraction problem while maintaining accuracy.
*Corresponding Author. www.ijacsa.thesai.org
The contributions of this paper are as follows:
• A new compression-based background subtraction method called Spectral Dual Mode Background Subtraction (SDMBS), which uses the Partial Discrete Cosine Transform (PDCT) [15] (for dimensionality reduction) and a dual-mode SGM [17] (for accuracy) to model the background and distinguish the foreground from the background.
• We implement our approach and compare it to MoG and other compression-based MoG methods to demonstrate the computational efficiency of the proposed method. According to the results, our method is up to 10 times faster than the original MoG and three times faster than the compression-based MoG.
• To demonstrate the algorithm's ability to work in wireless sensor network environments, we simulated and realised the proposed SDMBS in a Cooja network simulator and on the IOT-LAB M3 board.
The rest of the paper is organized as follows. We first present the related work in Section II. We then present a detailed account of the proposed SDMBS approach in Section III. Section IV discusses the simulation results and performance evaluation in detail. Section V draws the paper's conclusion.

II. RELATED WORK
A. Object Detection in WVSN
In visual sensor networks, the cost of data communication is usually far higher than the cost of image processing. As a result, traditional object detection methodologies are ineffective for monitoring and surveillance applications; instead, the raw image data is sent to the sink node, where detection methods are used to determine the moving object. Alternative approaches are to either compress the image at the sensor node and apply object detection at the sink node after decompression, or process the frame before transmission and transmit only the useful information or features for further analysis at the sink node. Compression can be applied using Compressed Sensing (CS), wavelets, or the DCT. In the second approach, frame processing is applied either to the raw image data or in the compressed domain to further reduce processing complexity at the sensor node. The compressed data is already computed and requires less storage space than the raw image frame. The two approaches are briefly reviewed in this section.

1) Compressed data:
In methods such as Robust Principal Component Analysis (RPCA) [18] and DECOLOR [19], the basic concept of low-rank and sparse factorization is to decompose a given matrix of acquired frames into a low-rank background and a sparse foreground of outliers. The goal of compressed sensing (CS) with low-rank background subtraction [20] is to send a compressed image to the base station using CS [13] and then use Orthogonal Matching Pursuit (OMP) [21] to reconstruct the image at the receiver end. The authors of [22] proposed a CS-based detection approach that uses CS measurements of a moving object to reconstruct the foreground in a video.
2) Processed data:
Because the video to be sent in surveillance applications is generally static, a resource-constrained environment like WVSN does not require the transmission of the entire video. The video can be processed using a compression-based background removal technique to recognise moving objects and send only the foreground data to the monitoring location to save energy and bandwidth. A method for sending image portions instead of the whole image is described in [23]. It ensures that the sink node receives the bare minimum of image content, as assessed by in-node energy consumption and the peak signal-to-noise ratio (PSNR) of the reconstructed picture. The image processing block (Running Gaussian Average technique for object extraction and DWT for ROI transmission) operates at a high frequency to facilitate rapid processing and is only engaged by a separate network processor when images need to be analysed [24]. Because it runs continually, the network processor block is designed to operate at a low frequency. The suggested approach for image processing and communication requires relatively little energy, as evidenced by practical test and simulation results. To save transmission energy, Nandhini et al. [25] propose a method for detecting objects with fewer measurements that combines a mean measurement differencing approach with an adaptive threshold strategy.
CS-based background subtraction is performed at the node before object information is sent, reducing complexity in terms of power, storage, and bandwidth. CSMOG [26] applies MOG [9] to low-rate CS measurements. It is based on the idea of reducing the dimensionality of the data while still capturing the majority of the information via a random projection matrix. Under real-time requirements, the CSMOG method is reported to be consistently superior, up to 6x faster, and significantly less resource-intensive than the standard method. The DWT-based CS object identification framework [27] uses a simple measurement matrix, termed the DWT block-diagonal matrix, and refines the pixel-based foreground after the block-based foreground recognition phase of the first stage. The mean measurement differencing with adaptive threshold strategy (MMDATS) in [25] is based on the framework for robust subspace learning. The OMP approach is used to reconstruct the object from foreground measurements. Owing to its excellent directional selectivity and shift-invariance, the complex Daubechies wavelet transform is used in [28] for a motion segmentation algorithm based on interframe differencing in the wavelet compression domain.
To reduce the storage space and time required, a statistical background subtraction approach [29] based on motion segmentation in the wavelet compressed domain has been proposed. In [22], observations are formed on 8 × 8 blocks from the DCT coefficients of the JPEG-encoded image. The authors developed a background subtraction strategy that models the background over time using competing Hidden Markov Models (HMMs). Three techniques for modelling the background directly from the compressed video are presented in [30]: moving average, median, and mixture of Gaussians. These methods use the DCT coefficients (including the AC coefficients) to characterise the background at the block level, and then update the DCT coefficients to adapt the background. Popa et al. [31] use low-level DCT compressed-domain processing to model the background. Processing at the block level instead of the pixel level reduces the number of model parameters by almost one-third. They also reduce the number of coefficients per block from 64 to 16 while retaining segmentation quality. In the DCT domain, Ye et al. [32] evaluated the background stability and the separability of objects. Their method recovers the target by suppressing the background coefficients, modelling the background as a single Gaussian for each frequency point. A quaternion DCT (QDCT) for infrared target recognition is presented in [33]. This approach shows how to create a quaternion with two directional features (motion feature and kurtosis feature). The QDCT feature acts as a unique signature that helps in detecting small targets. To reduce complexity and simplify hardware implementation, Manimozhi et al. [34] employed a diagonal matrix of binary substitution blocks as the measurement matrix for both DCT-based and DWT-based CS procedures.
According to the related research, segmentation methods must handle a large volume of video and require a significant amount of storage space and processing time. Compression-based processing is recommended for resource-constrained WVSNs to address these issues. We therefore describe a motion segmentation method that uses the DCT in the compressed transform domain, based on statistical background subtraction. To reduce processing complexity, the dual-mode SGM-based background subtraction technique recognises only the foreground blocks from the discrete cosine transform's detailed component. The foreground blocks are then refined to recognise the foreground object and sent to the sink side for reconstruction and tracking. In Fig. 1, the proposed SDMBS (page size = 4) is compared to the original MOG [9] and to MOG with block measurements based on compressed sensing (CSMOG) [26].

III. PROPOSED SDMBS APPROACH
A. Network Model
We consider WVSN nodes deployed randomly in the surveillance field. Each WVSN node is constrained in terms of processing and memory resources. The WVSN system model is composed of visual sensor nodes, Relaying Nodes (RNs), one or more Monitoring Nodes (MNs), and a Sink Node (SN) [23]. Each sensor node is assumed to be in a 'wake-up' state according to a unique duty cycle $d \in [0, 1]$ during a period, so that it can successfully send an image via the network. This avoids conflicts caused by two or more nodes simultaneously broadcasting image data. Thus, each sensor is awake for a fraction $d$ of the period and sleeps for the remaining fraction $(1 - d)$. The frame count is set to zero when a sensor node enters the 'wake-up' state.

B. Pre-Processing
Simple spatial Gaussian filtering and median filtering are used to suppress salt and pepper and Gaussian noise in images captured during the preprocessing step [27]. The filtered frame is then divided into equal-sized blocks, with the SDMBS algorithm applied to each block separately. This can be done in parallel, further reducing computation time.
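As a rough illustration of this step, the following Python sketch applies a 3 × 3 median filter and then splits the frame into 8 × 8 blocks. It is a minimal sketch under stated assumptions: the function names `median_filter_3x3` and `split_into_blocks` are hypothetical, the Gaussian filtering stage is omitted for brevity, border pixels are simply copied, and frames are plain lists of lists rather than the data structures of the actual C++ implementation.

```python
from statistics import median

def median_filter_3x3(img):
    """Apply a 3x3 median filter; border pixels are copied unchanged (assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

def split_into_blocks(img, size=8):
    """Yield (row, col, block) for every full size x size block of the frame."""
    h, w = len(img), len(img[0])
    for by in range(0, h - size + 1, size):
        for bx in range(0, w - size + 1, size):
            yield by, bx, [row[bx:bx + size] for row in img[by:by + size]]

frame = [[10] * 16 for _ in range(16)]
frame[5][5] = 255                      # isolated "salt" pixel
clean = median_filter_3x3(frame)       # the outlier is replaced by the local median
blocks = list(split_into_blocks(clean))
```

Because each block is produced independently, the per-block processing that follows could be distributed over several workers, which is the parallelism the text refers to.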

C. Partial Discrete Cosine Transform (PDCT)
As seen in Fig. 2, each video frame is divided into 8 × 8 blocks, and the DCT is applied to each block. Each 8 × 8 DCT block is then represented by its first ten low-frequency DCT components. The partial DCT has the advantage of compressing an 8 × 8 block into 10 samples, which is useful for WSNs with limited resources. The DCT concentrates most of the block's energy in the DC and low-frequency coefficients in the upper-left corner, while the remaining coefficients are sparse. Compressed sensing (CS) [25] relies on such sparsity.
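The partial DCT can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes an orthonormal 2-D DCT-II and assumes that the "first ten low-frequency components" are selected in the standard JPEG zigzag order; the names `ZIGZAG_10`, `dct_coefficient`, and `partial_dct` are hypothetical.

```python
import math

N = 8
# First ten (row, col) positions of the 8x8 zigzag scan (assumed selection order).
ZIGZAG_10 = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1),
             (0, 2), (0, 3), (1, 2), (2, 1), (3, 0)]

def dct_coefficient(block, u, v):
    """Orthonormal 2-D DCT-II coefficient (u, v) of an 8x8 block."""
    cu = math.sqrt((1 if u == 0 else 2) / N)
    cv = math.sqrt((1 if v == 0 else 2) / N)
    total = 0.0
    for x in range(N):
        for y in range(N):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
    return cu * cv * total

def partial_dct(block, positions=ZIGZAG_10):
    """Return only the selected low-frequency coefficients of the block."""
    return [dct_coefficient(block, u, v) for u, v in positions]

flat_block = [[100] * N for _ in range(N)]
coeffs = partial_dct(flat_block)   # a flat block has energy only in the DC term
```

Computing only the ten retained coefficients, rather than the full 64 and then discarding most of them, is one possible way to keep the per-block cost low on a constrained node.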

D. Dual Mode Single Gaussian Model (DM-SGM)
To deal with the inaccuracies that come from modelling the scene with SGMs [35], a dual-mode SGM with age [17] is used. This model protects the background model from foreground and noise contamination while still learning the background reliably. DM-SGM is applied to the compressed-domain PDCT low-frequency components to decide whether or not the image block contains a moving target. The Gaussian parameters are computed for each grid: mean, variance, and age are used to model the background and to determine and update the foreground blocks. There are two models for each block: an apparent background model and a candidate background model. The candidate background model is ineffective until its age exceeds that of the apparent background model. This dual-mode SGM differs from a bi-modal Gaussian mixture model (GMM) [9] in that, with a bi-modal GMM, the foreground data could still contaminate the background. With the dual-mode SGM approach, this is no longer the case.
Once the age of the candidate model exceeds that of the apparent model, the two models are swapped. Finally, the foreground blocks are determined and passed to the pixel refining stage, which detects the pixels containing the target within each foreground block, as shown in the flowchart in Fig. 3.
The group of pixels in grid $i$ at time $t$ is denoted as $G_i^{(t)}$, the number of pixels in $G_i^{(t)}$ as $|G_i^{(t)}|$, and the observed intensity of pixel $j$ at time $t$ as $I_j^{(t)}$. Following [17], the mean $\mu_i^{(t)}$, the variance $\sigma_i^{(t)}$, and the age $\alpha_i^{(t)}$ of the SGM model applied to $G_i^{(t)}$ are updated as
$$\mu_i^{(t)} = \frac{\alpha_i^{(t-1)}}{\alpha_i^{(t-1)} + 1}\,\mu_i^{(t-1)} + \frac{1}{\alpha_i^{(t-1)} + 1}\,M_i^{(t)} \qquad (1)$$
$$\sigma_i^{(t)} = \frac{\alpha_i^{(t-1)}}{\alpha_i^{(t-1)} + 1}\,\sigma_i^{(t-1)} + \frac{1}{\alpha_i^{(t-1)} + 1}\,V_i^{(t)} \qquad (2)$$
$$\alpha_i^{(t)} = \alpha_i^{(t-1)} + 1 \qquad (3)$$
where $M_i^{(t)} = \frac{1}{|G_i^{(t)}|}\sum_{j \in G_i^{(t)}} I_j^{(t)}$ is the observed mean and $V_i^{(t)} = \max_{j \in G_i^{(t)}} \left(\mu_i^{(t)} - I_j^{(t)}\right)^2$ is the observed variance of the grid.
DM-SGM [17] uses a second SGM as a candidate background model. The candidate background model remains ineffective until its age exceeds that of the apparent background model, at which point the two models are exchanged. Let $\mu_{A,i}^{(t)}$, $\sigma_{A,i}^{(t)}$, $\alpha_{A,i}^{(t)}$ and $\mu_{C,i}^{(t)}$, $\sigma_{C,i}^{(t)}$, $\alpha_{C,i}^{(t)}$ denote the mean, variance, and age of the apparent and candidate background models, respectively. The apparent background model is updated, that is, $\mu_{A,i}^{(t)}$, $\sigma_{A,i}^{(t)}$, and $\alpha_{A,i}^{(t)}$ are updated according to (1), (2), and (3), if
$$\left(M_i^{(t)} - \mu_{A,i}^{(t-1)}\right)^2 < \theta_s\,\sigma_{A,i}^{(t-1)} \qquad (7)$$
where $M_i^{(t)}$ is the observed mean and $\theta_s$ is a threshold parameter.
If condition (7) is violated and the observed mean matches the candidate background model, that is,
$$\left(M_i^{(t)} - \mu_{C,i}^{(t-1)}\right)^2 < \theta_s\,\sigma_{C,i}^{(t-1)},$$
then $\mu_{C,i}^{(t)}$, $\sigma_{C,i}^{(t)}$, and $\alpha_{C,i}^{(t)}$ are updated according to (1), (2), and (3). If neither condition holds, the candidate background model is initialised with the current observation. In this way, only one of the two models is updated, while the other is left unchanged. If the age of the candidate model exceeds that of the apparent model, i.e. $\alpha_{C,i}^{(t)} > \alpha_{A,i}^{(t)}$, the two background models for the grid are swapped after updating.
Once the models are swapped, the candidate background model is re-initialised. Finally, only the apparent background model is used to determine foreground pixels, as described in Section E. Because the candidate background model, rather than the apparent background model, learns the foreground data, the background model is not distorted by the foreground data that represents the moving object in the frame.
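The update and swap logic of the dual-mode SGM can be sketched in Python as below. This is a minimal sketch assuming the age-weighted running updates of [17], with the observed variance taken as the maximum squared deviation from the updated mean; the class names `SGM` and `DualModeSGM` and the values of `THETA_S` and `INIT_VAR` are assumptions chosen for illustration, not values from the paper.

```python
THETA_S = 2.0     # match threshold (assumed value)
INIT_VAR = 400.0  # initial variance of a new model (assumed value)

def mean_of(values):
    return sum(values) / len(values)

class SGM:
    """Single Gaussian with age, updated by age-weighted running averages."""
    def __init__(self, values):
        self.mean, self.var, self.age = mean_of(values), INIT_VAR, 1

    def matches(self, observed_mean):
        return (observed_mean - self.mean) ** 2 < THETA_S * self.var

    def update(self, values):
        a = self.age
        self.mean = (a * self.mean + mean_of(values)) / (a + 1)
        observed_var = max((self.mean - v) ** 2 for v in values)
        self.var = (a * self.var + observed_var) / (a + 1)
        self.age = a + 1

class DualModeSGM:
    """Apparent model plus a candidate that absorbs non-matching observations."""
    def __init__(self, values):
        self.apparent = SGM(values)
        self.candidate = None

    def observe(self, values):
        m = mean_of(values)
        if self.apparent.matches(m):
            self.apparent.update(values)
        elif self.candidate is not None and self.candidate.matches(m):
            self.candidate.update(values)
        else:
            self.candidate = SGM(values)          # (re)start the candidate
        if self.candidate is not None and self.candidate.age > self.apparent.age:
            # Candidate is older: it becomes the background, candidate is reset.
            self.apparent, self.candidate = self.candidate, None

model = DualModeSGM([100, 100, 100, 100])
for _ in range(5):
    model.observe([100, 100, 100, 100])   # stable background, apparent age grows
model.observe([200, 200, 200, 200])       # new appearance goes to the candidate
```

In this sketch the single foreground-like observation leaves the apparent model untouched, which is the contamination protection described above; the apparent model is replaced only if the new appearance persists long enough for the candidate to become older.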

E. Pixels Refining
A foreground block contains both foreground and background pixels. Each video frame contains a large number of background blocks, so we only need to focus on the small number of foreground blocks. To detect which pixels in a foreground block are indeed foreground, a basic background learning technique is applied to each block. We classify pixel $j$ in group $i$ as a foreground pixel if
$$\left(I_j^{(t)} - \mu_{A,i}^{(t)}\right)^2 > \theta_d\,\sigma_{A,i}^{(t)}$$
where $I_j^{(t)}$ is the observed pixel intensity, $\mu_{A,i}^{(t)}$ and $\sigma_{A,i}^{(t)}$ are the mean and variance of the apparent background model, and $\theta_d$ is a threshold parameter. Since the candidate background model, rather than the apparent one, learns the foreground data, and the models are swapped only when the candidate's age exceeds that of the apparent model, we do not have to be concerned about the model learning incorrect foregrounds.
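A pixel-level refinement of a foreground block can be sketched as a squared-deviation test against the apparent background model. The function name `refine_pixels` and the default threshold value are assumptions made for illustration only.

```python
def refine_pixels(block, bg_mean, bg_var, theta_d=4.0):
    """Return a boolean mask: True where a pixel deviates from the background.

    A pixel is marked foreground when its squared distance from the apparent
    background mean exceeds theta_d times the background variance.
    """
    return [[(p - bg_mean) ** 2 > theta_d * bg_var for p in row] for row in block]

block = [[100, 100],
         [100, 180]]                      # toy 2x2 "block" with one bright pixel
mask = refine_pixels(block, bg_mean=100.0, bg_var=25.0)
```

Only blocks already flagged as foreground need to go through this test, so its cost scales with the number of foreground blocks rather than with the frame size.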

F. Computation Complexity
The number of elements processed in each frame determines the computational complexity. Because each frame is divided into equal-sized blocks of 8 × 8 pixels, it is sufficient to evaluate the computational cost of a single block.
• For the original MoG [9], each pixel is modelled by 3 Gaussians, which means that we need 64 × 3 Gaussians per block.
• For CS-MoG [26], where each projection value requires three Gaussians, the number of Gaussians required per block is 8 × 3 (a factor of 8 reduction).
• For our proposed method, each block is modelled by the 2 Gaussians of DM-SGM over the 10 low-frequency DCT components, which requires 10 × 2 Gaussians per block, a reduction by a factor of 9.6 and 1.2 per block w.r.t. the original MoG and CS-MoG, respectively. Experiments show that it is 2.5 times faster than CS-MoG in processing time.
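The per-block counts above can be checked with a few lines of Python; the variable names are illustrative only.

```python
pixels_per_block = 8 * 8

mog_gaussians = pixels_per_block * 3   # original MoG: 3 Gaussians per pixel
csmog_gaussians = 8 * 3                # CS-MoG: 3 Gaussians per projection value
sdmbs_gaussians = 10 * 2               # SDMBS: 2 Gaussians per retained DCT component

reduction_vs_mog = mog_gaussians / sdmbs_gaussians
reduction_vs_csmog = csmog_gaussians / sdmbs_gaussians
print(mog_gaussians, csmog_gaussians, sdmbs_gaussians)   # 192 24 20
print(reduction_vs_mog, reduction_vs_csmog)              # 9.6 1.2
```

The count of Gaussians is only a proxy for cost: the measured speed-up over CS-MoG reported in the experiments is larger than the factor of 1.2 obtained here.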

G. Scene Reconstruction
When an image arrives at the sink node, it is superimposed on the previously received reference frame. Because the suggested technique only communicates a fraction of the entire image, the pixel coordinates at the MN node stay unchanged. This allows a portion of the transmitted image to be used to replace pixels in the reference image at the sink node more efficiently. The pixel values are, however, subject to channel distortion due to the transmission environment.
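A minimal sketch of this superimposition step is shown below. It assumes that each received block arrives with its top-left pixel coordinates; the function name `paste_blocks` and the `(row, col, block)` tuple layout are hypothetical.

```python
def paste_blocks(reference, received, size=8):
    """Overlay received foreground blocks on a copy of the reference frame.

    `received` is an iterable of (row, col, block) tuples, where (row, col)
    is the top-left pixel coordinate of the block in the original frame.
    """
    frame = [row[:] for row in reference]
    for by, bx, block in received:
        for dy in range(size):
            frame[by + dy][bx:bx + size] = block[dy]
    return frame

reference = [[0] * 16 for _ in range(16)]            # previously received frame
foreground = [(8, 8, [[9] * 8 for _ in range(8)])]   # one transmitted block
reconstructed = paste_blocks(reference, foreground)
```

Because block coordinates are preserved, no registration step is needed at the sink; any channel distortion affects only the pixel values inside the pasted blocks.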

IV. EXPERIMENTAL RESULTS AND ANALYSIS
In this section, experiments and performance evaluation are carried out to assess the proposed method. The experimental dataset and setup are explained first. A qualitative analysis then illustrates the performance of our system, and a quantitative and execution-time evaluation tests its accuracy and running time. In addition, the algorithm is simulated in a sensor network environment using the Cooja simulator of Contiki OS [36,37] and realised on a real testbed using IOT-LAB [38].

A. Dataset and Setup
We present the results of our compressed-domain moving object detection technique on a standard benchmark dataset, CDnet 2014 [39], to demonstrate its effectiveness. The CDnet 2014 dataset is divided into 11 categories with different challenges, each of which contains four to six video sequences. Each video sequence consists of 600 to 7999 frames, with resolutions ranging from 320×240 to 720×576. The simulations were run on an Intel Dual Core i7 3.6GHz processor with 8GB of RAM. The code is written in C++. Results were averaged over all frame sequences in each dataset during the experiments. Fig. 4 and 5 show the results of our moving object detection technique, Spectral Dual-Mode Background Subtraction (SDMBS). Fig. 4 shows representative frames from the different CDnet categories to illustrate performance against all the CDnet 2014 challenges. Fig. 4 and 5 show the original video frames together with the ground truth and the detected objects. Comparing the resulting foreground mask to the corresponding ground truth demonstrates the robustness of the proposed strategy for detecting moving objects across different categories.

B. Qualitative Analysis
Qualitative performance is excellent for most of the CDnet 2014 challenges; nevertheless, the PTZ and camera jitter categories, as shown in Fig. 4, have poor qualitative performance. The worst performance occurs in the PTZ category: due to its zooming and moving features, a compensation stage is required before the object detection stage to compensate for the frame movement. The Intermittent Object Motion category is noisy because of the ghosting artefacts created in its videos. Background objects moving away, abandoned objects, and objects stopping for a brief moment before moving away are the key features of this category. In the shadow category, the appearance of shadows affects performance, and the foregrounds are not detected completely. From the existing methods published on the CDnet website [39], we selected MOG [9], KNN [40], ViBe [12], and SubS [41] as candidates for comparison. Thus, we compared our proposed compression-based background subtraction SDMBS with recent state-of-the-art methods [26,42], classical methods like [9,40], and fast methods like the ViBe [12] background subtraction algorithm.
In [26], a block-based MOG (CS-MOG) is designed to operate on the compressed sensing (CS) measurements of the frame blocks and is targeted at WVSN applications, whereas [42] is a background model update algorithm that uses an intermittent technique along with an adaptive block-learning algorithm.
The results for three video sequences, Highway (baseline), Fountain2 (dynamic background), and Snowfall (bad weather), are illustrated in Fig. 5. The original video frames for the three datasets and their corresponding ground truth are shown in the top two rows. The results of MOG [9], ViBe [12], two current state-of-the-art techniques [26,42], and SDMBS are shown in the next five rows (from top to bottom). The last row of Fig. 5 allows a qualitative comparison of our proposed technique with the other methods, revealing that our method outperforms several of the existing systems. The results show that our system accurately recognises foreground objects and closely resembles the ground truth when compared to the other examined systems.

C. Quantitative Analysis
In the quantitative evaluation, our method is compared to widely popular and state-of-the-art object detection algorithms for WVSN by conducting experimentation on the benchmark (CDnet 2014) dataset [39]. Several evaluation metrics are utilised to provide a credible measure of the outcome. Average recall (Re), precision (Pr), and F-measure (Fm) for all the video sequences in each category are listed in Table I. True positive (TP), false positive (FP), true negative (TN), and false-negative (FN) are the four types of pixel-based count metrics that can be created using the available ground truth data [39].
Recall (Re) measures the completeness of the recognised foreground; its value increases as the number of false negatives decreases. Precision (Pr) measures how accurate the identified foreground is, and is lower when there are many false positives. F-measure (Fm) balances recall and precision with equal weights, so it is high only when both recall and precision are high. Only these three evaluation metrics, recall (Re), precision (Pr), and F-measure (Fm), are considered to avoid redundancy. The best and second-best performing approaches for each category, based on the average Fm over all the video sequences, are marked in red and bold in Tables II and III. When compared to classical methods, SDMBS is competitive only in some categories, such as dynamic background, low frame rate, and bad weather. In most categories where SDMBS is ranked second, its results are close to those of SubS [41], which can be explained in terms of the design trade-off. Compared to the state-of-the-art method CS-MOG [26], a compression-based background subtraction method for WVSN, we achieve a 15% increase in accuracy. In Fig. 7, the execution speed of SDMBS is compared to that of other methods at two resolution scales (320×240 and 640×480). For both resolution scales, SDMBS excels in terms of speed. As seen in Fig. 7, SDMBS runs at a speed comparable to FBS-ABL [42], while being more accurate, as seen in Fig. 6. This demonstrates the effective design strategy of SDMBS compared to other block-based techniques. Fig. 6 and 7 highlight the trade-off between detection performance and execution speed; as can be seen, extensively adaptable approaches have fast/practical execution at the cost of diminished performance.
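For reference, the three metrics can be computed from the pixel counts as in the following Python sketch, which uses the standard definitions of recall, precision, and F-measure; the function name `detection_scores` and the example counts are illustrative and are not taken from the reported tables.

```python
def detection_scores(tp, fp, fn):
    """Return (recall, precision, f_measure) from pixel-level counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure

# Hypothetical counts for one frame: 80 correct foreground pixels,
# 20 false alarms, and 20 missed foreground pixels.
re, pr, fm = detection_scores(tp=80, fp=20, fn=20)
```

True negatives do not enter any of the three metrics, which is why they remain informative even though background pixels dominate most frames.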
We achieve a 2.3x improvement in frame rate (FPS) over CS-MOG [26], a compressed-based background subtraction method used for WVSN. This shows an efficient decrease in processing time.

D. Sensor Network Simulation
This section illustrates the capability of the proposed system to work in WSN environments. First, simulation is carried out in the Cooja network simulator of Contiki OS [36], [37] to include the effects of lost packets and throughput. Second, the system is deployed on a real testbed using IOT-LAB [38]. Traffic trace files are used in both the real testbed and the simulated environment [15].

1) Cooja simulation:
Four sensor nodes are installed. The sink is located at the upper-left node (node 1) of the 100 m × 100 m square grid network area. The destination node is node 4. The simulation uses two datasets: the pedestrians and PETS2006 (baseline) videos. The detection of moving objects is carried out at the host to select the blocks containing moving objects, and the blocks are then sent and routed through intermediate nodes to the base station. The blocks received at the destination are reconstructed to show the moving target (Fig. 8). Fig. 9 shows the received image PSNR for two approaches: full transmission of the image frame (Full Tx), and our approach of transmitting only the important portion of the image containing the moving object (Partial Tx). Although the proposed approach has a lower PSNR than the full transmission approach, its average value is 27 dB. PSNR and energy are calculated as in [15].
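The paper computes PSNR following [15]; as a point of reference, the sketch below uses the standard PSNR definition for 8-bit images, which is assumed here to correspond to that calculation. The function name `psnr` and the toy frames are hypothetical.

```python
import math

def psnr(reference, received, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(reference, received)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf              # identical frames
    return 10 * math.log10(peak ** 2 / mse)

original = [[0, 0], [0, 0]]
worst_case = [[255, 255], [255, 255]]   # maximum possible error per pixel
same = psnr(original, original)
lowest = psnr(original, worst_case)
```

With partial transmission, the error comes from the background pixels that are taken from the older reference frame and from any lost packets, which is consistent with the lower PSNR reported for Partial Tx.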
The proposed approach was compared to the direct approach in the energy consumption analysis. In the direct approach, an entire image is transferred over multiple hops to the sink node. Fig. 10 depicts the energy consumed by both approaches on the aforementioned datasets. As can be observed, the node's energy usage is significantly reduced. The two datasets show that the suggested approach can be employed in real-time moving object detection systems, since a portion of the image data, including object information, is received at the sink node with an appropriate range of PSNR values and less energy.
2) IOT-LAB realization:
IoT-LAB² [38] is a large-scale WSN testbed that includes over 2000 wireless sensor nodes and a variety of processor architectures and wireless chips. IoT-LAB can be accessed through a web portal or by using the command-line tools. It allows users to retrieve experiment results and access serial ports on devices. Based on trace files as presented in [15], the IoT-LAB testbed M3 open nodes illustrated in Fig. 11(b) were employed in our experiments to replicate the intended object detection on the two datasets: pedestrians and PETS2006 (baseline). As shown in Fig. 11(a), the nodes m3-1, m3-10, m3-15, and m3-16 are used as senders, and m3-24 (blue circles) is used as a receiver to obtain varied loss rates. The sender node sends data packets according to the sender's trace file specifications (st-packet). The receiver node keeps track of the packets it receives in a receiver trace file (rt-packet), as shown in Fig. 11. The sequence numbers of correctly received packets are collected on the user's computer and used to reconstruct the video and calculate experiment metrics. Fig. 12 shows the results of applying the proposed moving object detection technique in IOT-LAB to the two datasets: pedestrians in the first row and PETS2006 in the second row.
The foreground blocks are transmitted and routed to the sink node. The sink node decompresses the received blocks and determines the moving object's location. For surveillance applications, object location is the most important piece of information that requires further analysis. The object ROI is transmitted to the sink node correctly with minimum network resources, memory, and bandwidth. The energy is minimised at an acceptable PSNR.

V. CONCLUSION
A background subtraction method that is both computationally efficient and accurate has been developed for object tracking under the limited resources of Wireless Visual Sensor Networks (WVSNs). To address the computation bottleneck of processing in constrained sensor networks, we use the partial DCT to reduce the data dimensions while preserving the information content. In addition, an energy-efficient block-based dual-mode SGM is used for foreground block detection: the image frame is divided into blocks, and only blocks containing foreground pixels are processed further in the refining stage, where the foreground pixels are determined and the moving object is located. In contrast to standard compressed sensing (CS), which compresses the entire frame, only the target region of interest (ROI) is compressed, communicated, and routed toward the sink node for further analysis in our proposed method. Our experimental results show that our method is as efficient as traditional algorithms. Moreover, it is up to three times faster than the state-of-the-art WVSN object detection methods, and 15% more accurate. For embedded camera networks, we demonstrate that our technique can accurately detect a moving object in real time. We applied the proposed detection method in a WSN environment using the Cooja simulator of Contiki OS and verified that the energy required for transmitting the detected object to the sink node is lower than that of comparable methods at acceptable PSNRs. Finally, the system was deployed on a real testbed using the M3 open nodes of IOT-LAB.
² https://www.iot-lab.info/