Noise and Restoration of UAV Remote Sensing Images

Remotely sensed images captured by a camera mounted on an unmanned aerial vehicle (UAV) are exposed to noise caused by internal factors, such as the UAV system itself, or external factors, such as atmospheric conditions. Such images need to be restored before they can undergo further processing. This study analyses the effects of salt and pepper noise on a UAV image and restores the image by removing the noise. A UAV image with red, green and blue channels, containing regions of different spectral properties, is corrupted with salt and pepper noise of different densities. The restoration procedure uses median filtering with variable filter sizes. Peak signal-to-noise ratio (PSNR) and mean square error (MSE) are used to measure image quality before and after restoration. The optimal filter size is the one that gives the highest PSNR of the restored image. The results show that the effects of noise on UAV images depend on the spectral properties of the image channels and the regions of interest. The proposed restoration works better for images with low-density noise than for those with high-density noise. The blue channel has the largest variation of optimal filter size, 18.5, compared with the other channels because of its high response to noise within its short spectral wavelength region. Landscape's vegetation has the largest variation of optimal filter size, 22, compared with the other regions due to the sensitivity of its dark spectral properties.

Keywords: Noise; restoration; remote sensing; UAV; PSNR


I. INTRODUCTION
Remote sensing technology has long been used for various purposes due to its spatial, spectral and temporal capabilities [1]. Initially, in the 1980s, satellite remote sensing was used to monitor various land covers continuously and at lower cost than traditional approaches. Frequently used remote sensing satellites include Landsat, SPOT, IKONOS and Quickbird. These satellites have been used worldwide in numerous applications for more than 30 years; nevertheless, satellite imagery is limited in spatial and temporal resolution. Moreover, the satellite systems are run by operators in developed countries, so users have no autonomy over them. In addition, images are sometimes unavailable for certain places and times, and are exposed to other crucial issues, particularly cloud and haze effects [2]. Later, in the 1990s, there were efforts to mount imaging systems on aircraft to capture images with higher spatial and temporal resolution, for example onboard the NASA Dryden DC-8 aircraft during the AirSAR PacRIM campaign; however, the operational costs were very high because they included aircraft system maintenance [3]. Beginning in the 2000s, unmanned aerial vehicles (UAVs) have been used to overcome the issues of satellite- and aircraft-based technology; however, UAVs were initially massive and expensive, and their owners were normally large organisations. In the 2010s, UAVs became smaller, lighter and affordable to many. Nowadays, a standard UAV is equipped with an RGB camera and can be navigated autonomously. UAVs are currently used in various sectors related to town planning, security, hazard monitoring, agriculture, environmental management and many more [4], [5]. Nevertheless, UAV-based remote sensing images tend to be exposed to noise from various internal or external factors [6]. 
Internal factors arise from the UAV system itself, including its electronic and mechanical aspects, while external factors are due to environmental issues such as haze, rain and fog. Such noise tends to modify the spectral properties and eventually degrades UAV images qualitatively and quantitatively [7]. Studies exist on noise removal from satellite images; however, UAV images have significantly different spatial and temporal characteristics due to the different altitude and revisit frequency [7], [11], [17]. Therefore, this study investigates the effects of salt and pepper noise on a UAV image, with the aim of proposing a noise removal procedure.

II. EVOLUTION OF REMOTE SENSING
Remote sensing satellites were initially used to monitor various land covers because of their capability to capture images of large-scale agricultural land continuously and at an affordable cost. Frequently used remote sensing satellites include Landsat, SPOT, IKONOS and Quickbird. These satellites have been used actively worldwide for more than 30 years. Remote sensing satellites were designed with multispectral sensors to enable efficient monitoring of various Earth resources; however, satellite imagery is limited in spatial and temporal resolution. Spatial resolution can be defined as the ability to separate details in an image. Technically, it is a measure of the smallest object that can be resolved by the sensor, or the ground area covered by the instantaneous field of view (IFOV) of the sensor [8]. Temporal resolution is a measure of the repeat cycle, or the frequency with which a sensor revisits the same part of the Earth's surface. This frequency is determined by the design of the satellite sensor and its orbit pattern. The spectral, spatial and temporal resolutions of different remote sensing satellites are given in Table I. In land cover monitoring, remote sensing satellites have sufficient spectral resolution to provide images of large areas efficiently; the drawbacks are in spatial and temporal resolution. Certain objects, such as building structures, road signs and crops, are far smaller than the IFOV of satellite sensors, so they cannot be detected or monitored. For example, crops such as paddy have very small leaves that carry important information about the condition of the plant, but these are undetectable in satellite images [9]. Example GeoEye satellite images of a paddy area in Kedah, in the northwest of Peninsular Malaysia, taken from Google Maps are shown in Fig. 1. 
The left image is at a scale of approximately 1:10,000, while the zoomed-in image of the same spot is on the right. Paddy leaves are still undetectable even after zooming in to the highest level (right) [10].
In terms of temporal resolution, the frequency of satellite image acquisition depends very much on the satellite orbit and altitude. It ranges from once every 3 days to once every 16 days. The date and time of satellite overpass are fixed, and the satellites are run by operators in developed countries, so users have no autonomy over them. Images are sometimes unavailable for certain places and times, and are exposed to other crucial issues, especially cloud and haze effects [7]. Here we display visibility data from Petaling Jaya, Selangor, Malaysia, to demonstrate haze occurrence in Malaysia [11]. Fig. 2 shows a plot of daily visibility from 1999 to 2008. White, yellow, green, violet and red indicate clear (above 10 km visibility), moderate (5 to 10 km), hazy (2 to 5 km), very hazy (0.5 to 2 km) and extremely hazy (less than 0.5 km) conditions respectively (Table II). For most years, a drop in visibility can be observed at the end of the year, indicating increased haze.
Since the 2010s, UAVs have become smaller and affordable to many. Nowadays, a standard UAV such as the DJI Mavic Pro measures 83 mm x 83 mm x 198 mm (height x width x length), weighs only 743 g and is equipped with an RGB camera. Such UAVs are currently used in various applications related to surveying and planning, agriculture management, security and many more [12]. In terms of spatial resolution, certain objects or targets are only millimetres to centimetres in size. To meet this requirement, the spatial resolution of a standard UAV can be varied by changing the flying altitude. In terms of temporal resolution, however, a standard UAV is limited by its endurance, or flying duration per battery, and by the ability of the propeller motors to withstand the heat produced by frequent and robust flying. A standard UAV can fly for only 20 to 30 minutes per battery. In addition, a standard UAV has a fixed imaging system that cannot be changed or customised. To address these issues, we have developed an improved UAV known as the Personal Remote Sensing System (PRSS). With PRSS, images can be captured automatically based on pre-set flying waypoints. PRSS battery and motor usage have also been improved so that flying time can reach 40 to 50 minutes. The mounted imaging system can be changed and customised according to users' needs. PRSS also aims to tackle the delay and cost issues of traditional remote sensing data, improve on the spatial and spectral resolution of traditional remote sensing systems, and provide a user-friendly remote sensing system that can be used by anyone, at any place and at any time (Fig. 3). The system consists of a quadrotor UAV, a laptop as a processing unit and a smartphone for controlling and tracking. The quadrotor UAV carries an RGB camera or any other imaging system to suit the user's needs. 
The UAV is equipped with GPS and telemetry facilities for tracking and controlling purposes. The camera is chosen to have GPS capabilities to provide a geographical location to the captured images. The acquired images are stored in the laptop and ready to undergo subsequent processing and analysis tasks in various applications. Fig. 4 shows the conceptual implementation of the PRSS.

III. NOISE AND RESTORATION
We can consider a noisy image to be modelled as follows [13]:
$$g(x,y) = h(x,y) * f(x,y) + \eta(x,y)$$
where $f(x,y)$ is the original image pixel, $h(x,y)$ is the degradation function, $*$ denotes convolution, $\eta(x,y)$ is the noise term and $g(x,y)$ is the resulting noisy pixel. The objective of restoration is to obtain an estimate, $\hat{f}(x,y)$, of the original image $f(x,y)$. The model of image degradation and restoration is illustrated in Fig. 5 [13]. The model indicates that if the noise in an image can be estimated, then it is possible to work out how the image can be restored. In this study, impulse noise is chosen for image degradation. There are three types of impulse noise: salt noise, pepper noise, and salt and pepper noise. In an 8-bit image, salt noise is added by placing random bright pixels (pixel value 255) all over the image, whereas pepper noise is added by placing random dark pixels (pixel value 0) all over the image. Salt and pepper noise combines both, adding random bright (255) and random dark (0) pixels all over the image [15]. In other words, salt and pepper noise is a type of impulse noise that appears as random black and white dots in an image. This type of noise arises from sharp and sudden changes in pixel brightness. Consequently, an image affected by salt and pepper noise contains obvious dark pixels in bright regions and bright pixels in dark regions [14]. Fig. 6 shows a test pattern image and the same image after salt and pepper noise is added, together with their corresponding histograms [13]. In this study, salt and pepper noise is simulated on a UAV image and its effects are assessed via PSNR and MSE. 
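The salt and pepper corruption described above can be illustrated with a minimal sketch. It is not the procedure used in the study, which was carried out in MATLAB; the function name `add_salt_and_pepper`, the list-of-rows image representation and the equal split between salt and pepper are all assumptions made here for illustration.

```python
import random

def add_salt_and_pepper(image, density, seed=None):
    """Return a noisy copy of an 8-bit single-channel image (list of rows).

    Each pixel is corrupted with probability `density`. A corrupted pixel
    becomes 255 (salt) or 0 (pepper); the 50/50 split is an assumption.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in image]  # copy so the clean image is untouched
    for r, row in enumerate(noisy):
        for c in range(len(row)):
            if rng.random() < density:
                noisy[r][c] = 255 if rng.random() < 0.5 else 0
    return noisy

# Usage: corrupt a flat mid-grey 100 x 100 patch at noise density 0.3.
clean = [[128] * 100 for _ in range(100)]
noisy = add_salt_and_pepper(clean, density=0.3, seed=1)
corrupted = sum(p != 128 for row in noisy for p in row)
print(f"fraction of corrupted pixels: {corrupted / 10000:.3f}")
```

The printed fraction should be close to the requested density; it is not exact because each pixel is corrupted independently.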
For restoration, median filtering is used because it replaces the grey level of each pixel by the median of the grey levels in a neighbourhood of that pixel [16]. Median filtering is an order-statistics filtering process that produces a restored image given by:
$$\hat{f}(x,y) = \underset{(s,t) \in S_{xy}}{\operatorname{median}} \{ g(s,t) \}$$
where $\hat{f}(x,y)$, the filtered image, depends on the ordering of the pixel values of the noisy image $g(s,t)$ in the filter window $S_{xy}$ centred at $(x,y)$. For a higher density of salt and pepper noise, the neighbourhood that forms the window of the median filter can be enlarged by increasing the filter size so that the noise is removed effectively [17]. In real remote sensing applications, such noise tends to cause errors in subsequent tasks such as land cover classification, object and feature detection and land surveying [18], [19], [20], [21], [22], [23].
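A median filter of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `median_filter` is hypothetical, the window is clipped at the image border (the source does not say how borders were treated), and for an even number of samples in a clipped window the upper of the two middle values is taken.

```python
def median_filter(image, size):
    """Apply a size x size median filter (size odd) to a list-of-rows image.

    Assumption: near the border the window is clipped to the image, so edge
    pixels are replaced by the median of fewer neighbours.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    rows, cols = len(image), len(image[0])
    restored = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            # Middle element of the ordered window (upper median if even).
            restored[r][c] = window[len(window) // 2]
    return restored

# Usage: one salt pixel and one pepper pixel on a flat patch are removed.
patch = [[100] * 5 for _ in range(5)]
patch[2][2] = 255  # salt
patch[0][0] = 0    # pepper
restored = median_filter(patch, 3)
print(restored[2][2], restored[0][0])  # prints: 100 100
```

Because the impulses are isolated, every 3 x 3 window still contains a majority of uncorrupted values, so the median restores the original grey level. At higher densities this no longer holds, which is why larger windows are needed.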
To measure the quality of a corrupted image with respect to the noise-free image, peak signal-to-noise ratio (PSNR) can be used. PSNR is an engineering term for the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation. Because images have different dynamic ranges, PSNR is usually expressed on the logarithmic decibel scale. The PSNR of a noisy image can be expressed as:
$$\mathrm{PSNR}_{g} = 10 \log_{10}\left(\frac{255^{2}}{\mathrm{MSE}_{g}}\right), \qquad \mathrm{MSE}_{g} = \frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\left[f(x,y) - g(x,y)\right]^{2}$$
where $\mathrm{MSE}_{g}$ is the mean squared error of the noisy image, $f(x,y)$ is the noise-free image, $g(x,y)$ is the noisy image, 255 is the maximum pixel value of an 8-bit image, and $m$ and $n$ are the number of rows and columns of the image respectively. In the same way, the PSNR of a restored image can be expressed as:
$$\mathrm{PSNR}_{\hat{f}} = 10 \log_{10}\left(\frac{255^{2}}{\mathrm{MSE}_{\hat{f}}}\right), \qquad \mathrm{MSE}_{\hat{f}} = \frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\left[f(x,y) - \hat{f}(x,y)\right]^{2}$$
where $\mathrm{MSE}_{\hat{f}}$ is the mean squared error of the restored image, $f(x,y)$ is the noise-free image and $\hat{f}(x,y)$ is the restored image. These concepts are adopted to understand the effects of salt and pepper noise on a UAV image and to formulate its restoration procedure.
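The two quality measures can be computed as in the following sketch. It assumes an 8-bit image with a peak value of 255 and images stored as lists of rows; the function names `mse` and `psnr` are illustrative, not taken from the study.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized list-of-rows images."""
    rows, cols = len(reference), len(reference[0])
    total = sum(
        (reference[r][c] - test[r][c]) ** 2
        for r in range(rows)
        for c in range(cols)
    )
    return total / (rows * cols)

def psnr(reference, test, peak=255):
    """PSNR in decibels; peak=255 assumes 8-bit pixel values."""
    error = mse(reference, test)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / error)

# Usage: two of four pixels differ by 10 grey levels, so MSE = 50.
reference = [[100, 100], [100, 100]]
test = [[100, 110], [100, 90]]
print(mse(reference, test))             # prints: 50.0
print(f"{psnr(reference, test):.2f}")   # prints: 31.14
```

The same two functions serve for both cases in the text: pass the noisy image to obtain the PSNR before restoration, or the filtered image to obtain the PSNR after restoration.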

IV. MATERIALS AND METHODS
In this study, the experiment site is the main campus of Universiti Teknikal Malaysia Melaka (UTeM), located in Melaka, Malaysia. The main data come from PRSS imagery, while ancillary data come from Google Maps. Fig. 7 shows the location of the study site, the grand hall of UTeM, on the map of Malaysia and its close-up from Google Maps. The main imagery was captured using a Canon PowerShot S100 camera mounted on the PRSS; the image acquisition set-up is illustrated in Fig. 4. The image was acquired on 27 March 2016 at 9.37 am. It was a sunny day and the sky was clear. Mission Planner software was used to plan the waypoints for PRSS navigation. Before take-off, the PRSS was calibrated by turning it about the x, y and z axes, which represent "roll", "pitch" and "yaw" respectively. This ensured that the PRSS obtained a good set of roll, pitch and yaw tuning parameters for stable and accurate flight navigation. Next, the PRSS was "armed" and "disarmed" several times to ensure that the remote control was working properly. The PRSS was then launched and the navigation was set to "automatic" so that it followed the pre-set waypoints autonomously. The flying and battery conditions were monitored closely using the Mission Planner software. At the same time, images were captured automatically based on the pre-set timing of the camera, allowing image acquisition along the waypoints for 40 to 50 minutes. Upon completing the mission, the PRSS was landed using the "return to home" function. Once it had landed safely, the PRSS was "disarmed" to power it off, and the four rotors and battery were removed.
The acquired images, saved on the camera's storage card, were downloaded to the laptop for processing and analysis. MATLAB software was used for image processing and analysis. Initially, images that contain important landmarks were sorted, and an image containing the grand hall of UTeM was chosen because it met the criteria needed for this study. Fig. 8 shows UTeM's grand hall captured from the PRSS, while the attributes and corresponding metadata of the image are given in Table III. From this image, four regions of 100 rows by 100 columns were subsetted: (a) grand hall's roof surface, (b) landscape's vegetation, (c) road surface and (d) balcony's roof surface, representing the bright, dark, moderately dark and moderately bright conditions respectively. This enabled the effects of noise on different image conditions to be investigated. The investigation was performed by simulating salt and pepper noise on these subsetted images, for noise densities ranging from 0.1 to 0.9. The PSNR and MSE of the degraded images were determined. For image restoration, median filtering with different filter sizes was systematically applied. The size that produces the restored image with the highest PSNR is chosen as the optimal size for the particular noise density. This is based on the fact that the higher the PSNR, the higher the quality of the restored image and the less noise remains. The process was repeated for the rest of the images with other noise densities. The flowchart of the process is given in Fig. 9.

The regions subsetted from Fig. 8 clearly possess different brightness conditions due to the different spectral properties of the materials [24]. Graphs of MSE versus noise density were then plotted to investigate the relationship between them for the red, green and blue channels and for each of the regions. Fig. 11 shows the MSE for (a) grand hall's roof surface, (b) landscape's vegetation, (c) road surface and (d) balcony's roof surface. For all regions, MSE increases as noise density increases. For the grand hall's roof surface, the separation between the MSE of the red, green and blue channels grows as noise density increases. The blue channel gives a higher MSE than the red and green channels for all noise densities. At 0.1 noise density, the MSE is approximately 2000, while at 0.9 noise density it ranges from 14000 to 20000 across the channels. A similar trend can be seen for the road surface, although with closer separation between the channels, with an MSE of 2000 at 0.1 noise density and 16000 to 18000 at 0.9 noise density. A different trend is shown for landscape's vegetation and balcony's roof surface, where the curves are very close to each other, with an MSE of approximately 4000 to 25000 and 2500 to 16000 respectively between 0.1 and 0.9 noise density.

Since MSE and PSNR are interrelated, graphs of PSNR versus noise density were then plotted to investigate the relationship between them for the red, green and blue channels and for each of the regions. Fig. 12 shows PSNR versus noise density for (a) grand hall's roof surface, (b) landscape's vegetation, (c) road surface and (d) balcony's roof surface. The colour of each curve indicates the PSNR for the red, green or blue channel of the image for each region. For the grand hall's roof surface, PSNR clearly decreases for all channels as noise density increases. At 0.1 noise density, PSNR is below 16 for all channels, while at 0.9 noise density it is above 4. There is an obvious separation between the PSNR curves, with the PSNR for the red channel higher than for the green and blue channels. For landscape's vegetation, the PSNR curves are close to each other. At 0.1 noise density, the PSNR for all channels is below 14, while at 0.9 noise density it is approximately 4. 
For the road surface, the separation between the PSNR curves is smaller than for the grand hall's roof surface. At 0.1 noise density, the PSNR for all channels is below 16, while at 0.9 noise density it is approximately 6. For the balcony's roof surface, the PSNR curves are very close to each other. At 0.1 noise density, the PSNR for all channels is approximately 16, while at 0.9 noise density it is above 6.
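As a rough consistency check on the road surface figures quoted above, the reported MSE values can be converted to PSNR. The conversion assumes the usual 8-bit definition with a peak value of 255, which is an assumption here because the exact expression behind the plots is not shown, and the MSE values are approximate readings from the text.

```latex
\begin{align*}
\mathrm{PSNR} &= 10\log_{10}\!\left(\frac{255^{2}}{\mathrm{MSE}}\right) \\
\mathrm{MSE} \approx 2000:\quad \mathrm{PSNR} &\approx 10\log_{10}(32.5) \approx 15.1 \\
\mathrm{MSE} \approx 16000:\quad \mathrm{PSNR} &\approx 10\log_{10}(4.06) \approx 6.1
\end{align*}
```

Under that assumption the two sets of readings agree: a PSNR just below 16 at 0.1 noise density and approximately 6 at 0.9 noise density.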
The outcomes of the analysis in terms of MSE and PSNR for the noisy images have been presented above. Next, these noisy images were restored using median filtering. Graphs of PSNR versus filter size were plotted for each noise density, each channel and each region. This allows the optimal filter size to be identified systematically. The variation of the optimal filter size as noise density increases is then analysed for each channel and each region. Fig. 13 shows PSNR versus filter size for the grand hall's roof surface for selected noise densities: (a) 0.1, (b) 0.2, (c) 0.3 and (d) 0.8 in the red, green, blue and red channels respectively. The filter size at the maximum PSNR, indicated by the highest peak and marked with a dotted red line, is taken as the optimal filter size for the particular noise density and channel. Fig. 14 shows the optimal filter size for the grand hall's roof surface in the (a) red, (b) green and (c) blue channels. For the red channel, the filter size varies from 3 to 15, with a gradual increase from 0.1 to 0.6 noise density and a rapid increase from 0.6 to 0.9 noise density. For the green channel, the filter size varies from 3 to 25, with a steady increase throughout the noise densities. For the blue channel, the trend is similar to that of the green channel. For the grand hall's roof surface, the blue and green channels require larger filters than the red channel. This is due to the shorter wavelengths of the green and blue channels, which experience more significant degradation than the red channel with its longer wavelengths and therefore require a larger filter size. Fig. 15 shows the optimal filter size for landscape's vegetation in the (a) red, (b) green and (c) blue channels. For all channels, there is a steady increase in filter size from 3 to 25 as noise density increases from 0.1 to 0.9. However, there is a sudden drop in filter size at 0.4 noise density for the green channel. 
For landscape's vegetation, the degradation is unlikely to be affected by the different wavelengths (corresponding to the different channels). This may be due to the very dark spectral properties of the vegetation, which can somewhat compensate for the effects of salt and pepper noise and therefore lead to a similar filter size trend for all channels. Fig. 16 shows the optimal filter size for the road surface in the (a) red, (b) green and (c) blue channels. For the red channel, the filter size varies from 3 to 19, with a slow increase from 0.1 to 0.6 noise density but a faster increase from 0.6 to 0.9 noise density. For the green channel, the filter size varies from 3 to 21, with a steady increase from 0.1 to 0.6 noise density but a more rapid increase from 0.6 to 0.9 noise density. Compared with the grand hall's roof and landscape's vegetation, here the filter size starts at 5 instead of 3 and does not change until 0.4 noise density. This indicates that the green channel is sensitive to low-density noise, but the effects of the noise do not change much until the noise reaches a moderate level. The filter size increases at a constant rate from 0.6 to 0.8 noise density and much faster from 0.8 to 0.9 noise density, indicating that the effects of noise are more significant at moderate noise densities and much more significant at higher ones. For the blue channel, the filter size varies from 3 to 25, with a steady increase from 0.1 to 0.6 but a faster increase from 0.6 to 0.9. For the road surface, the shorter the wavelength of the channel, the greater the effect of the salt and pepper noise and therefore the larger the filter size required. This may be due to the moderately bright properties of the road surface, which accentuate the effects of salt and pepper noise. Fig. 17 shows the optimal filter size for the balcony's roof surface in the (a) red, (b) green and (c) blue channels. 
For the red channel, the filter size varies from 3 to 17, with a steady increase from 0.1 to 0.6 noise density but a more rapid increase from 0.6 to 0.9 noise density. For the green channel, the filter size varies from 3 to 21, with a gradual increase from 0.1 to 0.5 but a rapid increase from 0.5 to 0.9 noise density. A similar trend is shown by the blue channel. For the balcony's roof surface, the green and blue channels, which have shorter wavelengths, are more affected by the salt and pepper noise and therefore require a larger optimal filter size than the red channel, which has longer wavelengths.
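The selection of an optimal filter size by maximum PSNR, as used throughout this analysis, can be sketched as follows. This is a hedged illustration rather than the study's MATLAB implementation: the helper names, the clipped-border median filter, the synthetic 20 x 20 gradient patch and the candidate sizes 3 to 25 (odd only) are assumptions made for the example.

```python
import math
import random

def add_noise(img, density, rng):
    """Salt and pepper noise; equal salt/pepper split is an assumption."""
    return [[(255 if rng.random() < 0.5 else 0) if rng.random() < density else p
             for p in row] for row in img]

def median_filter(img, size):
    """Median filter with the window clipped at the image border."""
    half, rows, cols = size // 2, len(img), len(img[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = sorted(img[rr][cc]
                            for rr in range(max(0, r - half), min(rows, r + half + 1))
                            for cc in range(max(0, c - half), min(cols, c + half + 1)))
            out_row.append(window[len(window) // 2])
        out.append(out_row)
    return out

def psnr(ref, test, peak=255):
    """PSNR in decibels for 8-bit images stored as lists of rows."""
    n = len(ref) * len(ref[0])
    err = sum((a - b) ** 2 for ra, rb in zip(ref, test) for a, b in zip(ra, rb)) / n
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)

def optimal_filter_size(clean, noisy, sizes=range(3, 26, 2)):
    """Return the size with the highest restored PSNR, plus all scores.

    Ties are resolved in favour of the smallest size.
    """
    scores = {s: psnr(clean, median_filter(noisy, s)) for s in sizes}
    best = max(scores, key=scores.get)
    return best, scores

# Usage on a small synthetic gradient patch (not real UAV data).
rng = random.Random(0)
clean = [[60 + 4 * r + 2 * c for c in range(20)] for r in range(20)]
noisy = add_noise(clean, 0.3, rng)
best, scores = optimal_filter_size(clean, noisy)
print("PSNR before restoration:", round(psnr(clean, noisy), 2))
print("optimal filter size:", best, "PSNR:", round(scores[best], 2))
```

Repeating the search for each noise density, channel and region would give curves of optimal filter size against noise density of the kind discussed above. The small patch keeps the pure-Python loop fast; a 100 x 100 region would be processed the same way.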
So far, the outcomes of the quantitative analyses have been presented and discussed. For qualitative analysis, selected samples of images before and after restoration are displayed side by side so that the restoration performance can be validated by visual inspection. We purposely chose three images with low noise density (≤ 0.5) and one with high noise density (> 0.5) so that representative outcomes are shown. Fig. 18 shows the noisy and restored images for selected samples: (a) grand hall's roof surface, (b) landscape's vegetation, (c) road surface and (d) balcony's roof surface, with noise densities of 0.2, 0.3, 0.5 and 0.8 respectively. For the low noise densities in (a), (b) and (c), the restoration clearly works well and almost all noise is removed. For (d), most of the noise is removed, but there appears to be a loss of information, indicated by bright and dark patches, within the balcony's roof surface image. The qualitative analysis shows that the restoration works better for images with low-density than with high-density salt and pepper noise. To examine the overall trend of filter size variation for all regions and channels, the minimum, maximum and variation of the optimal filter sizes were tabulated in a single table. Table IV shows the minimum, maximum and variation of the filter size for the red, green and blue channels for regions A, B, C and D, representing grand hall's roof surface, landscape's vegetation, road surface and balcony's roof surface respectively. R min, G min and B min are the minimum filter sizes in the red, green and blue channels respectively. R max, G max and B max are the maximum filter sizes in the red, green and blue channels respectively. R var., G var. and B var. are the variations of the filter sizes in the red, green and blue channels respectively, avrg. is the average value of the respective components, while var. avrg. is the average of R var., G var. and B var.

It can be seen that the averages of R min and B min are both 3, the smallest values, while the average of B max is 21.5, the largest. The variation between the averages of B min and B max, B var., is 18.5, which is the highest. This indicates that the filter size for the blue channel changes readily as noise density changes. Thus the blue channel is the most sensitive to noise compared with the red and green channels. This is because the blue channel has a higher ability to capture noise effects, hence noise in the blue channel has a higher visibility than in the red and green channels. In terms of average filter variation, landscape's vegetation has the highest variation of 22, signifying that its very dark spectral properties are easily influenced by the effects of noise, and thus it has the highest sensitivity to noise compared with brighter regions.

V. CONCLUSION
In this study, we have applied salt and pepper noise of different densities to the red, green and blue channels of a UAV image containing regions with different spectral properties. Image restoration has been performed using median filtering with different filter sizes. The optimal filter size has been chosen based on the highest PSNR of the restored image. The results show that the effects of noise on a UAV image, and the optimal size of a median filter for image restoration, depend on the spectral properties of the channels and regions of interest. The blue channel is found to have the highest response to noise due to its shorter spectral wavelengths compared with the red and green channels, while landscape's vegetation is more sensitive to noise than the grand hall's roof surface, road surface and balcony's roof surface due to its very dark spectral properties, which make it easily influenced by noise effects. 
For image restoration, the optimal median filter size generally increases as noise density increases, at different rates of variation. The restoration works better for images with low-density than with high-density salt and pepper noise. The filter size for the blue channel has the biggest variation and is the largest at the highest noise density, due to its higher response to noise effects compared with the red and green channels. Darker regions require larger filter sizes than brighter regions as noise density increases, due to their higher sensitivity to the presence of noise.