Hyperspectral Image Classification Using Unsupervised Algorithms

Hyperspectral imaging (HSI) collects and processes information from across the electromagnetic spectrum using a dedicated sensor. Its data provide a wealth of information that can be used to address a variety of problems in a number of applications. Hyperspectral image classification assigns every pixel in a digital image to a group. In this paper, unsupervised hyperspectral image classification algorithms are used to obtain a classified hyperspectral image. The two algorithms used are the Iterative Self-Organizing Data Analysis Technique Algorithm (ISODATA) and K-Means. Both are applied to a hyperspectral image of Washington DC, USA, using the ENVI tool. Performance is evaluated on the basis of an accuracy assessment of the process after applying Principal Component Analysis (PCA) followed by either K-Means or ISODATA. ISODATA is found to be more accurate than K-Means: the overall accuracy of the classification process is 78.3398% with K-Means and 81.7696% with ISODATA. In addition, the processing time needed to obtain the classified image increased as the number of iterations increased.
Keywords: hyperspectral imaging; unsupervised classification; K-Means algorithm; ISODATA algorithm; ENVI


INTRODUCTION
Remote sensing is the art and science of obtaining information about an object or area. It can be viewed as the measurement and analysis of electromagnetic radiation transmitted through, reflected from, or absorbed and dissipated by the atmosphere, the hydrosphere, and material at or near the land surface, for the purpose of interpreting and managing the Earth's resources and environment. Optical remote sensing uses visible, near-infrared, and short-wave infrared sensors to form images of the earth's surface by observing the solar radiation reflected from targets on the ground, as indicated in Fig. 1. Different materials reflect and absorb differently at different wavelengths. Thus, targets can be differentiated by their spectral reflectance signatures in remotely sensed images [1][2][3].
Hyperspectral sensors such as the Airborne Imaging Spectro-radiometer for Applications (AISA) enable the construction of an effective, continuous reflectance spectrum for every pixel in the scene. These spectra can be used to discriminate among earth surface features [1][2][4].
An analogy helps in grasping the hyperspectral imaging concept: the human eye sees visible light in three main bands (red, green, and blue, or RGB), whereas spectral imaging divides the spectrum into many more bands, covering the infrared and ultraviolet as well as the RGB bands (see Fig. 2) [5][6].
The "hyper" in "hyperspectral" means "over", as in "too many", and refers to the large number of wavelength bands. Hyperspectral imaging is spectrally detailed, meaning that it provides ample spectral information to identify and distinguish spectrally unique materials. It offers the potential for more precise and detailed information extraction than is possible with any other kind of remote sensing data [7]. This increased capability improves the chance of detecting materials of interest and supplies more of the information needed to recognize and classify them [8][9]. The HSI pixels form spectral vectors that represent the spectral characteristics of the materials in the scene [10].
Hyperspectral images have some limitations. One is image distortion resulting from the spherical shape of the earth, which introduces inaccuracies in properties such as direction, distance, and scale. Other distortions include shadows, for example a shadow covering the specific area to be studied, and variations in the brightness of the light on a specific area [11]. Pixel size is another limitation: a pixel may be relatively large, so that it contains many different properties and is difficult to classify, or very small, so that it does not contain enough characteristics to be classified [12]. The main purpose of classifying satellite imagery is to assess landscape properties accurately and extract the required information [13]. Classification algorithms are of two main types, unsupervised and supervised. Unsupervised classification is shown in Fig. 3 [14]. The classification chain in this work is unsupervised, and the classification algorithms used are K-Means and ISODATA.

A. K-Means Classifier
The K-Means algorithm is a straightforward procedure that partitions the data into K clusters, each represented by the mean of its members. The purpose of the K-Means algorithm is to minimize the within-cluster variability (see Fig. 4) [15][16][17]. The process of the K-Means algorithm is described in pseudo-code in [18].
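As an illustration of the assignment and update steps that K-Means alternates between, the following is a minimal sketch in plain Python. It is not the ENVI implementation or the pseudo-code of [18]: the function names, the random choice of initial centers, and the handling of empty clusters are assumptions made for this example. Each pixel is treated as a list of band values.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(pixels, k, max_iterations=3, seed=0):
    """Cluster pixels (lists of band values) into k classes.

    Returns (labels, centers). Initial centers are k distinct pixels
    chosen at random (an assumption; other schemes exist).
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(max_iterations):
        # Assignment step: each pixel joins the class with the nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in pixels
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(band) / len(members) for band in zip(*members)]
                )
            else:
                # Assumption: an empty class keeps its previous center.
                new_centers.append(centers[c])
        if new_centers == centers:
            break  # No center moved, so the assignment is stable.
        centers = new_centers
    return labels, centers
```

Calling `k_means(pixels, k=6, max_iterations=3)` would mirror the parameter values quoted in the case studies, although the results will differ from ENVI's because the initialization is not the same.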

B. Iterative Self-Organizing Data Analysis Technique Algorithm (ISODATA)
The ISODATA algorithm is one of the most widely used methods in unsupervised classification (see Fig. 5). The steps of ISODATA clustering are described in [19].
The unsupervised classification was applied to a hyperspectral image using the ENVI tool. The hyperspectral dataset used is an image of Washington DC. Two steps were applied to the hyperspectral image: Principal Component Analysis (PCA), followed by either the K-Means or the ISODATA algorithm. The result of applying K-Means or ISODATA is a classified image. The processing time needed to obtain the classified image increased as the number of iterations increased. In this work, statistical information was calculated from the classified image data. Both algorithms assigned every pixel in the image to a class other than the Unclassified class, which is taken here as an indication that they are accurate. ISODATA is more accurate than K-Means: the overall accuracy of the classification process is 81.7696% with ISODATA and 78.3398% with K-Means.
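The PCA step reduces the number of bands by projecting each pixel spectrum onto the directions of greatest variance. The paper does not describe how ENVI computes this, so the following is only a minimal sketch in plain Python of the leading principal component, found here by power iteration on the band covariance matrix. The function name and parameters are hypothetical, and a numerical library would normally be used to obtain all components at once.

```python
import math
import random


def first_principal_component(pixels, iterations=200, seed=0):
    """Return (component, scores) for the leading principal component.

    pixels: list of spectra, one list of band values per pixel.
    component: unit vector over the bands.
    scores: projection of each mean-centered pixel onto that vector.
    """
    n, bands = len(pixels), len(pixels[0])
    # Center each band on its mean.
    means = [sum(p[b] for p in pixels) / n for b in range(bands)]
    centered = [[p[b] - means[b] for b in range(bands)] for p in pixels]
    # Band-by-band sample covariance matrix.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1)
         for j in range(bands)]
        for i in range(bands)
    ]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(bands)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(bands)) for i in range(bands)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # All bands are constant; there is no variance to explain.
        v = [x / norm for x in w]
    scores = [sum(row[b] * v[b] for b in range(bands)) for row in centered]
    return v, scores
```

The `scores` list is a single-band image in principal-component space. Clustering the first few such components instead of all original bands is the dimensionality reduction that the PCA step provides.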

A. ENVI (Environment of Visualizing Images)
ENVI is an image processing system designed to process remotely sensed data. It provides comprehensive data visualization and analysis for images, and it can handle a broad set of scientific data formats [22][23].

B. Case Studies
• Comparison of the results of applying different RGB bands to the same hyperspectral image.
• Study of the effect of changing the number of iterations in the classification process on the classification accuracy.
• Comparison of the result of applying the K-Means algorithm with the result of applying the ISODATA algorithm.
• Comparison of the result of applying PCA followed by K-Means with the result of applying K-Means alone.

1) Case Study 1:
This case applies different RGB bands (see Table I) to the same hyperspectral image (Washington DC), followed by Principal Component Analysis (PCA) and K-Means. The results of applying PCA to the image using the test values in Table I are shown in Fig. 6. The number of classes is 6 and the number of iterations is 3; Fig. 7 shows the results of applying the K-Means algorithm to the PCA output images.
2) Case Study 2:
When the number of iterations used to classify a hyperspectral image is increased, the overall accuracy increases, and when the number of iterations is decreased, the overall accuracy decreases. This is confirmed by the experiments, which are summarized in Table II.
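Overall accuracy is conventionally computed from a confusion matrix as the percentage of reference pixels that fall on the diagonal. The sketch below assumes a square matrix in which rows are classified classes and columns are ground-truth classes, with the unsupervised classes already matched to the ground-truth classes. The function name and the sample matrix are made up for illustration and are not taken from the paper's tables.

```python
def overall_accuracy(confusion):
    """Percentage of pixels on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 100.0 * correct / total


# Hypothetical two-class example: 85 of 100 pixels are correctly classified.
example = [
    [50, 5],
    [10, 35],
]
print(overall_accuracy(example))  # 85.0
```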
3) Case Study 3:
This case applies two different algorithms, K-Means and ISODATA, to the same hyperspectral image (Washington DC). In both cases, as shown in Fig. 8, PCA is applied as a first step and then either K-Means or ISODATA as a second step. The selected bands are PC Band 172 for R, PC Band 86 for G, and PC Band 24 for B. The K-Means parameters are 6 classes and a maximum of 3 iterations. Fig. 8-a shows the result of applying the K-Means algorithm. The ISODATA parameters are a number of classes ranging from 4 to 6 and a maximum of 3 iterations.

C. Class Statistics
1) Calculating class statistics based on applying the K-Means algorithm:
Class statistics were calculated from the results of applying K-Means to the Washington DC image. Fig. 10-a shows the means for all classes and Fig. 10-b shows the standard deviations for all classes; both plot the value against the band number. Fig. 10-c shows the basic statistics (minimum, maximum, and mean) for each band for the Tree class. Fig. 10-d shows the standard deviation for the Tree class. Table III shows the class distribution summary and Table IV gives the confusion matrix computed against the ground truth image. The total class error is indicated in Fig. 11.
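Per-band class statistics of the kind plotted in Fig. 10 can be sketched as follows. This is an illustrative plain-Python version, not ENVI's routine: the function name is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the paper does not say which form ENVI reports.

```python
import statistics


def class_band_statistics(pixels, labels, target_class):
    """Minimum, maximum, mean, and standard deviation per band for one class.

    pixels: list of spectra (lists of band values).
    labels: class label of each pixel, in the same order.
    Returns one dict per band.
    """
    members = [p for p, lab in zip(pixels, labels) if lab == target_class]
    stats = []
    for band_values in zip(*members):
        stats.append({
            "min": min(band_values),
            "max": max(band_values),
            "mean": statistics.mean(band_values),
            # Sample standard deviation needs at least two values.
            "stdev": (statistics.stdev(band_values)
                      if len(band_values) > 1 else 0.0),
        })
    return stats
```

Plotting the `mean` or `stdev` entries against the band index for every class gives curves of the same kind as those described for Fig. 10-a and Fig. 10-b.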
2) Calculating class statistics based on applying the ISODATA algorithm:
Class statistics were calculated from the results of applying ISODATA to the Washington DC hyperspectral image. Fig. 12-a shows the means for all classes and Fig. 12-b shows the standard deviations for all classes; both plot the value against the band number. Fig. 12-c shows the basic statistics (minimum, maximum, and mean) for each band for the Tree class. Fig. 12-d shows the standard deviation for the Tree class. Table V shows the class distribution summary and Table VI gives the confusion matrix computed against the ground truth image. The total class error is indicated in Fig. 13.

V. FUTURE WORK
Further studies should examine the underground water and soil in different parts of the Arab region with a view to starting new wheat pivots. They should also identify regions that are not suitable for growing wheat, in order to save the money and effort that would be lost if the crop gets infected.

Fig. 8-b shows the result of applying ISODATA.
4) Case Study 4:
This case studies the effect of applying PCA on the classification result, as shown in Fig. 9-a, compared with applying only the K-Means algorithm without PCA, as shown in Fig. 9-b. The study is carried out on the hyperspectral image (Washington DC).

Fig. 11. Total Class Error (K-means)

IV. CONCLUSION
Hyperspectral images carry broad spectral information that can be used to identify and distinguish spectrally unique materials. Classifying a hyperspectral image means assigning objects to classes with homogeneous characteristics. In this work, unsupervised classification algorithms (K-Means and ISODATA) are used after applying Principal Component Analysis (PCA) in the ENVI tool. PCA is used before the classification process as a data analysis technique to reduce the dimensionality of the hyperspectral image. The algorithms are applied to a representative test site in the study area in Washington DC, USA. The overall accuracy was 78.3398% for the K-Means classification approach and 81.7696% for the ISODATA classification approach. Both K-Means and ISODATA give accurate results, but ISODATA gives the better result on the study area image.
Fig. 12. Class statistics based on applying ISODATA algorithm

TABLE I. CASE STUDIES USING DIFFERENT RGB BANDS

TABLE II. SUMMARY OF THE EXPERIMENTS OF CASE STUDY 2

TABLE IV. CONFUSION MATRIX USING K-MEANS ALGORITHM

TABLE V. CLASS DISTRIBUTION SUMMARY (ISODATA ALGORITHM)

TABLE VI. CONFUSION MATRIX USING ISODATA ALGORITHM