Maximum Likelihood Classification based on Classified Result of Boundary Mixed Pixels for High Spatial Resolution of Satellite Images

Maximum Likelihood Classification (MLC) based on the classified result of boundary Mixed Pixels (Mixels) is proposed for high spatial resolution remote sensing satellite images and evaluated with Landsat Thematic Mapper (TM) images. The optimum threshold differs between TM and Multi Spectral Scanner (MSS) data. This may be because the TM spatial resolution is 2.7 times finer than that of MSS, and consequently TM imagery has more spectral variability within a class. The increase in spectral heterogeneity within a class and the higher number of channels used in the classification process may also play a significant role. For example, the optimum threshold for classifying an agricultural scene using MSS data is about 2.5 standard deviations, while that for TM corresponds to more than four standard deviations. This paper compares the optimum threshold between MSS and TM and suggests a method of using unassigned boundary pixels to determine the optimum threshold. Further, it describes the relationship of the optimum threshold to the class variance with a full illustration using TM data. The experimental conclusions suggest to the user some systematic methods for obtaining an optimal classification with MLC.
Keywords: Maximum likelihood classification; optimum threshold; Landsat TM; MSS; Mixed Pixel; spatial resolution


I. INTRODUCTION
Nowadays, high spatial resolution remote sensing satellite imagery is available. The finest resolution at present is 30 cm, provided by very high-resolution commercial satellites. For instance, a 50 cm image from the Pleiades satellite (Airbus) clearly shows buildings, small boats, and narrow streets. Other examples include 50 cm imagery from SuperView-1 and 40 cm imagery from Kompsat-3A.
One problem in using these high spatial resolution satellite images is comparatively poor classification performance. In general, the variance of a given class in feature space increases as the spatial resolution becomes finer, which results in poor classification performance. Another difficulty is determining the optimum threshold for discriminating between classes in the well-known maximum likelihood classification, or in other classification methods such as Support Vector Machines and deep-learning-based methods. This paper addresses the latter problem.
Supervised maximum likelihood classification of each pixel based on a multidimensional normal distribution is widely used as a classification method for remote sensing data. In this method, the analyst first sets a training area for each class of interest, and each class is defined by its probability distribution estimated from the samples in that area. Then, using the likelihood of the pixel to be classified under each distribution as an index, the pixel is classified into the class showing the highest likelihood.
However, if the highest likelihood is not sufficiently large, it is better not to force the pixel into one of the set classes, and the pixel is assigned to an unset class. The likelihood value used to decide whether to assign a pixel to the unset class is defined here as the threshold. In general, the threshold is determined empirically by repeated trial and error, with the analyst judging the classified images subjectively. Therefore, finding the optimal threshold takes a lot of time, and the result lacks objectivity.
The optimal threshold decision method proposed in this paper extracts the boundary pixels between different class regions, performs maximum likelihood classification with a given threshold on those pixels, and counts the number of pixels classified into the unset class. The threshold at which this count is half the number of boundary pixels is taken as optimal. In this method, the region to be classified for obtaining the optimum threshold is limited to the boundary pixels, and therefore the optimum threshold can be determined objectively in a short time.
This paper first describes related research works, followed by the definition of the optimal threshold together with the research background. Then, the optimal thresholds of Landsat TM and MSS data, which have different Instantaneous Fields of View (IFOV), are obtained subjectively by the conventional method, and the two are compared to clarify the relationship between the optimal threshold and the IFOV. Finally, the proposed method is explained, and its validity is shown using TM data by confirming that its result matches the optimal threshold value obtained by the conventional method.

II. RELATED RESEARCH WORKS
Classification by re-estimating statistical parameters based on an auto-regressive model is proposed for purification of training samples [1]. Meanwhile, multi-temporal texture analysis in TM classification is proposed for high spatial resolution optical sensor images [2]. On the other hand, Maximum Likelihood (MLH) TM classification considering pixel-to-pixel correlation is proposed [3].
25 | Page www.ijacsa.thesai.org
Supervised TM classification with a purification of training samples is proposed [4], together with TM classification using local spectral variability [5]. A classification method with spatial spectral variability is also proposed [6], together with TM classification using local spectral variability [7].
Application of inversion theory for image analysis and classification is proposed [8]. Meanwhile, polarimetric Synthetic Aperture Radar (SAR) image classification with maximum curvature of the trajectory in eigen space domain on the polarization signature is proposed [9]. On the other hand, a hybrid supervised classification method for multidimensional images using color and textural features is proposed [10].
Polarimetric SAR image classification with high frequency component derived from wavelet multi resolution analysis: MRA is proposed [11]. Comparative study of polarimetric SAR classification methods including proposed method with maximum curvature of trajectory of backscattering cross section in ellipticity and orientation angle space is conducted and well reported [12].
Human gait gender classification using 2D discrete wavelet transforms energy is proposed [13] together with human gait gender classification in spatial and temporal reasoning [14].
Comparative study on discrimination methods for identifying dangerous red tide species based on wavelet utilized classification methods is conducted [15]. On the other hand, multi spectral image classification method with selection of independent spectral features through correlation analysis is proposed [16].
Image retrieval and classification method based on Euclidean distance between normalized features including wavelet descriptor is proposed [17]. Meanwhile, gender classification method based on gait energy motion derived from silhouettes through wavelet analysis of human gait moving pictures is proposed [18], together with a human gait skeleton model acquired with a single side video camera and its application and implementation for gender classification [19].
Gender classification method based on gait energy motion derived from silhouette through wavelet analysis of human gait moving pictures is proposed [20] together with human gait gender classification using 3D discrete wavelet transformation feature extraction [21].
Wavelet Multi-Resolution Analysis (MRA) and its application to polarimetric SAR classification is proposed [22]. On the other hand, object classification using a deep convolutional neural network and its application to myoelectric hand control is proposed and its effectiveness is evaluated [23]. In addition, image classification considering a probability density function based on a simplified beta distribution is proposed and validated with remote sensing satellite imagery data [24].

A. Definition of Optimum Threshold for Maximum Likelihood Classification
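The log likelihood function gi(x) of the observation vector x with respect to class i in the maximum likelihood classification [25] can be expressed by Eq. (1), written here in its standard form for a multidimensional normal distribution, with equal prior probabilities assumed and the constant term omitted:

$$g_i(x) = -\frac{1}{2}\ln\det\Sigma_i - \frac{1}{2}(x-\mu_i)^{t}\,\Sigma_i^{-1}\,(x-\mu_i) \qquad (1)$$

where μi and Σi are the mean vector and the covariance matrix of class i, det is the determinant, and t and −1 denote the transpose and the inverse matrix, respectively. x is classified into the class i for which gi(x) is maximum. If this maximum gi(x) does not exceed the threshold value θ, x is classified into the unset class. Note that θ is usually defined in terms of the likelihood.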
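The decision rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: it assumes the standard Gaussian log likelihood with equal priors and no constant term, and the names `cholesky`, `log_likelihood`, `classify`, and `UNSET` are hypothetical. A Cholesky factorization is used so that both ln det Σi and the quadratic form are obtained without an explicit matrix inverse.

```python
import math

UNSET = -1  # hypothetical label for the unset class


def cholesky(cov):
    """Lower-triangular L with cov = L L^t (cov must be symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(cov[r][r] - s)
            else:
                low[r][c] = (cov[r][c] - s) / low[c][c]
    return low


def log_likelihood(x, mean, cov):
    """g_i(x) = -1/2 ln det(cov) - 1/2 (x - mean)^t cov^-1 (x - mean)."""
    low = cholesky(cov)
    n = len(x)
    # Forward substitution solves L y = (x - mean); the quadratic form is then y . y
    y = []
    for r in range(n):
        s = sum(low[r][k] * y[k] for k in range(r))
        y.append((x[r] - mean[r] - s) / low[r][r])
    d2 = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(low[r][r]) for r in range(n))
    return -0.5 * log_det - 0.5 * d2


def classify(x, classes, theta):
    """classes: list of (mean, cov). Returns the index of the class with the
    largest g_i(x), or UNSET when that value does not exceed theta."""
    scores = [log_likelihood(x, mean, cov) for mean, cov in classes]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > theta else UNSET
```

With this rule, raising θ can only move pixels from a set class to the unset class, never the other way round.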
Suppose now that classes i and j and an unset class are considered, as shown in Fig. 1. In the figure, the Pu terms (Pu1 onward) are the probabilities of being classified into the unset class, Pci and Pcj are the probabilities of taking pixels that originally belong to the set class into classes i and j, and the remaining two terms are the probabilities of classifying pixels that originally belong to class j into class i and pixels of class i into class j. Pi and Pj are the correct answer rates for classifying pixels belonging to classes i and j into classes i and j, respectively. Here, the correct answer rate Pc, the misclassification probability Pe, and the probability Pu of classification into the unset class (that is, of a pixel being left unclassified because the confidence level of the classification result is low) are defined by the following equations.

Pc = Pi + Pj (2)
When the threshold value θ is increased, Pc and Pe decrease, and Pu increases. The amount of the increase or decrease depends on the probability density functions of classes i and j and on the distance between the classes. The threshold that maximizes (aPc − bPe − cPu) is defined as the optimal threshold θopt. Here, a, b, and c are weighting factors for each probability and are determined subjectively by the analyst. Therefore, θopt is obtained by repeated trial and error, with the analyst subjectively judging the classification results for several threshold values (this is called the conventional method here).
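As a rough illustration of this definition, the following Python sketch evaluates the weighted objective aPc − bPe − cPu over a list of candidate thresholds and returns the maximizer. It assumes, purely for illustration, that labeled validation pixels are available as tuples of (best log likelihood, predicted class, true class); the function names `rates` and `optimal_threshold` are hypothetical, and in the conventional method the analyst judges classified images by eye rather than computing these rates.

```python
def rates(samples, theta):
    """samples: list of (best_score, predicted_class, true_class).
    Returns (Pc, Pe, Pu) for the given threshold theta."""
    correct = wrong = unset = 0
    for score, predicted, true in samples:
        if score <= theta:        # best likelihood does not exceed theta
            unset += 1
        elif predicted == true:
            correct += 1
        else:
            wrong += 1
    n = len(samples)
    return correct / n, wrong / n, unset / n


def optimal_threshold(samples, thetas, a, b, c):
    """Candidate threshold that maximizes a*Pc - b*Pe - c*Pu."""
    def objective(theta):
        pc, pe, pu = rates(samples, theta)
        return a * pc - b * pe - c * pu
    return max(thetas, key=objective)
```

The weights a, b, and c remain the analyst's choice, so the result is only as objective as those weights.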
B. Example of Optimum Threshold
1) Analysis image: The image was acquired by Landsat 5 TM on August 15, 1984. Using the data of Cranbrook, British Columbia, Canada (Path: 43, Row: 25), the optimal threshold was determined by the conventional method [26]. This area is mostly covered with forests and includes logging areas, alpine meadows, roads, rivers, etc. The following four classes were set: i) new logging area, ii) old logging area, iii) alpine grassland, iv) forest. Here, a new logging area is defined as forest harvested within the last 5 years, an old logging area as forest harvested between 5 and 40 years ago, and forest as an area with a tree age of more than 40 years.
2) Optimal threshold: Fig. 2(a) and (b) show a part of the original image and examples of classified images with the threshold θ set to −40, −30, and −20, respectively. In the case of θ = −20, the probability of being classified into the unset class is high, and this threshold can hardly be called optimal. In the classified image for θ = −40, the shaded forest on the upper left and the exposed rock on the left are misclassified into the "new logging area" class, which has similar spectral characteristics. On the other hand, in the classified image for θ = −30, those portions are classified into the unset class, so this value is judged to be the optimal threshold.

C. Resolution and Optimal Threshold
The optimal threshold for MSS data is said to be the likelihood corresponding to two to three times the standard deviation of the band showing the maximum variance. What is the optimal threshold for TM data, with its improved quantization bit depth and instantaneous field of view? With improved resolution, the intra-class variance increases [27] and the optimal threshold generally decreases. To illustrate this, the optimal thresholds were compared using TM and MSS data acquired simultaneously by Landsat 5 on July 29, 1984. The target area is an agricultural area in Melfort, Saskatchewan, Canada. The following classes were set from the main components of the scene.

1) Urban area, 2) Barley field, 3) Wheat field, 4) Bean field, 5) Fallow field
Comparing the logarithm of det Σ (the generalized variance), which indicates the degree of variance of each class, between TM and MSS data, TM is larger, as shown in Table I. In addition, using the mean vector and the covariance matrix of each class, the average Mahalanobis distance to the observation vectors in the training area was obtained; comparing TM and MSS, TM is smaller, as shown in the table. That is, the likelihood is understood to be large.
The maximum likelihood method with various thresholds was applied to both data sets, and the optimal threshold was determined by the conventional method of subjectively evaluating the classified images. As a result, the optimum value was found to be about −18 for MSS and about −30 for TM. Examining this in Fig. 3, which shows the relationship between the threshold and the classification accuracy Pc of each class, the threshold at which Pc is almost saturated for all classes is optimal. However, the number of TM bands in this comparison is 6, whereas MSS has 4. Since likelihoods cannot be compared unless the dimensions of the variables are equal, the TM data were reduced to 4 dimensions and investigated further. For dimensionality reduction, a band-selection method using only the degree of separability as an index was applied, and TM bands 1, 3, 4, and 5 were selected. As a result, as shown in Fig. 4, the optimum threshold value for 4-band TM data is about −22, which is lower than that of MSS. This difference is thought to be due to the difference in class variance in feature space caused by the difference in spatial resolution.
IV. PROPOSED METHOD (MAXIMUM LIKELIHOOD CLASSIFICATION WITH CLASSIFIED BOUNDARY PIXEL)
Looking at the image classified with the optimal threshold value obtained subjectively by the analyst, Fig. 2(b), about half of the boundary pixels between different set classes are classified into the unset class. These boundary pixels are also called Mixels (Mixed Pixels); they contain two or more spectral classes and have composite spectral characteristics weighted by the area ratio of each class within the pixel. Focusing on a certain class i, consider the set of boundary pixels in contact with this class area. To simplify the discussion, consider a one-dimensional pixel array and the classification of boundary pixels at the In-Phase and Out of Phase sampling phases shown in Fig. 5.
Since the radiometer scanning time and the boundary position of the surface object are uncorrelated, the In-Phase case is rare and almost all cases are Out of Phase; the position of the boundary within the boundary pixel is independent of the sampling and follows a uniform probability. Now, when the likelihood criterion for classification is 50%, the probability of being classified into the unset class is 0, and every boundary pixel is classified into either class i or class j. If the criterion is 100%, all boundary pixels are classified into the unset class. The criterion at which half of the boundary pixels are classified into the unset class is 75%, shown by the broken line and the dashed line in Fig. 5(b). The proposed scheme is based on this. Although the discussion so far is valid for one class, finding the optimal threshold common to all classes is more complicated and difficult to derive theoretically. Therefore, the proposed method takes the state in which the probability of a boundary pixel being classified into the unset class is 0.5 as a first approximation of the optimal threshold. Based on this assumption, the threshold value at which half of the extracted boundary pixels are classified into the unset class is considered optimal. Fig. 6 shows a flowchart of the method for determining the optimum threshold value. In the figure, the mask pattern is a binary image in which boundary pixels are "1" and all other pixels are "0".
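The 50%, 75%, and 100% figures in the one-dimensional argument can be checked numerically with a small sketch. It rests on an interpretive assumption that is not stated explicitly in the text: a criterion of p is read as "the pixel is assigned to a class only if that class covers at least a fraction p of the pixel", with the boundary position uniformly distributed within the pixel. Under that reading the unset probability is 2p − 1 for p between 0.5 and 1. The function names are hypothetical.

```python
def classify_boundary_pixel(fraction_i, criterion):
    """fraction_i: area fraction of the pixel covered by class i (the rest is class j).
    Assumed rule: assign a class only if it covers at least `criterion` of the pixel."""
    if fraction_i >= criterion:
        return "i"
    if 1.0 - fraction_i >= criterion:
        return "j"
    return "unset"


def unset_probability(criterion, steps=10000):
    """Share of boundary positions left unset when the boundary position
    is uniformly distributed within the pixel (evaluated on grid midpoints)."""
    unset = 0
    for k in range(steps):
        fraction_i = (k + 0.5) / steps
        if classify_boundary_pixel(fraction_i, criterion) == "unset":
            unset += 1
    return unset / steps
```

Evaluating `unset_probability` at 0.5, 0.75, and 1.0 reproduces the three cases discussed above under this assumed reading of Fig. 5.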

V. EXPERIMENT
Numerous methods for detecting boundary pixels have been proposed, depending on the application and the definition of the boundary [28]. In this paper, a method [29] is used that takes as an index the variance of local spectral information, which is emphasized in high-resolution images. Specifically, boundary pixels were extracted by binarizing the coefficient of variation of the pixel values in a 2 × 2 pixel cell with a predetermined threshold.
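A minimal sketch of this extraction step is given below. Several details are assumptions made for illustration and are not specified in the text: the 2 × 2 cell is slid one pixel at a time, the population standard deviation is used, and all four pixels of a cell whose coefficient of variation exceeds the threshold are marked as boundary pixels. The function names are hypothetical.

```python
import math


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def boundary_mask(band, cv_threshold):
    """band: 2-D list of pixel values for one spectral band.
    Returns a binary mask: 1 for boundary pixels, 0 otherwise."""
    rows, cols = len(band), len(band[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            cell = [band[r][c], band[r][c + 1], band[r + 1][c], band[r + 1][c + 1]]
            if coefficient_of_variation(cell) > cv_threshold:
                # Mark every pixel of the 2 x 2 cell (an assumed convention)
                for dr in (0, 1):
                    for dc in (0, 1):
                        mask[r + dr][c + dc] = 1
    return mask
```

The resulting mask plays the role of the mask pattern in the flowchart: only pixels marked 1 are passed to the thresholded maximum likelihood classification.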
Next, an example of extraction using the TM data used in the analysis is shown. Table II shows the average, variance, and coefficient of variation of each class for Band 4 (TM-4), which has the largest variance among the set classes. The minimum value of the difference in the coefficient of variation between the classes is 0.176. If this value is set as a threshold and the image representing the coefficient of variation is binarized, the boundary pixels between the classes can be extracted. To confirm this, boundary pixels were extracted with the threshold set to 0.15, 0.2, and 0.3. As a result, as shown in Fig. 7, it was confirmed that a threshold of 0.15 represents the boundaries best.
A maximum likelihood classification with a threshold of -10 to -40 was applied to the above extracted boundary pixels to obtain a classified image. After that, the ratio of the pixels classified into the unset class in the boundary pixels was calculated, and the relationship with the threshold was examined.
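The sweep described above can be sketched as follows. The sketch assumes that the best log likelihood (the largest gi(x) over all classes) has already been computed for every extracted boundary pixel, and it linearly interpolates between the tested thresholds to find where the unset ratio reaches 0.5; the interpolation and the function names are illustrative choices, not part of the paper.

```python
def unset_ratio(boundary_scores, theta):
    """Share of boundary pixels whose best log likelihood does not exceed theta."""
    return sum(1 for g in boundary_scores if g <= theta) / len(boundary_scores)


def half_unset_threshold(boundary_scores, thetas, target=0.5):
    """Sweep the thresholds in ascending order and return the value at which the
    unset ratio reaches `target`, interpolating linearly between tested values.
    Returns None if the target is not reached within the tested range."""
    prev_theta = prev_ratio = None
    for theta in sorted(thetas):
        ratio = unset_ratio(boundary_scores, theta)
        if ratio == target:
            return theta
        if prev_ratio is not None and prev_ratio < target < ratio:
            weight = (target - prev_ratio) / (ratio - prev_ratio)
            return prev_theta + weight * (theta - prev_theta)
        prev_theta, prev_ratio = theta, ratio
    return None
```

Because the unset ratio can only grow as θ increases, the crossing point is unique within the tested range, which is what makes the criterion usable without analyst judgment.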
As a result, as shown in Fig. 8, the threshold value at which 50% of the boundary pixels are classified into the unset class is about −30. This value almost coincides with the optimal threshold determined subjectively by the conventional method, which demonstrates the validity of the proposed method.

VI. CONCLUSION
When applying the maximum likelihood classification to high-resolution images, the following point needs to be considered in determining the optimal threshold: a high-resolution image has a relatively large variation in spectral information, so the threshold value needs to be set low. In addition, the optimal threshold decision method proposed in this paper, which takes as optimal the threshold value at which the probability of a boundary pixel being classified into the unset class becomes 0.5, has the following characteristics.
1) In the conventional method, the threshold is determined subjectively by the analyst, which makes it difficult to compare classification results. With the proposed method, it can be obtained objectively.
2) In the conventional method, the maximum likelihood method is applied to the entire analysis image several times with different thresholds, and the threshold is determined by trial and error. With the proposed method, classification is performed only on the extracted boundary pixels to obtain the optimum threshold value, so the processing time is short.
3) The proposed method is based on a completely new concept: determining the optimal threshold of the maximum likelihood method from the probability that boundary pixels are classified into the unset class.

VII. FUTURE RESEARCH WORKS
The proposed method must be validated with a variety of high spatial resolution optical sensors onboard satellites.