Optimum Spatial Resolution of Satellite-based Optical Sensors for Maximizing Classification Performance

The optimum spatial resolution of satellite-based optical sensors for maximizing classification performance is investigated, and a method for assessing classification performance that takes the spatial resolution of satellite-based optical imagers into account is proposed. The optimum spatial resolution, that is, the one that yields the highest classification accuracy, is determined by the spatial frequency components and spectral features of the objects and by the classification method. First, based on the relationship between the variance of pixel values and classification accuracy, the classification accuracy for Landsat Multispectral Scanner (MSS) images with various Instantaneous Fields of View (IFOV) is shown, and in this connection the variance of pixel values for images with various IFOV is clarified. Second, assuming that the boundary line between adjacent categories is circular, the relationship among IFOV, the ratio of mixed pixels (Mixels), and classification accuracy is clarified under the supposition that the number of Mixels equals the number of misclassified pixels. Finally, these relationships and the optimum spatial resolution are confirmed using airborne MSS data of the Sayama district in Japan.

Keywords: Spectral information; spatial information; maximum likelihood decision rule; satellite image; image classification; mixed pixel (Mixels); optimum spatial resolution; classification performance; spatial and spectral features


I. INTRODUCTION
The optimum spatial resolution of satellite-based optical sensors for maximizing classification performance is investigated. From the point of view of classification performance, there must exist an optimum spatial resolution for spaceborne optical sensors, because the variances of the class categories grow as the spatial resolution becomes finer. Classification performance then degrades, because the overlap among the class categories grows and the confusion probabilities increase accordingly.
Since the variance of pixel values in the feature space increases as the spatial resolution is improved, classification accuracy worsens with improved spatial resolution when the variety of objects and class categories is limited. On the other hand, classification accuracy improves with improved spatial resolution because the ratio of "Mixels", which are pixels composed of plural class categories, decreases. Since these two effects contribute to classification accuracy multiplicatively, an optimum spatial resolution is expected to exist.
The Instantaneous Field of View (IFOV) of a multispectral radiometer onboard an Earth observation satellite that gives the highest classification accuracy, that is, the optimal spatial resolution, is generally determined by the spatial frequency components and spectral characteristics of the observation target, the classification method, and the like [1]-[9]. Classification accuracy is defined here as the discrimination efficiency (diagonal element of the confusion matrix) in maximum likelihood classification. When parameters such as the objects to be observed and the number of classes are limited, increasing the spatial resolution generally increases the variance in the feature space [10], and the classification accuracy becomes worse. On the other hand, the ratio of Mixels (mixed pixels) of different classes becomes smaller as the spatial resolution is improved, so that the classification accuracy is improved [11], [12]. Since these two effects jointly contribute to the classification accuracy, an optimal spatial resolution is considered to exist.
The motivation of this study is to clarify the relationship between the spatial resolution of optical sensors onboard satellites and classification performance, and then to find the optimum spatial resolution that maximizes classification performance. This paper first clarifies the relationship between the IFOV and the classification accuracy [13] by deriving the variance as a function of the IFOV, based on the relationship between the variance of pixel values and the classification accuracy. Next, assuming that the boundary of a region of the same class is an arc, the relationship between the IFOV and the ratio of Mixels is obtained, and the relationship between the IFOV and the classification accuracy is then obtained by treating the ratio of Mixels as the misclassification rate. Both relationships show that an optimal spatial resolution exists, and this is confirmed using airborne MSS data acquired over the Sayama Hills in Japan.
In the following section, related research works and the research background, including the motivation of the research, are described. Then, the proposed method is described, followed by the experimental method and the experimental results. After that, concluding remarks and some discussions are given.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 2, 2021, www.ijacsa.thesai.org

II. RELATED RESEARCH WORKS
Classification by re-estimating statistical parameters based on an auto-regressive model has been proposed for the purification of training samples [14]. Multi-temporal texture analysis in Landsat Thematic Mapper (TM) classification has been proposed for high-spatial-resolution optical sensor images [15], and Maximum Likelihood (MLH) TM classification that takes pixel-to-pixel correlation into account has also been proposed [16].
Supervised TM classification with purification of training samples has been proposed [17], together with TM classification using local spectral variability [18], [20]. A classification method using spatial spectral variability has also been proposed [19].
The application of inversion theory to image analysis and classification has been proposed [21]. Polarimetric SAR image classification using the maximum curvature of the trajectory in the eigen space domain of the polarization signature has been proposed [22], as has a hybrid supervised classification method for multi-dimensional images using color and textural features [23].
Polarimetric SAR image classification using the high frequency component derived from wavelet multi-resolution analysis (MRA) has been proposed [24]. A comparative study of polarimetric SAR classification methods, including the proposed method using the maximum curvature of the trajectory of the backscattering cross section in the ellipticity and orientation angle space, has been conducted and reported [25].
A comparative study of discrimination methods for identifying dangerous red tide species based on wavelet-utilized classification methods has been conducted [26]. A multispectral image classification method that selects independent spectral features through correlation analysis has been proposed [27], as has an image retrieval and classification method based on the Euclidean distance between normalized features including a wavelet descriptor [28].
A gender classification method based on gait energy motion derived from silhouettes through wavelet analysis of human gait moving pictures has been proposed [29], [32]. A human gait skeleton model acquired with a single side video camera, together with its application to and implementation for gender classification, has also been proposed [30], [31], as has human gait gender classification using 3D discrete wavelet transformation feature extraction [33].
Image classification considering a probability density function based on a simplified beta distribution has been proposed [34]. Maximum Likelihood Classification (MLH) based on the classified result of boundary mixed pixels has been proposed for high-spatial-resolution satellite images [35], and context classification based on mixing ratio estimation by means of inversion theory has also been proposed [36].

A. Relationship between Instantaneous Field of View and Variance of Pixel Values
According to Friedman et al. [13], the classification accuracy P is determined by the variance σ² of the pixel values and the size α of each class in the feature space, which is determined by the classification method, as expressed by Eq. (1). Here, α is defined as the range of the discrimination threshold in an arbitrary dimension of an arbitrary class (the classification method regards values within this range as belonging to the same class). Its unit is the digital count value, the same as the unit of the standard deviation σ of the pixel values. σ² changes depending on the IFOV. If the series x(t) of the original pixel values is sampled at the interval l and the obtained series is g(ξ), the variance σ²(a) for the IFOV a is given by Eq. (2), as shown in Appendix 1.
Expanding Eq. (2) shows that σ²(a) can be represented by only the even-order terms of a, with negative terms from the second term onward. Therefore, when x(t) is a continuous function, σ²(a) approaches a constant value as a approaches 0 and at first decreases as a increases. Since σ²(a) in the range of IFOV near the optimal spatial resolution is considered to decrease parabolically from a constant value, Eq. (2) is approximated here by truncating it at the second term.
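As a rough numerical illustration of how the variance drives the accuracy, the sketch below assumes one-dimensional Gaussian pixel values centered in a class of width α, so that the probability of falling inside the class is erf(α / (2√2 σ)). This Gaussian form and the coefficients `sigma0_sq` and `c` of the parabolic variance model are assumptions made for illustration only; they are not taken from Eq. (1) or Eq. (2).

```python
import math


def variance_parabolic(a, sigma0_sq, c):
    """Hypothetical truncated model: sigma^2(a) = sigma0_sq - c * a**2."""
    return sigma0_sq - c * a ** 2


def accuracy_gaussian(alpha, sigma_sq):
    """Probability that a Gaussian value lies within +/- alpha/2 of its mean.

    Assumed stand-in for the accuracy P; not the paper's Eq. (1).
    """
    if sigma_sq <= 0:
        raise ValueError("variance must be positive")
    return math.erf(alpha / (2.0 * math.sqrt(2.0 * sigma_sq)))


if __name__ == "__main__":
    # Illustrative values only (digital counts and metres are assumed units).
    for a in (10.0, 30.0, 50.0):
        s2 = variance_parabolic(a, sigma0_sq=900.0, c=0.2)
        print(a, round(accuracy_gaussian(70.0, s2), 3))
```

Under these assumptions the accuracy rises as the pixel size grows, because the modeled variance shrinks; this is only the variance-side effect and ignores Mixels.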

B. Classification Method in Concern
According to Crapper [11], the Mixel ratio F is expressed by Eq. (3), where L_j denotes the perimeter of the class j area; A_j, the area of the class j area; a, the IFOV (pixel size); and L, the average length of the class area boundary crossing a pixel when that boundary is approximated by a straight line. The parameters other than a were obtained by Crapper [11] as experimental values using 1605 types of sample data. In the process of deriving Eq. (3), Crapper makes the following assumptions.

1) There are at most two classes in a pixel.
2) The length of the boundary between different classes within a pixel does not differ from the length of the line segment obtained when that boundary is approximated by a straight line.
Both of the above assumptions hold if a is sufficiently small compared to the size of the class area. When discussing the optimal spatial resolution, however, the size of the class area must generally be considered comparable to the pixel size, so these assumptions do not necessarily hold. Assumption 2) in particular often fails. Here, it is assumed instead that "the category boundary line in a pixel is an arc." Under this assumption, the ratio q between the average length L_ij of the class area boundary line and the average length L obtained by approximating it linearly becomes a function of the radius r of the circular boundary and the pixel size a, with the relationship shown in Fig. 2. Considering this ratio and adopting Crapper's experimental values for the other parameters, F can be expressed by Eq. (4).
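The geometric idea behind the ratio q can be sketched as follows. The function computes the ratio of a circular arc to the chord that subtends it; taking the chord length equal to the pixel size a is an assumption made here for illustration, and the exact definition used for Fig. 2 may differ.

```python
import math


def arc_to_chord_ratio(r, chord):
    """Ratio of a circular arc length to the chord subtending it (radius r)."""
    if not 0 < chord <= 2 * r:
        raise ValueError("chord must satisfy 0 < chord <= 2 * r")
    half_angle = math.asin(chord / (2.0 * r))  # half of the subtended angle
    arc_length = 2.0 * r * half_angle
    return arc_length / chord


if __name__ == "__main__":
    r = 20.0  # assumed boundary radius in metres
    for a in (5.0, 10.0, 20.0, 40.0):  # chord taken as the pixel size (assumption)
        print(a, round(arc_to_chord_ratio(r, a), 4))
```

The ratio tends to 1 when the pixel is much smaller than the radius, which is the regime where the straight-line approximation holds, and grows toward π/2 as the chord approaches the diameter.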
Assuming that the classification accuracy of Mixels is 0 and that of pixels composed of a single class (pure pixels) is 1, their combined classification accuracy is equal to (1 - F).

C. Overall Classification Accuracy
The total classification accuracy Pt is defined by the following equation in consideration of the influence of the variance of pixel values and the ratio of the Mixels on the classification accuracy.
The pixel size at which this Pt is the highest is defined as the optimal spatial resolution.
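A minimal sketch of the search for the optimum is given below. It assumes that the total accuracy takes the product form Pt = P (1 - F), which follows the earlier statement that the two effects contribute multiplicatively but is not copied from the defining equation; the tabulated P and F values are hypothetical.

```python
def total_accuracy(p, f):
    """Assumed combined accuracy: variance-driven accuracy times pure-pixel ratio."""
    return p * (1.0 - f)


def optimum_ifov(ifovs, p_values, f_values):
    """Return the IFOV with the highest total accuracy and that accuracy."""
    totals = [total_accuracy(p, f) for p, f in zip(p_values, f_values)]
    best = max(range(len(totals)), key=totals.__getitem__)
    return ifovs[best], totals[best]


if __name__ == "__main__":
    # Hypothetical values: P rises and F rises as the pixel size grows.
    ifovs = [10, 20, 30, 40, 50]
    p_values = [0.60, 0.70, 0.78, 0.82, 0.85]
    f_values = [0.05, 0.10, 0.15, 0.25, 0.35]
    print(optimum_ifov(ifovs, p_values, f_values))
```

Because P increases while (1 - F) decreases with pixel size, their product can peak at an intermediate size, which is the sense in which an optimum exists.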

A. Geographical Characteristics of the Intensive Study Area
The analysis area is around Sayama, as shown in Fig. 3. The south side consists of the Sayama Hills at about 150 m above sea level, and the north side consists of flat land at about 100 m above sea level. The Sayama Hills contain Lake Sayama, broadleaf and coniferous forest, grassland, etc. The flat land consists of development land including bare land, urban area, residential area, paddy field, upland field, tea plantation, etc.
Table I shows, for each class i, the discrimination efficiency Pi obtained by maximum likelihood classification using the data of each IFOV, together with the area ratio of each class.

B. Generation and Classification of MSS Data of Instantaneous Field of View: IFOV
The discrimination efficiency Pa at each IFOV, obtained by weighting Pi with the area ratio Ri of each class, is also shown.
Ri is the average of the area ratio of each class obtained by classifying the simulated images of each IFOV by the maximum likelihood method. The table shows that the IFOV giving the maximum discrimination efficiency differs for each class, and that coniferous forests, fields, development areas, urban areas, etc. contain a relatively large number of pixels composed of different classes, and the dimensions of their components are small.
[Figure: (a) Topographic map of Lake Sayama. (b) Sentinel-2 image of Lake Sayama.]
As shown in Fig. 5, the variance of the pixel values has a relatively small effect on the classification accuracy. It was also found that the average classification accuracy at each IFOV, weighted by the area ratio of each class, was highest when the IFOV was 30 m.
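The area-weighted average described above can be sketched as follows. The normalization by the sum of the area ratios and the class names and numbers in the example are assumptions for illustration; they are not values from Table I.

```python
def weighted_accuracy(p_by_class, r_by_class):
    """Average per-class accuracy weighted by each class's area ratio."""
    total_area = sum(r_by_class.values())
    if total_area <= 0:
        raise ValueError("area ratios must sum to a positive value")
    weighted_sum = sum(p_by_class[name] * r_by_class[name] for name in r_by_class)
    return weighted_sum / total_area


if __name__ == "__main__":
    # Hypothetical per-class accuracies and area ratios.
    p = {"forest": 0.9, "urban": 0.5}
    r = {"forest": 0.75, "urban": 0.25}
    print(weighted_accuracy(p, r))
```

Weighting by area ratio means that classes covering more of the scene dominate the average, so the IFOV that maximizes it can differ from the best IFOV of any single class.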

C. Relationship between IFOV and Variance
As shown in Fig. 6, the variance differs depending on the channel and class. Therefore, to clarify the average relationship between the IFOV and the variance, the relationship was approximated by a function fitted with the least squares method.
As a result, Eq. (7) was obtained. The approximation error normalized by the variance value was 0.71.
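A least-squares fit of this kind can be sketched as below. The model form, variance = c0 + c2 * a², is an assumption suggested by the even-order, parabolic behavior discussed earlier; it is not the fitted Eq. (7), and the sample data are synthetic.

```python
def fit_variance_model(ifovs, variances):
    """Ordinary least-squares fit of variance = c0 + c2 * ifov**2.

    Returns (c0, c2). Uses the closed-form solution for a straight line
    in the transformed variable x = ifov**2.
    """
    xs = [a ** 2 for a in ifovs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(variances) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, variances))
    c2 = sxy / sxx
    c0 = mean_y - c2 * mean_x
    return c0, c2


if __name__ == "__main__":
    # Synthetic data generated from variance = 900 - 0.2 * a**2.
    ifovs = [10.0, 20.0, 30.0, 40.0]
    variances = [880.0, 820.0, 720.0, 580.0]
    print(fit_variance_model(ifovs, variances))
```

With real data the per-channel and per-class variances would scatter around the fitted curve, which is what a normalized approximation error quantifies.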
Substituting Eq. (7) into Eq. (1) and evaluating the classification accuracy P with α as a parameter gives the result shown in Fig. 7. In the figure, the solid line is the value calculated with Eq. (7), and the broken line is the estimated value.
When the airborne data used in the analysis were classified by the maximum likelihood method, the range α of each class was about 10 to 120. Fig. 7 therefore shows the classification accuracy when α is varied from 30 to 90 in steps of 20.

D. Relationship between IFOV and Mixing Ratio
From the land cover classification map of the Sayama area used in the analysis, the average radius r was estimated to be about 20 m when the boundary between classes was assumed to be an arc. Therefore, r was varied from 10 to 30 m, and Eq. (4) was evaluated to assess how the change in the ratio of Mixels with the IFOV affects the classification accuracy. Fig. 8 shows the results. The figure also shows the calculated values of Crapper [11] and the experimental values of Jackson [12].
Since Crapper's calculated value is based on the assumption that the IFOV is sufficiently smaller than the class size, the value calculated with the model proposed in this paper asymptotically approaches Crapper's value in the region where the IFOV is small. Jackson's experimental values correspond to the values calculated with this model when r is about 25 m. Comparing the experimental Mixel ratios obtained for the images of each IFOV (circles in Fig. 8) with the values calculated with this model shows that r is about 20 m. These experimental values were obtained by assuming that the result of classifying the airborne MSS data with an IFOV of 1.25 m into 10 classes by maximum likelihood classification contains 0% Mixels. This classification result was overlaid on the MSS image of each IFOV, and the ratio of pixels composed of a plurality of categories was calculated as the ratio of Mixels.
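The experimental procedure described above can be sketched as follows: a fine-resolution class map, treated as free of Mixels, is divided into square blocks that stand for coarser pixels, and a block counts as a Mixel when it contains more than one class. The function name, the list-of-lists map format, and the choice to ignore incomplete edge blocks are assumptions for illustration.

```python
def mixel_ratio(class_map, block):
    """Fraction of block x block coarse pixels that contain more than one class.

    class_map is a rectangular list of rows of class labels at the finest
    resolution; incomplete blocks at the right and bottom edges are ignored.
    """
    rows = len(class_map) // block
    cols = len(class_map[0]) // block
    if rows == 0 or cols == 0:
        raise ValueError("block size is larger than the class map")
    mixed = 0
    for bi in range(rows):
        for bj in range(cols):
            labels = {
                class_map[i][j]
                for i in range(bi * block, (bi + 1) * block)
                for j in range(bj * block, (bj + 1) * block)
            }
            if len(labels) > 1:
                mixed += 1
    return mixed / (rows * cols)


if __name__ == "__main__":
    # Tiny synthetic map: class 0 on the left, class 1 in the last column.
    fine_map = [[0, 0, 0, 1] for _ in range(4)]
    for block in (1, 2, 4):
        print(block, mixel_ratio(fine_map, block))
```

For a fine map at 1.25 m, a block size of k corresponds to a simulated IFOV of 1.25 k m, so sweeping k traces the Mixel ratio against the IFOV.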
This indicates that, when the analysis target area is classified into the 10 classes in Table I, the average class size can be considered equivalent to a circle with a radius of about 20 m.

E. Relationship between IFOV and Classification Accuracy
The total classification accuracy Pt considering the variance and the Mixels was calculated with Eq. (5) and is shown in Fig. 9. Here, 20 m was selected for the parameter r, and 50, 70, and 90 were selected for α. Fig. 9 also shows the classification accuracy P obtained by actually performing the maximum likelihood classification (Table I). The two values are almost the same, especially when α = 90, which indicates that the method proposed in this paper for finding the optimal spatial resolution is appropriate.

V. CONCLUSION
The optimum spatial resolution of satellite-based optical sensors for maximizing classification performance was investigated, and a method for assessing classification performance that takes the spatial resolution of satellite-based optical imagers into account was proposed. The optimum spatial resolution, which yields the highest classification accuracy, is determined by the spatial frequency components and spectral features of the objects and by the classification method.
The validity of the methods for obtaining the ratio of Mixels and the variance of the pixel values, and of the method for deriving the classification accuracy from them, was confirmed by an analysis example using a Landsat satellite imagery scene of the Sayama area in Japan. With this method, the classification accuracy can be estimated by giving the class size in the feature space and the radius obtained when the class region in the image space is approximated by an arc. It is also possible to find the optimum spatial resolution that maximizes the accuracy.

VI. FUTURE RESEARCH WORKS
Applying the proposed method to real Earth observation satellite imagery data in order to realize a more usable classification method is a future subject. In addition, the optimum spatial resolution should be realized with actual satellite-based optical sensors for remote sensing.

ACKNOWLEDGMENT
The author would like to express sincere thanks to Dr. Crapper of CSIRO, Prof. Strahler of Hunter College, former Professor Tsuchiya of Chiba University, and others for their valuable advice and discussions. The author would also like to thank Professor Dr. Hiroshi Okumura and Professor Dr. Osamu Fukuda for their valuable discussions.