Human Gait Gender Classification Using 3D Discrete Wavelet Transform Feature Extraction

Abstract: Feature extraction for gait recognition has been studied widely. Existing approaches fall into two categories: model-based and model-free. Model-based approaches obtain a set of static or dynamic skeleton parameters by modeling or tracking body components such as limbs, legs, arms and thighs. Model-free approaches focus on the shapes of silhouettes or the entire movement of physical bodies. Model-free approaches are insensitive to the quality of silhouettes, and their advantage is a low computational cost compared to model-based approaches. However, they are usually not robust to changes in viewpoint and scale. Imaging technology has also developed quickly in recent decades. Motion capture (mocap) devices integrated with motion sensors are expensive and are typically owned only by big animation studios. Fortunately, the Kinect camera, equipped with a depth sensor, is now available on the market at a very low price compared to any mocap device. Its accuracy is not as good as that of the expensive devices, but with some preprocessing we can remove the jitter and noise in the 3D skeleton points. Our proposed method analyzes the effectiveness of 3D skeleton feature extraction using the 3D Discrete Wavelet Transform (3D DWT). We use a Kinect camera to obtain the depth data and the Ipisoft mocap software to extract the 3D skeleton model from the Kinect video. The experimental results show 83.75% correctly classified instances using SVM.


I. INTRODUCTION
In recent years, there has been increased attention on effectively identifying individuals for the prevention of terrorist attacks. Many biometric technologies have emerged for identifying and verifying individuals by analyzing face, fingerprint, palm print, iris, gait or a combination of these traits [1]-[3].
Human gait has recently become a popular biometric for classification and recognition. Many researchers have focused on this issue in order to build new recognition systems [4]-[11]. Human gait classification and recognition offers some advantages over other recognition systems. A gait classification system does not require the observed subject's attention or assistance. It can also capture gait at a far distance without requiring physical information from subjects.
There is a significant difference between human gait and other biometric classification. For human gait, we must use video data instead of the still image data widely used by other biometric systems. With video data, we can utilize temporal as well as spatial information, unlike with image data.
There are two feature extraction approaches used in gait classification: model-based and model-free [12]. Model-based approaches obtain a set of static or dynamic skeleton parameters by modeling or tracking body components such as limbs, legs, arms and thighs. Gait signatures derived from these model parameters are employed for identification and recognition of an individual. Model-based approaches are view-invariant and scale-independent. These advantages are significant for practical applications, because it is unlikely that reference sequences and test sequences are taken from the same viewpoint. Model-free approaches focus on the shapes of silhouettes or the entire movement of physical bodies. Model-free approaches are insensitive to the quality of silhouettes, and their advantage is a low computational cost compared to model-based approaches. However, they are usually not robust to changes in viewpoint and scale [13].
Gender classification along with human gait recognition has attracted researchers seeking the best methods. Its wide range of applications makes it an attractive research topic. Such applications will not only enhance existing biometric systems but can also serve as a basis for passive surveillance and control in "smart areas" (e.g., restricting access to certain areas based on gender) and for collecting valuable demographics (e.g., the number of women entering a retail store, airport, post office, or public smoking area on a given day).
Imaging technology has developed quickly in recent decades. Motion capture (mocap) devices integrated with motion sensors are expensive and are typically owned only by big animation studios. Fortunately, the Kinect camera, equipped with a depth sensor, is now available on the market at a very low price compared to any mocap device. Its accuracy is not as good as that of the expensive devices, but with some preprocessing we can remove the jitter and noise in the 3D skeleton points.
Our proposed method belongs to the model-based feature extraction category, and we call it the 3D skeleton model. Using a 3D skeleton model for gait extraction is a new approach, considering that all previous models use a 2D skeleton model. Its advantage is that it provides accurate 3D point coordinates for each skeleton model rather than only 2D points. We use Kinect to get the depth data and Ipisoft mocap software to extract the 3D skeleton model from the Kinect video. The 3D skeleton model is exported to a file in the BVH animation standard format and imported into our programming tool, Matlab. We use Matlab to extract the features and to run the classifier. We created our own gender gait dataset in a 3D environment because no such dataset existed before.
www.ijarai.thesai.org

II. PROPOSED METHOD
The gender gait classification in this paper consists of three parts: preprocessing, feature extraction, and classification. Figure 1 shows the complete overview of the proposed human gait gender classification. Using a Kinect camera has one advantage over a usual RGB camera: the skeleton is created in 3D space, so 2D images from different view angles can be obtained using only a single camera. Figure 2 shows 2D skeleton images created from different view angles at the same frame. This is useful for enhancing classification accuracy, since some papers propose using multi-view images [10], [14]-[16]. However, this paper uses only one view for the analysis.
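The idea of obtaining 2D views at several angles from a single 3D skeleton can be illustrated with a minimal Python sketch. It assumes an orthographic projection and a vertical y axis; the joint coordinates and the function name `project_skeleton` are hypothetical, and the paper does not state how its 2D images were rendered.

```python
import math

def project_skeleton(joints, yaw_degrees):
    """Rotate 3D joints about the vertical (y) axis, then drop the depth
    coordinate (orthographic projection) to get 2D points."""
    theta = math.radians(yaw_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    projected = []
    for x, y, z in joints:
        x_rot = cos_t * x + sin_t * z  # x coordinate after rotation about y
        projected.append((x_rot, y))
    return projected

# Hypothetical joint positions (x, y, z) in meters for a single frame.
joints = [(0.0, 1.7, 0.0), (0.2, 1.0, 0.1), (-0.2, 0.0, -0.1)]

front_view = project_skeleton(joints, 0)   # camera looking along the z axis
side_view = project_skeleton(joints, 90)   # same frame seen from the side
```

Only the viewing angle changes between the two calls; the underlying 3D frame is the same, which is what makes a single depth camera sufficient for multi-view images.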

A. Preprocessing
First, record the video data using Kinect and IpiRecorder, which captures the depth data along with the RGB video data. Several recording recommendations should be considered; they appear as the numbered items 1 to 3 among the figure captions at the end of this paper.
Second, process the depth video data in the IPISoft motion capture application. IPISoft creates the 3D skeleton model from the recorded depth video using a motion tracking method. The first step is to keep only the gait scene and remove unimportant video scenes; we call the result the Region of Interest (ROI) video.
Third, create the 3D skeleton model using the motion tracking method, remove the jitter and noise, and export the skeleton model to the BVH file format in IPISoft.
Fourth, read the BVH file, extract the features, and classify them.
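As an illustration of the fourth step, the following Python sketch reads the MOTION section of a BVH file. The paper performs this step in Matlab; the function name `read_bvh_motion` and the tiny one-joint sample are assumptions made only for this example, and the sketch ignores the joint hierarchy.

```python
def read_bvh_motion(text):
    """Return (frame_time, frames) parsed from the MOTION section of BVH text.
    Each frame is a list of channel values in the order given by the hierarchy."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    start = lines.index("MOTION")
    n_frames = int(lines[start + 1].split(":")[1])      # "Frames: N"
    frame_time = float(lines[start + 2].split(":")[1])  # "Frame Time: t"
    rows = lines[start + 3:start + 3 + n_frames]
    frames = [[float(value) for value in row.split()] for row in rows]
    return frame_time, frames

# Hypothetical minimal BVH content with a single joint and two frames.
sample_bvh = """
HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  End Site
  {
    OFFSET 0.0 10.0 0.0
  }
}
MOTION
Frames: 2
Frame Time: 0.033333
0.0 90.0 0.0 0.0 0.0 0.0
1.5 90.2 0.3 0.0 5.0 0.0
"""

frame_time, frames = read_bvh_motion(sample_bvh)
```

A real export would contain many joints, so each frame row would hold one value per channel of every joint in the hierarchy.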

B. Dataset
Unfortunately, no Kinect depth video gait dataset exists so far. All existing gait datasets, such as the USF, SOTON, and CASIA gait datasets, were recorded with ordinary RGB cameras. Figure 3 shows an example from the CASIA gait dataset.
Here too, we should select the Region of Interest video to be processed. Instead of using the whole video sequence, we can take only the most important part of it. Once the skeleton is placed in the same position as the subject, we can refit the pose using the application and start tracking. Jitter removal and trajectory filtering can be done after tracking has finished.
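The paper does not describe the filter that the tracking application uses for jitter removal. As a rough illustration of what trajectory smoothing does, the Python sketch below applies a centered moving average to one joint coordinate over time. The moving average, the window size, and the sample values are all assumptions made for this example.

```python
def moving_average(values, window=3):
    """Smooth a 1D joint trajectory with a centered moving average.
    Near the edges, the window shrinks to the samples that exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical jittery coordinate of one joint over five frames.
trajectory = [0.0, 1.0, 0.0, 1.0, 0.0]
smoothed = moving_average(trajectory, window=3)
```

A larger window removes more jitter but also flattens fast, genuine motion, so the window size is a trade-off.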
The skeleton sequence result can be exported to the BVH file standard (Figures 6 and 7).
The block diagram of the 3D analysis filter bank is shown in Figure 9.

D. Feature Extraction
To extract features using the 3D Discrete Wavelet Transform, we can prepare two kinds of data: raw data and resized data. This paper reports the effectiveness and classification accuracy of each kind of data using statistical features. The resized data have an advantage over the raw data: because all samples in the dataset have the same dimensions, the whole sub-bands (decomposition data) can be fed to the classifier directly. This paper covers the experimental results for resized data with extracted statistical features, but it does not analyze feature extraction using the whole sub-bands.
To extract features from the created dataset, one has to consider the image size. Using the whole image is costly, so only the skeleton region, which we call the Region of Interest (ROI), should be extracted. The ROI can be found automatically using simple image detection because the image is binary. Once the ROI is obtained, one can extract features from the data directly or resize the data first. Thus, there are two kinds of data: raw data and resized data.
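Because the skeleton image is binary, an automatic ROI can be as simple as the bounding box of the nonzero pixels. The Python sketch below shows this for a single frame; the helper names `bounding_box` and `crop` and the 5 by 5 frame are hypothetical, and the paper does not specify its detection method in more detail.

```python
def bounding_box(image):
    """Return (top, bottom, left, right) of the nonzero pixels of a binary
    image given as a list of rows; bottom and right are exclusive."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return None  # empty frame, no skeleton pixels
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1

def crop(image, box):
    """Cut the ROI out of the image."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]

# Hypothetical binary frame with a small shape in the middle.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(frame)
roi = crop(frame, box)
```

For a video, one option is to take the union of the per-frame boxes so that every frame of a sequence is cropped to the same region.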
To be used as resized data, all samples must have the same image size and number of frames. The method used in this paper to create resized data is image resizing followed by frame cropping. For image resizing, the biggest skeleton image ROI is used as the reference, because this way no data have to be removed, and removed data could carry valuable information for the system. After image resizing, the frames are cropped. This paper crops each video to the smallest number of frames in the dataset, so that all data have the same number of frames, and uses the middle part of each video as the cropped frames. If the smallest frame count is $x$, the crop starts at frame $y = \mathrm{round}((\mathrm{total\_frame} - x)/2)$ and ends at frame $z = y + x$.
This paper uses some well-known classifiers to compare and analyze their correct classification rates. A Level 1 3D DWT decomposition with the Haar wavelet yields 8 sub-bands: LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH. We use 3 statistical features that were used in previous research: mean, standard deviation, and energy.
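The middle-frame cropping rule can be sketched in Python as follows. The function name `crop_middle_frames` is hypothetical, and the sketch uses zero-based indexing with an exclusive end index, whereas the paper works in Matlab; Python's `round` also resolves exact halves to the nearest even integer, which may differ by one frame from Matlab's rounding.

```python
def crop_middle_frames(frames, target_length):
    """Keep the middle `target_length` frames of a sequence.
    start = round((total - target_length) / 2), end = start + target_length."""
    total = len(frames)
    if target_length > total:
        raise ValueError("target_length exceeds the number of frames")
    start = round((total - target_length) / 2)
    end = start + target_length
    return frames[start:end]

# Hypothetical example: a 10-frame video cropped to the dataset minimum of 6.
video = list(range(10))          # frame indices stand in for frame images
cropped = crop_middle_frames(video, 6)
```

In this example the first two and the last two frames are discarded, so every video ends up with the same length as the shortest one.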
The formula for the energy is given in Eq. (1).
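A single-level 3D Haar decomposition and the three statistical features can be sketched in pure Python as below. Several details are assumptions: the volume is indexed as (frame, row, column) with even sizes, the first letter of each sub-band name refers to the frame axis, the high-pass output is taken as (first minus second) divided by the square root of 2, and energy is taken as the mean of squared coefficients, since Eq. (1) is not reproduced in this text.

```python
import math
from itertools import product

def haar_dwt3_level1(volume):
    """Single-level orthonormal 3D Haar DWT of a nested list volume[t][r][c].
    Returns a dict mapping 'LLL' ... 'HHH' to flat lists of coefficients."""
    n_t, n_r, n_c = len(volume), len(volume[0]), len(volume[0][0])
    if n_t % 2 or n_r % 2 or n_c % 2:
        raise ValueError("all dimensions must be even")
    scale = 1.0 / (2.0 * math.sqrt(2.0))  # (1/sqrt(2)) applied along 3 axes
    subbands = {}
    for name in ("".join(p) for p in product("LH", repeat=3)):
        coeffs = []
        for t, r, c in product(range(0, n_t, 2), range(0, n_r, 2), range(0, n_c, 2)):
            total = 0.0
            for offsets in product((0, 1), repeat=3):
                sign = 1
                for band, offset in zip(name, offsets):
                    if band == "H" and offset == 1:
                        sign = -sign  # high-pass subtracts the second sample
                dt, dr, dc = offsets
                total += sign * volume[t + dt][r + dr][c + dc]
            coeffs.append(total * scale)
        subbands[name] = coeffs
    return subbands

def subband_features(coeffs):
    """Mean, standard deviation, and energy (assumed: mean of squares)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in coeffs) / n)
    energy = sum(v * v for v in coeffs) / n
    return mean, std, energy

# Hypothetical 2 x 2 x 2 volume (frames x rows x columns).
volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
subbands = haar_dwt3_level1(volume)
feature_vector = [f for name in subbands for f in subband_features(subbands[name])]
```

With 8 sub-bands and 3 statistics each, the feature vector has 24 entries per video, regardless of the video dimensions.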

E. Classification
This paper uses two well-known classifiers, Naïve Bayes and SVM, and compares their results. Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, and they are used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
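To make the binary linear classifier concrete, here is a minimal Python sketch of a linear SVM trained by full-batch subgradient descent on the regularized hinge loss. It is not the implementation used in the paper, whose SVM tool and settings are not stated; the hyperparameters, the two-dimensional toy features, and the label coding (+1 and -1 for the two classes) are all hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    with full-batch subgradient descent. Labels must be +1 or -1."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin, hinge loss is active
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical, linearly separable toy features for two classes.
X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In the setting of this paper, each input vector would hold the selected statistical features of one video and the two classes would be male and female.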

III. EXPERIMENTAL RESULT
We start with the resized data. The dataset contains 80 videos, 40 male and 40 female, and we use SVM and Naïve Bayes as the classifiers. The first step is to select the best features for each classifier. Two methods are used: the Wrapper method and the Ranked method. We also apply both methods to two kinds of data preprocessing: no preprocessing, and data preprocessed with a discretization filter.
Table 1 shows the features selected by both methods without preprocessing (raw data type). Table 2 shows the features selected by both methods with discretized data (raw data type). Table 3 shows the features selected by both methods without preprocessing (resized data type). Table 4 shows the features selected by both methods with discretized data (resized data type). Table 5 shows the Correct Classification Rate (CCR) for the selected features (raw data type). Table 6 shows the CCR for the selected features (resized data type). As seen in the tables, the best CCR is obtained for the raw data type using SVM with discretized data and features selected by the Ranked method. Table 7 gives the detailed accuracy by class using the SVM classifier.
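The Correct Classification Rate is the percentage of samples whose predicted class matches the true class. The Python sketch below shows the arithmetic under the assumption that each of the 80 videos is counted once; in that case a CCR of 83.75% corresponds to 67 correct predictions. The label lists are hypothetical, including how the errors are split between classes, and are constructed only to reproduce that count.

```python
def correct_classification_rate(true_labels, predicted_labels):
    """Percentage of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Hypothetical labels: 40 male and 40 female videos, 13 misclassified in total.
true_labels = ["male"] * 40 + ["female"] * 40
predicted_labels = (["male"] * 34 + ["female"] * 6      # 6 males misclassified
                    + ["female"] * 33 + ["male"] * 7)   # 7 females misclassified

ccr = correct_classification_rate(true_labels, predicted_labels)
```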

IV. CONCLUSION
The proposed method uses the Kinect depth sensor camera and the Ipisoft motion capture software to generate a 3D skeleton model. Ipisoft is a special-purpose application for creating skeletons so that users can apply the captured motion to their computer-generated characters.
A 2D image in one view angle is then extracted from the generated 3D skeleton, and two data types are created: raw and resized video data. Using the Level 1 Haar 3D DWT, we obtain 8 sub-bands and compute 3 statistical features (mean, standard deviation, and energy) for each of them. After selecting the best features and classifying them using SVM and Naïve Bayes, we obtain the results shown in Table 5 and Table 6. The best result, 83.75% CCR, is achieved with the raw data type using Ranked method feature selection and discretized data.

Figure 1: Proposed human gait gender classification

Recording recommendations:
1. Use a room space of 9 by 5 feet to get the best capture.
2. The subject should be dressed in casual slim clothing, avoiding shiny fabrics.
3. Ensure that the whole body, including arms and legs, is visible throughout the recording. The recording starts with the subject in the T-Pose.

Figure 2. 2D skeleton image created from different view angles at the same frame
The figure below shows an example of the video recording.

Figure 3: Example of CASIA gait dataset
To conduct the experiment, we first prepare the dataset. We use the Kinect gait dataset to analyze and classify gender from gait. The proposed research explores the capability of Kinect and the 3D skeleton model and uses their 2D images for gait classification.

Figure 4: T-Pose position before the recording begins
Figure 5 shows the 3D skeleton motion tracking sequence. The first task is to specify the subject's physical parameters, such as gender and height. IpiSoft detects the ground plane automatically and provides the 3D skeleton in the T-Pose position. The next task is to place the T-Pose skeleton in the same position as the subject's T-Pose in the first frames of the video.
Figures 6 and 7 show the BVH file result and its preview in a BVH file viewer and in Matlab.

Figure 8. The resolution of a 3-D signal is reduced in each dimension

Figure 9. Block diagram of a single level decomposition for the 3D DWT
Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

TABLE 1. FEATURE SELECTION WITHOUT PREPROCESSING DATA FOR RAW DATA TYPE

TABLE 2. FEATURE SELECTION WITH DISCRETIZATION FILTER DATA FOR RAW DATA TYPE

TABLE 4. FEATURE SELECTION WITH DISCRETIZATION FILTER DATA FOR RESIZED DATA TYPE

TABLE 5. CORRECT CLASSIFICATION RATE FOR EACH SELECTED FEATURE IN EACH CLASSIFIER FOR RAW DATA TYPE

TABLE 6. CORRECT CLASSIFICATION RATE FOR EACH SELECTED FEATURE IN EACH CLASSIFIER FOR RAW AND RESIZED DATA TYPES

TABLE 7. DETAIL ACCURACY BY CLASS USING SVM CLASSIFIER