Brainwaves for User Verification using Two Separate Sets of Features based on DCT and Wavelet

This paper examines the effectiveness of brain waves for user verification using single-channel electroencephalogram (EEG) recordings that belong to a single task. Feature sets that were previously introduced for an EEG-based identification system are tested here as features for a verification system. The first feature set is based on the energy distribution of the DCT or DFT power spectra, while the second set is based on the statistical moments of wavelet transforms; three types of wavelet transform are considered. Each feature set is tested using the normalized Euclidean distance measure for matching. The performance of the verification system is evaluated using the FAR, FRR, and HTER measures. Two publicly available EEG datasets are used: the Colorado State University (CSU) dataset, which was collected from seven healthy subjects, and the Motor Movement/Imagery (MMI) dataset, a relatively large dataset collected from 109 healthy subjects. The attained verification results are encouraging when compared with those of other recently published works: for the features based on the energy of DFT spectra, the best achieved HTER is 0.26 on the CSU dataset and 0.16 on the MMI dataset.
Keywords: Electroencephalogram (EEG); wavelet transforms; DCT; DFT; energy features; statistical moments; Euclidean measure


I. INTRODUCTION
New biometric traits based on physiological signals, such as EEG and ECG signals, have recently been explored as alternatives to traditional biological traits. An ideal biometric trait should have the following characteristics: very low intra-class variability, very high inter-class variability, stability over time, and universality [1]. Typical biometric traits such as fingerprint, voice, and retina are subject to physical damage such as dry skin, loss or change of voice, severe injuries such as missing hands or fingers, aniridia (i.e., loss of the iris), or burned fingers [2]. Recent studies have shown that EEG signals have biometric potential because brain signals are distinctive and are regarded as impossible to replicate or steal. Person identification and verification are two different types of biometric application. The goal of person identification is to identify an unknown individual from a group of persons (i.e., to match the input pattern of one person against all the records in a template database), while the goal of person verification is to confirm or deny a claimed identity [3]. The previous work [4] focused on person identification, while this paper is concerned with person verification.
Palaniappan [5] proposed a two-stage authentication approach that used AR coefficients, channel spectral powers, inter-hemispheric channel spectral power differences, and inter-hemispheric channel linear and non-linear complexity as features. The signals were first filtered with a Finite Impulse Response (FIR) filter, and Principal Component Analysis (PCA) was then used to reduce the feature vector size. Finally, he tested five subjects from the CSU dataset using the Manhattan distance and achieved a best result with FAR and FRR equal to zero. Altahat et al. [6] explored reducing the number of EEG channels to lower the complexity and cost of an EEG-based authentication system. In that work the signal Power Spectral Density (PSD) was used as the feature. They showed that the reduced channel set enhanced system performance and achieved a total HTER of 14.69% when tested on 106 subjects from the MMI dataset.
Fraschini et al. [7] introduced an approach based on phase synchronization to explore the distinctive organization of individual brain networks. Their method consists of four main steps. The first step is band-pass filtering, in which the "eegfilt" function was used to filter the raw EEG signals. The second step is "functional connectivity estimation", which was performed using the PLI to estimate the pair-wise statistical interdependence between EEG time series. The third step is "brain network reconstruction", in which the functional network is represented as a weighted graph where each node represents an EEG channel and each edge represents a functional connection, with the PLI value used as the connection strength. The fourth step, "characterization", characterizes the functional brain organization; to estimate the importance of each node in the network, they focused on a centrality measure. The best EER, 0.044% for 109 subjects, was achieved in the gamma band. Bajwa and Dantu [8] proposed the use of EEG signals for both authentication and cryptographic key generation. They used the Fast Fourier Transform (FFT) followed by the Daubechies wavelet (db8) to extract features by calculating statistical information on the wavelet sub-bands. In this paper, by contrast, the DFT is proposed as a separate extraction method that calculates the energy averages of the DFT power spectra, and the Daubechies wavelet (db4) is proposed as a separate method that calculates the statistical moments of all sub-bands. Two types of classifier were tested in [8], Support Vector Machine (SVM) and Bayesian network, and the best accuracy rate (100%) was achieved when the system was tested on 7 subjects.
Despite the encouraging results achieved by EEG-based authentication systems, the related works face complications in the feature extraction stage and in the fusion of features from multiple channels or tasks, as well as in the use of many techniques from noise removal through to the classification or matching step.
This paper addresses the following problems: 1) the number of required electrodes and mental tasks: the feature sets are extracted under the condition adopted in [9], [4] (i.e., single channel and single task) and are tested in verification mode in this paper; 2) the complexity of feature extraction and noise removal: all the proposed methods keep the system fast and simple by using fast code for the DFT and DCT without the need for a preprocessing step; 3) the complexity of classification: the normalized Euclidean distance measure is used instead of complex classification algorithms. This paper is organized as follows: Section 2 describes the datasets used and the proposed methods, Section 3 discusses the experimental results, Section 4 compares the results with previous related works, and Section 5 presents the conclusions.

II. MATERIALS AND METHODS
The proposed EEG-based verification system, like the previously proposed identification system, consists of the following main stages:
• Feature extraction stage.
• Feature analysis and selection stage.
• Matching stage.
Different transform algorithms have been proposed in the literature to perform the mapping; in this paper the input EEG signal is mapped to the transform domain using the DCT, the DFT, and three different wavelet algorithms in order to extract the main discriminative features. The feature extraction stage aims to extract the most discriminative features from the transformed EEG signal. The task of the feature analysis and selection stage is to select the best combination of discriminative features.
In the matching stage, the normalized Euclidean distance measure is used to verify the claimed identity of the input pattern.

A. Dataset
Two public datasets are used in the conducted tests. The first is the Colorado State University dataset, a public dataset collected by Keirn and Aunon [10]. It is a small dataset consisting of the EEG recordings of seven healthy subjects. Each subject performed the following mental tasks: baseline task, letter composing task, mathematics task, rotation task, and counting task. Signals were recorded from the positions C3, C4, P3, P4, O1, and O2; see Fig. 1. Each EEG recording lasts 10 sec. at a sampling rate of 250 samples/sec [11]. This dataset contains an error that occurred for one of the subjects (the 4th) in the letter composing trials [12], [10].
The second EEG dataset is the Motor Movement/Imagery dataset, a relatively large dataset consisting of EEG recordings of 109 healthy volunteers; it is described in [13]. In this dataset the participants performed 14 trials of the following tasks: two baseline tasks (eyes open and eyes closed), Task1 (open and close the left/right fist), Task2 (imagine opening and closing the left/right fist), Task3 (open and close both fists and both feet), and Task4 (imagine opening and closing both fists and both feet). The dataset contains recordings of 64 channels placed according to the international 10-20 system of electrode placement, as shown in Fig. 1. The recording duration ranges from 1 minute to 2 minutes, except for subject 106, who performed Task3 for 36 sec. and 294 msec. in attempt 5; the EEG recordings were sampled at 160 Hz [13], [10].
Table I shows the number of samples for each subject class in the CSU and MMI datasets (note: subject 4 has 9 samples for the letter composing task because of the error mentioned above).

B. Features Sets
In this stage, two separate sets of features are used to generate the feature vectors and are tested for verification: energy-based features and statistical moments.
1) Energy of DFT and DCT spectra: The Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT) are used to map the input EEG signal from the time domain to the frequency domain. The DFT spectrum consists of sine and cosine components, while the DCT uses only cosine functions; it is a Fourier-related transform that uses only real numbers [15], [11]. The general DCT mapping equation is given by (1) and the general DFT mapping equation by (2), where C(u) and F(u) are the u-th coefficients of the DCT and DFT, respectively, and s() is the input EEG signal.
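For reference, the textbook forms of the two transforms can be written with the symbols defined above as follows. This is a sketch only: the scaling factor α(u) and the absence of a 1/N factor in the DFT are assumptions (conventions vary), so the authors' equations (1) and (2) may differ in normalization.

```latex
% Standard DCT-II of an N-sample signal s(i); alpha(u) is an assumed orthonormal scaling.
C(u) = \alpha(u) \sum_{i=0}^{N-1} s(i)\,
       \cos\!\left[\frac{(2i+1)\,u\,\pi}{2N}\right],
\qquad
\alpha(u) =
\begin{cases}
\sqrt{1/N}, & u = 0,\\
\sqrt{2/N}, & u > 0,
\end{cases}

% Standard (unnormalized) forward DFT of the same signal.
F(u) = \sum_{i=0}^{N-1} s(i)\, e^{-j 2\pi u i / N},
\qquad u = 0, \dots, N-1 .
```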
After the mapping step, the obtained AC coefficients (i.e., coefficients with u>0) are divided into a number of blocks (or bands), and the energy of each block is calculated using (3) [16], where T(i) represents the array of transform coefficients, F(u) or C(u); en(j) is the energy of the j-th block; L is the number of coefficients that belong to each block; j=0…P-1; and P=(N-1)//L is the total number of blocks, with // denoting integer division. The array en() is taken as the feature vector.
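The slicing of the AC coefficients into blocks can be illustrated with a minimal Python sketch. It assumes that the block energy is the sum of squared coefficient magnitudes, which may not match the exact form of (3); the function name `block_energy_features` and the naive matrix-based DCT are illustrative choices, not the authors' C# implementation.

```python
import numpy as np

def block_energy_features(signal, block_len, transform="dft"):
    """Return en(j), j = 0..P-1, the energy of each block of AC coefficients.

    Assumption: energy = sum of squared magnitudes within a block.
    """
    s = np.asarray(signal, dtype=float)
    n = s.size
    if transform == "dft":
        coeffs = np.fft.fft(s)
    elif transform == "dct":
        # Naive O(N^2) unnormalized DCT-II, kept NumPy-only for illustration;
        # a fast DCT routine would be preferable for long recordings.
        i = np.arange(n)
        u = i.reshape(-1, 1)
        coeffs = np.cos((2 * i + 1) * u * np.pi / (2 * n)) @ s
    else:
        raise ValueError("transform must be 'dft' or 'dct'")
    ac = coeffs[1:]                      # drop the DC coefficient (u = 0)
    p = (n - 1) // block_len             # P = (N-1)//L blocks
    blocks = ac[: p * block_len].reshape(p, block_len)
    return np.sum(np.abs(blocks) ** 2, axis=1)
```

For example, a 10 sec CSU recording at 250 samples/sec has N=2500 samples, so a hypothetical block length of L=100 would give P=2499//100=24 energy features.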

2) Statistical moments of discrete wavelet transforms:
The second set of features consists of the statistical moments of the Discrete Wavelet Transform (DWT) sub-bands. The wavelet transform computes the inner products of a signal with a family of wavelets to decompose the EEG signal into the scale-shift domain while keeping time-location information, unlike the DFT and DCT, which map the input signal to its constituent frequencies regardless of time information. The DWT uses two filters (a high-pass filter and a low-pass filter) [17], [11]. Three types of wavelet transform were proposed in the previous work [4]. The first is the Haar Wavelet Transform (HWT), the simplest wavelet type, which computes the sums and differences of the input signal; the low-pass and high-pass filters of the HWT are given by (4) and (5) [17], where i=0…N/2; N is the length of the input signal; L(i) is the i-th approximation coefficient; and h(i) is the i-th detail coefficient.
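The sum-and-difference structure of the Haar transform can be sketched in Python as follows. The 1/√2 scaling is an assumption (the orthonormal convention); the paper's equations (4) and (5) may use a different factor, and the names `haar_step` and `haar_decompose` are illustrative.

```python
import numpy as np

def haar_step(signal):
    """One Haar level: pairwise sums (approximation) and differences (detail)."""
    s = np.asarray(signal, dtype=float)
    if s.size % 2:                       # drop a trailing sample if length is odd
        s = s[:-1]
    even, odd = s[0::2], s[1::2]
    low = (even + odd) / np.sqrt(2.0)    # L(i): approximation coefficients
    high = (even - odd) / np.sqrt(2.0)   # h(i): detail coefficients
    return low, high

def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_levels, final approximation]."""
    subbands = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        subbands.append(detail)
    subbands.append(approx)
    return subbands
```

Each sub-band returned by `haar_decompose` would then be passed to the statistical moment computation.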
The third type of wavelet transform is the bi-orthogonal wavelet (Tap9/7), which transforms the input EEG signal by applying three consecutive phases: (i) a split phase, (ii) a lifting phase, and (iii) a scaling phase [20].
The lifting phase consists of four lifting steps and the scaling phase of two scaling steps; Table II shows the values of the coefficients {a, b, c, d, and k} used in these steps.
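The split, lifting, and scaling phases can be sketched in Python as below. The constants are the standard CDF 9/7 lifting values and are assumed to correspond to {a, b, c, d, k}; the values in Table II, the boundary handling (symmetric extension here), and the scaling convention may differ in the authors' implementation.

```python
import numpy as np

# Standard CDF 9/7 lifting constants (assumed to match a, b, c, d, k).
A, B, C, D = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522
K = 1.149604398

def _next(v):
    """v[i+1], with the last element repeated (symmetric extension)."""
    return np.concatenate((v[1:], v[-1:]))

def _prev(v):
    """v[i-1], with the first element repeated (symmetric extension)."""
    return np.concatenate((v[:1], v[:-1]))

def tap97_forward(signal):
    """One level of the 9/7 lifting transform: returns (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:                       # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    s, d = x[0::2].copy(), x[1::2].copy()   # split phase: even and odd samples
    d += A * (s + _next(s))              # lifting step 1 (predict)
    s += B * (_prev(d) + d)              # lifting step 2 (update)
    d += C * (s + _next(s))              # lifting step 3 (predict)
    s += D * (_prev(d) + d)              # lifting step 4 (update)
    return s * K, d / K                  # scaling phase (one common convention)
```

A quick sanity check of the sketch is that a constant input produces detail coefficients that are numerically zero.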
After the input signal is transformed using a wavelet transform, one of the following two sets of statistical moments is applied to the obtained sub-bands. In the 1st statistical moments set, S(i) is the i-th sample, k is the signal length, and S̄ is the mean of the samples. In the 2nd statistical moments set, the samples are replaced by their differences ΔS(i)=S(i)-S(i+1) for i=0,…,k-2, and the mean is the one given in (15) computed over ΔS(i) instead of S(i). The power n takes the values 0.5, 0.75, 1, 2, and 3.
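The two moment sets can be illustrated with a short Python sketch. It assumes that each moment is the mean of the absolute deviation from the mean raised to the power n; the absolute value is an assumption made so that the fractional powers 0.5 and 0.75 are well defined, and the paper's exact moment equations may differ. The function names are illustrative.

```python
import numpy as np

POWERS = (0.5, 0.75, 1, 2, 3)            # the powers n listed in the text

def moment_set_1(subband, powers=POWERS):
    """Assumed form: mean(|S(i) - mean(S)|**n) for each power n."""
    s = np.asarray(subband, dtype=float)
    deviation = np.abs(s - s.mean())
    return np.array([np.mean(deviation ** n) for n in powers])

def moment_set_2(subband, powers=POWERS):
    """Same moments, computed on the differences dS(i) = S(i) - S(i+1)."""
    s = np.asarray(subband, dtype=float)
    delta = s[:-1] - s[1:]
    return moment_set_1(delta, powers)
```

With five powers, each sub-band contributes five candidate features to the feature pool under either set.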

C. Features Analysis and Selection Stage
This stage reduces the size of the feature pool by selecting the most relevant and discriminative features, i.e., those with the lowest within-class distance and the highest between-class discrimination, and then combining the set of features that leads to the best verification accuracy [21], [22].

D. Matching Stage
The input pattern is matched against the template(s) of the subject class that the user claims to belong to in order to verify the user's identity. The normalized Euclidean distance measure given by (17) is used to calculate the distance between the input pattern and the class template(s) [23], and the distance is checked against a similarity threshold to accept or deny the claimed identity. In (17), Si={si(0), si(1),…, si(p-1)} is the feature vector of a sample that belongs to the i-th class, Tj={tj(0), tj(1),…, tj(p-1)} is the template feature vector of the j-th class, and σj={σj(0), σj(1),…, σj(p-1)} is the standard deviation vector of the j-th template.
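The matching decision can be sketched in Python as follows. The sketch assumes the common form of the normalized Euclidean distance, the square root of the sum of squared differences each divided by the template standard deviation; whether (17) includes the square root is not certain. Building the template as the per-feature mean of the enrollment samples, and the names `build_template`, `normalized_euclidean`, and `verify`, are also assumptions.

```python
import numpy as np

def build_template(samples):
    """Assumed enrollment: per-feature mean and standard deviation of a class."""
    x = np.asarray(samples, dtype=float)
    return x.mean(axis=0), x.std(axis=0)

def normalized_euclidean(sample, template, sigma, eps=1e-12):
    """sqrt(sum(((s(k) - t(k)) / sigma(k))**2)); eps guards against zero sigma."""
    s = np.asarray(sample, dtype=float)
    t = np.asarray(template, dtype=float)
    sd = np.maximum(np.asarray(sigma, dtype=float), eps)
    return float(np.sqrt(np.sum(((s - t) / sd) ** 2)))

def verify(sample, template, sigma, threshold):
    """Accept the claimed identity when the distance does not exceed the threshold."""
    return normalized_euclidean(sample, template, sigma) <= threshold
```

Only the template of the claimed class is consulted, which is what distinguishes this one-to-one check from the one-to-many matching used in identification.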

III. RESULT AND DISCUSSION
The accuracy of the verification system with each of the proposed feature extraction methods was tested on the two adopted public datasets. Each feature set is extracted from an EEG signal that belongs to a single task and a single channel. The best attained HTER was 0.26 for the CSU dataset and 0.16 for the MMI dataset. The results of the tests are described in detail in the following sections.

A. Verification Results
The Receiver Operating Characteristic (ROC) curve illustrates the performance of the verification system by plotting the False Rejection Rate (FRR), which is given by (19) and measures the proportion of incorrectly rejected genuine patterns, against the False Acceptance Rate (FAR), which is given by (18) and measures the proportion of incorrectly accepted imposter patterns, at various threshold settings. The intersection point between FRR and FAR is located, and the Half Total Error Rate (HTER) is calculated there using (20) to evaluate the performance of the system [8], [24]. The accuracy of the verification system is determined using (21), where P is the number of genuine patterns and N is the number of imposter patterns [16].
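The error measures can be illustrated with a minimal Python sketch that works on distance scores. It assumes the usual definitions: FRR is the fraction of genuine patterns rejected, FAR is the fraction of imposter patterns accepted, HTER=(FAR+FRR)/2, and accuracy is the fraction of correct decisions out of P+N. Rates are returned as fractions, and the function names and threshold search are illustrative.

```python
import numpy as np

def error_rates(genuine, imposter, threshold):
    """Return (FAR, FRR, HTER, accuracy) for distance scores at one threshold.

    A pattern is accepted when its distance is <= threshold.
    """
    g = np.asarray(genuine, dtype=float)     # P genuine distances
    im = np.asarray(imposter, dtype=float)   # N imposter distances
    frr = float(np.mean(g > threshold))      # genuine patterns wrongly rejected
    far = float(np.mean(im <= threshold))    # imposter patterns wrongly accepted
    hter = (far + frr) / 2.0
    correct = np.sum(g <= threshold) + np.sum(im > threshold)
    accuracy = float(correct) / (g.size + im.size)
    return far, frr, hter, accuracy

def best_threshold(genuine, imposter, thresholds):
    """Pick the candidate threshold where FAR and FRR are closest to each other."""
    gaps = []
    for t in thresholds:
        far, frr, _, _ = error_rates(genuine, imposter, t)
        gaps.append(abs(far - frr))
    return thresholds[int(np.argmin(gaps))]
```

Sweeping `thresholds` over a fine grid and plotting FRR against FAR gives the ROC curve, and the threshold returned by `best_threshold` approximates the intersection point at which the HTER is reported.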
1) Results for the energy of sliced DFT and DCT spectra: Table III shows the results of the verification system, which was proposed in [9] and uses the energy of sliced DFT spectra, when tested on the CSU dataset after some enhancements were made to the system, while Table IV shows the verification results when the system was tested on the MMI dataset. The best achieved HTER on the CSU dataset is 0.26 at threshold 16.6 for channel P4 in the Rotation task, while the best achieved HTER on the MMI dataset is 0.16 at threshold 26.1 for channel C2 in Task1. Figs. 2 and 3 show the ROC curves of the P4_Rot and C2_Task1 feature sets. Tables V and VI show the verification results of the system based on the energy of sliced DCT spectra. The best achieved HTER is 0.4 at threshold 20.2 for the feature set extracted from channel P4 and the Rotation task of the CSU dataset, while the best achieved HTER is 0.35 at threshold 25.2 for channel Cz in Task1 of the MMI dataset.

2) Statistical moments of wavelet sub-bands features results:
The following paragraphs present the results of the HWT, db4, and Tap9/7 features, which are based on the statistical moments of the sub-bands. The conducted tests show that the Haar and db4 wavelets perform worse than Tap9/7 and the features based on the energy of the DFT and DCT.
Tables VII and VIII show some of the tests conducted with the Haar wavelet transform using the 2nd set of statistical moments on the CSU and MMI datasets, respectively.

B. Processing Time Parameter
In this section, the processing time of the introduced recognition system is presented. Table XIII shows the average processing time (in milliseconds) of the proposed methods when applied to the CSU dataset, and Table XIV shows the average processing time when the methods are applied to the MMI dataset. The recording time for the CSU dataset is 10 sec at a sampling rate of 250 Hz, and the recording time taken for the MMI dataset is 1 minute at a sampling rate of 160 Hz; the reported matching time is for one-to-many comparisons. The computer used in the tests has an Intel® Core™ i5-2450M CPU with 4 GB of RAM, the operating system is Windows 7 (64-bit), and the development programming language is Microsoft Visual C#.

IV. COMPARISON WITH RECENTLY RELATED WORKS
Some of the published works on EEG-based verification systems have achieved good results; some reached 100% on the CSU dataset, but many of them used more than one channel or task for verification. Table XV shows that the results attained in this paper are competitive with the results of other published works on the CSU and Motor Movement/Imagery datasets, taking into account that all the methods proposed in this article have low computational complexity and require very little execution time because the system uses a single channel, a single task, and fast algorithms. The published works do not clearly report the elapsed processing time, so a comparison of processing time with them is not possible.

V. CONCLUSION AND FUTURE WORK
In this paper the proposed feature extraction methods were tested for verification and compared with one another. With each proposed method the system was fast and simple and achieved encouraging results. The conducted tests showed that the best achieved HTER for the DFT feature set is 0.26 on the CSU dataset and 0.16 on the MMI dataset. The DFT, DCT, and Tap9/7 methods performed better than the Haar and Daubechies (db4) wavelet transform methods, but the wavelet transform methods showed lower complexity and processing time than the DFT and DCT.

Fig. 2. ROC curves showing the intersection of FRR and FAR at the optimal threshold for the feature set (P4-Rot).

Fig. 3. ROC curves showing the intersection of FRR and FAR at the optimal threshold for the feature set (C2-Task1).

TABLE I.
THE NUMBER OF SAMPLES FOR EACH CLASS IN CSU AND MMI DATASETS

TABLE III.
FRR, FAR, ACCURACY, AND HTER OF THE ENERGY OF SLICED DFT SPECTRA FEATURES, CSU DATASET

TABLE V.
FRR, FAR, ACCURACY, AND HTER OF THE ENERGY OF SLICED DCT SPECTRA FEATURES, CSU DATASET

TABLE VII.
FRR, FAR, ACCURACY, AND HTER OF THE STATISTICAL MOMENTS FOR 2ND SET OF HWT FEATURES, CSU DATASET

TABLE VIII.
FRR, FAR, ACCURACY, AND HTER OF THE STATISTICAL MOMENTS FOR 2ND SET OF HWT FEATURES, MMI DATASET

Table IX shows the results of some of the tests conducted with db4 using the 2nd set of statistical moments on the CSU dataset. The best results of the verification system based on the statistical moments of Tap9/7 sub-bands are shown for the CSU dataset using the 2nd set of statistical moments in Table XI, and for the MMI dataset using the 1st set in Table XII.

TABLE IX.
FRR, FAR, ACCURACY, AND HTER OF THE STATISTICAL MOMENTS 2ND SET OF DB4 FEATURES, CSU DATASET

TABLE X.
FRR, FAR, ACCURACY, AND HTER OF THE STATISTICAL MOMENTS 1ST SET OF DB4 FEATURES, MMI DATASET

TABLE XII.
FRR, FAR, ACCURACY, AND HTER OF THE STATISTICAL MOMENTS 1ST SET OF TAP9/7 FEATURES FOR MMI DATASET

TABLE XIII.
THE AVERAGE PROCESSING TIME RESULTS (IN MSEC) FOR CSU

TABLE XIV.
THE AVERAGE PROCESSING TIME RESULTS (IN MSEC) FOR MMI

TABLE XV.
COMPARISONS WITH OTHER PUBLISHED WORKS ON CSU DATASET AND MMI DATASET BASED ON NUMBER OF SUBJECTS, NUMBER OF USED CHANNELS AND TASKS.