Task Sensitivity in Continuous Electroencephalogram Person Authentication

This research investigates task sensitivity in multimodal stimulation tasks for continuous person authentication using electroencephalogram (EEG) signals. Pattern analysis learns from historical examples in order to predict unseen data. However, data trials in EEG stimulation contain inseparable cognitive information, so it is difficult to ensure that the testing trials contain cognitive information matching that of the training data. Since EEG signals are unique across individuals, we assume that multimodal stimulation in EEG analysis is not sensitive to train-test trial control: trials that are inconsistent between training and testing can still be used as biometrics to authenticate a person. The EEG signals were collected using the 10-20 system from 20 healthy subjects. During data acquisition, subjects were asked to operate a computer and perform various computer-related tasks (e.g., mouse clicking, mouse scrolling, keyboard typing, browsing, reading, video watching, music listening, and playing computer games) according to their preferences, without interruption. Features extracted from Welch's estimated Power Spectral Density in different frequency bands were tested. The designed authentication approach computed intra- and inter-personal variability using the Mahalanobis distance to authenticate subjects. The proposed EEG continuous authentication approach succeeded: data collected from multimodal stimuli, regardless of task sensitivity, were able to authenticate subjects, with the highest verification performance in the low-Beta frequency band. An effective frequency region in the middle band was anticipated because the data collected were based on the subjects' voluntary actions. Future research will focus on the effect of voluntary and involuntary actions on the effective frequency region.

Keywords: Electroencephalogram; continuous authentication; task sensitivity; multimodal stimuli; Mahalanobis distance


I. INTRODUCTION
Conventional biometrics use human physiological traits such as the fingerprint, iris, and face for authentication. However, one limitation of these physiological traits is that they are prone to forgery, because they involve exposed body parts that are easy to obtain and replicate. Besides, systems commonly available in the market are prone to security failures such as invasion by hackers or misconduct of authorized personnel. Therefore, alternative biometrics are required to meet the increasing need for information technology security. Human brainwaves measured by electroencephalogram (EEG) have been shown to fulfill the biometric requirements (universality, uniqueness, constancy, collectability) [1]. They have also been shown to be unique across individuals and usable for person authentication [2]. EEG is non-stationary or quasi-stationary over time, which raises the problem of template permanence that is often discussed by the research community [3]. However, the same characteristic makes spoof resistance, liveness detection, and cancellability possible [4], which is beneficial for person authentication.
Like most security implementations, common brainwave-based biometric authentication performs one-time authentication only. The major drawback of static authentication (SA) is that the system is no longer aware of who the user is once access has been granted to the client. This gives an intruder the chance to mount a spoofing attack. In such cases, continuous authentication (CA) is believed to provide security awareness against imposters. CA repeats the verification process over time while the user is still logged on to the system. Whoever is using the system is monitored for the complete session to ensure that authority is given only to the correct person. Since CA requires data to be obtained repeatedly, it is more practical if enough user behavioral data can be acquired passively, without the user being conscious of it [5]. The excellent time resolution and continuous nature of EEG signals enable precise detection of brain activity. This opens the possibility of real-time continuous person authentication that can detect an imposter and respond within seconds.
EEG signal analysis works for person authentication because the cognitive response differs between persons when they respond to similar cognitive tasks (e.g., resting, visual stimulus, mental imagery). Experiments involving multimodal stimuli in an unconscious set-up have been introduced in several studies to improve the practicality of brainwave continuous authentication [6], [7]. However, these experiments use a consistent EEG stimulus set throughout the whole recording session, so the same set of EEG cognitive tasks can occur in every data trial of both the training and testing sets. A non-specified set of multimodal stimuli is therefore expected to improve the robustness of the experiment for person authentication, because the cognitive tasks become indistinguishable and differ between data trials once the EEG signals are segmented. It also makes it possible for testing trials to contain EEG tasks that were not experienced in the training set.
Thus, this paper proposes a flexible EEG recording approach that caters to task sensitivity for multimodal stimuli. The proposed approach should also enable unconscious data collection, which is suitable for continuous authentication. Therefore, EEG signals are acquired from users performing random tasks by themselves without limitations. We hypothesize that data trials consisting of inseparable cognitive information, possibly mismatched between training and testing, can still be used as biometrics to authenticate a person. The questions addressed in this work are: (1) Can multimodal EEG stimuli authenticate a person regardless of task sensitivity? (2) What is the effective frequency region in the Power Spectral Density (PSD) for multimodal EEG stimuli regardless of task sensitivity?
This paper is structured as follows: Section II gives an overview of EEG-based biometrics, including EEG characteristics, the process flow of EEG authentication, EEG protocols, and related works. Section III presents the proposed solution, from the experiment paradigm design through to the performance evaluation. Section IV presents the results and discussion, and Section V draws conclusions and suggests directions for future work.

A. EEG Characteristics
EEG measures the spontaneous electrical changes inside the brain, obtained by placing sensors along the human scalp. The acquired raw brainwaves are often plotted in an amplitude-time graph that shows the voltage fluctuation over a period of time in a specified brain region. Generally, the informative brain activities lie in several frequency bands, categorized as follows:
1) Delta, δ waves (0.5-4 Hz). Waves with the highest amplitude and slowest activity; they appear in deep sleep and unconscious states.
3) Alpha, α waves (8-13 Hz). They appear during relaxation or dreaming and disappear while the person is thinking or alert.

B. Process Flow of EEG Authentication
Brain biometrics has two types of applications: authentication and identification [8]. The decision mechanism for authentication (often called verification) involves one-to-one matching only, where the result is either to accept or to reject the user. Identification, in contrast, compares one-to-many options in the database, and its output is the identified label of the subject. In both cases, several steps must be completed for brainwave recognition, as depicted in Fig. 1: EEG signal collection, signal pre-processing, feature extraction, template matching/classification, and the classified output.

C. EEG Protocols
Typically, EEG signal acquisition protocols can be grouped into three (3) categories based on the reviews in [8]-[10]: resting-state/relaxation, event stimulation, and mental imagery. The resting-state protocol requires the subject to sit and relax with eyes open or eyes closed; the EEG signal is acquired during this quiescent state. In the event stimulation protocol, cognitive stimuli in different forms (e.g., visual, audio, somatosensory) are presented to the subject, because the potential evoked by the presented stimulus can differentiate individuals. In the mental imagery protocol, brain signals are captured while the subject performs a certain mental task (e.g., imagining hand movement, rotation, solving an arithmetic problem). Overall, the EEG of a person is recorded from their non-volitional or volitional responses in an engaging session, and this time frame is selected for further analysis to authenticate individuals. However, we would like to address several problems with the existing EEG collection protocols from the CA point of view.
First, although the relaxation protocol can acquire prolonged continuous EEG signals, it is impossible to expect people to keep resting all the time. In real life, the human physical and mental state tends to be active, and uncertainty may arise from unknown experiences that cannot be measured in resting EEG data. Second, cognitive recording typically involves only a single task. Most EEG authentication schemes are based on single-task training and testing, in which the template is generated from a single, distinguished type of brain task and later tested using the same brain task (e.g., training and verification using a left-hand motor imagery task only); in real life, however, we cannot expect human mental activity to always be in a regular state. Multiple-task studies received attention later, where EEG recordings from different combinations of brain tasks were used for training and testing (verification). The different designs of the EEG experiment are shown in Table I.
Studies found that the fusion of different EEG tasks in training/testing provides significant outcomes compared with evaluating them individually [11]; an extensive review of multi-task studies for EEG subject identification can be found in [12]. As for subject authentication, the study in [13] conducted several experiments to evaluate the performance when one type of task is used for training and another task for testing. The results show that performance is maintained when the training and testing tasks are mismatched, compared with using the same task. System performance also improves if more tasks are involved in training before testing with another task. This gives confidence that the design of the EEG data collection protocol can be flexible. However, these claims apply only to mismatched training/testing between motor or imagery tasks. The author also tried to include resting tasks in the test set, but the performance obtained was very poor, which highlights the first problem addressed previously in this section.

TABLE I. DESIGNS OF THE EEG EXPERIMENT
Training | Testing | Studies
… | Other Tasks | [13], [15]
Task A and Task B | Task A or Task B | [13], [16], [17]
Task A and Task B | Task C | [13]
Task A and Task B | Task A and Task B | [16], [11]
Task A+B (multimodal) | Task A+B | [6], [7]
Task A+B (multimodal) | … | …

Next, the third problem to be addressed is that the conscious response of the user is required during data collection in the event stimulation and mental imagery protocols. For example, in the visual stimulation protocol, images are presented to the subject to register the Event-Related Potential (ERP) as their template, and the relevant image needs to be presented again during verification. However, it is less suitable to make users aware of CA by continually displaying images to them; practical CA should allow passive verification, as mentioned in [5].

D. Related Works
Common EEG experiments record the user's brainwaves while the user perceives a unimodal stimulus only, in which a single mode of EEG stimulus is presented at a time [18]. However, multi-sensory cognitive processing happens more often in real-world scenarios. Brainwaves for EEG authentication can also be recorded without any controlled stimulation of the user in order to obtain continuous signals. Attentive tasks such as driving and operating a computer involve multimodal stimuli, where humans are exposed simultaneously to more than one type of stimulus from different sensory fields (e.g., visual, auditory, spatial, tactile). The study in [6] recorded continuous EEG from the Fp1 electrode only in a simulated driving environment and achieved a best EER of 27%; each recording lasted three (3) minutes per trial and was collected twice a day for five (5) days from thirty (30) subjects. Apart from EEG, the study in [7] recorded continuous brain signals with near-infrared spectroscopy (NIRS) for 60 s while the user was performing typing tasks. Only a single probe placed on the subject's forehead was used to minimize interruption, and the study obtained an EER of 0.40%.

A. Experiment Paradigm Design and Data Collection
A total of 20 students of Universiti Teknikal Malaysia Melaka (UTeM), comprising 10 males and 10 females aged between 20 and 29 (mean age 23.78 ± 1.93 years, standard deviation), participated voluntarily in this experiment. All of them were healthy, right-handed adults with normal or corrected-to-normal vision. Procedures were approved by the Ministry of Health Malaysia under the National Medical Research Register (NMRR-19-2372-50333). Before participating, participants signed a printed consent form after being briefed on the overall purpose of the research study and the experimental procedure. Fig. 3(a) illustrates the experimental set-up for the proposed EEG recording approach, and the arrangement in the actual scenario is shown in Fig. 3(b). The subject was first asked to wear a wireless EEG head cap and sit in front of a computer with a 15.6-inch screen at a distance of approximately 45 centimeters. The brainwave signals were measured with 20 dry electrodes sampled at 500 Hz. All channels were positioned following the international 10-20 placement system: P7, P4, Cz, Pz, P3, P8, O1, O2, T8, F8, C4, F4, Fp2, Fz, C3, F3, Fp1, T7, F7, and Oz. A reference electrode was placed on the subject's left or right earlobe (A1 or A2), as illustrated in Fig. 2. The EEG device used was the Neuroelectrics Enobio 20, a wireless and portable headset that transmits data via Bluetooth. The distance between the headset and the Bluetooth dongle was approximately 100 centimeters.
To authenticate users unconsciously, the experiment design should allow transparent monitoring. For EEG data collection, the subject was asked to operate the computer and perform any computer tasks according to their preferences. Examples of computer tasks include mouse scrolling, mouse clicking, keyboard typing, browsing (reading), video watching, music listening, playing computer games, and any other computer-related tasks. To ensure practicality, two (2) conditions were allowed. First, there was no restriction on the number of types of computer tasks per recording: subjects were free to perform any computer tasks at any time according to their preferences. Second, there was no restriction on the number of computer tasks performed at one time: subjects were free to perform different computer tasks concurrently (e.g., listening to music while reading). However, subjects were told to minimize body movement, avoiding excessive hand, head, body, and face movement, to reduce the noise captured in the recording.
While the device was placed on the subject's scalp, the obtained EEG signals were monitored in the accompanying NIC v1.4 software. A color indicator was provided for every electrode to show whether the signal quality was good (green), moderate (orange), or bad (red). The indicator is not an impedance check but a guide to the line noise level, main noise level, electrode drift, and offset [19]. Thus, it is not necessary to stop the data collection if an indicator turns red. However, to reduce the capture of noise, EEG signal collection began only once the indicators for all electrodes had been green for at least five (5) seconds. The experiment was run once for each subject, and the total duration of the task recording was ten (10) minutes. The experiment was done in a quiet, enclosed room dedicated to the EEG experiment.

B. Signal Pre-Processing
In this study, MNE v0.17.1, an open-source API, was used for data preparation and analysis [20], [21]. This tool is widely used for EEG and MEG data analysis in Python. The block diagram of the proposed EEG-based biometric system for continuous authentication is depicted in Fig. 4, and all the processes are discussed in the remainder of this section. Each of the S = 20 subjects in this experiment was given a numerical label to differentiate between subjects. To analyze only quality signals, the first one (1) minute of data was removed and only the eight (8) minutes in the middle of the ten (10) minute recording were used. Thus, the time-series EEG signal from one electrode contributes 240,000 data points (8 min × 60 s × 500 Hz). Next, the data were segmented into equal 10 s epochs without overlap [7], for each subject and each electrode. Therefore, each user has a total of 2k = 48 data trials labeled with the respective subject, where all electrodes are concatenated along the feature dimension. The data were split 50:50 with the trials kept in chronological order without shuffling: the first half (k = 24 trials) was used for template generation (training) and the second half for performance verification (testing).
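The segmentation and split described above can be sketched with plain NumPy. This is a minimal illustration and not the MNE-based code used in the study: the function names `segment_recording` and `chronological_split` are hypothetical, and the recording is synthetic random data standing in for one subject.

```python
import numpy as np

FS = 500          # sampling rate in Hz (from the text)
EPOCH_S = 10      # epoch length in seconds (from the text)
N_CHANNELS = 20   # number of dry electrodes (from the text)


def segment_recording(recording, fs=FS, epoch_s=EPOCH_S, skip_s=60, keep_s=480):
    """Cut a (n_channels, n_samples) recording into non-overlapping epochs.

    Drops the first `skip_s` seconds, keeps the next `keep_s` seconds, and
    returns an array of shape (n_trials, n_channels, fs * epoch_s).
    """
    start = skip_s * fs
    kept = recording[:, start:start + keep_s * fs]      # middle 8 minutes
    samples_per_epoch = fs * epoch_s
    n_trials = kept.shape[1] // samples_per_epoch
    kept = kept[:, :n_trials * samples_per_epoch]
    # (channels, trials, samples) -> (trials, channels, samples)
    return kept.reshape(kept.shape[0], n_trials, samples_per_epoch).swapaxes(0, 1)


def chronological_split(trials):
    """First half for training, second half for testing, without shuffling."""
    half = trials.shape[0] // 2
    return trials[:half], trials[half:]


# Synthetic stand-in for one subject's 10-minute recording (hypothetical data).
rng = np.random.default_rng(0)
recording = rng.standard_normal((N_CHANNELS, 10 * 60 * FS))

trials = segment_recording(recording)
train, test = chronological_split(trials)
print(trials.shape, train.shape, test.shape)
```

With these settings each subject yields 48 trials of 5,000 samples per electrode, 24 for training and 24 for testing, which matches the counts given in the text.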

C. Features Extraction
Features in the frequency domain can capture the dominant brain activity in a specified frequency range. Power Spectral Density (PSD), which measures the signal power in a given frequency band, has been used extensively to extract features in EEG biometrics analysis. PSD is fast to compute and suitable for processing continuous EEG data from simple sources that possibly contain more artifacts [22]; this suits our setting because the dry electrodes used in the experiment have lower signal quality than wet or gel-based electrodes. Thus, we employ PSD as the feature extraction method and consider the Delta δ (0.5-4 Hz), Theta θ (4-8 Hz), Alpha α (8-13 Hz), low-Beta β (13-20 Hz), high-Beta β (20-30 Hz), and Gamma γ (30-50 Hz) bands. Although there is still no agreement on a standard reference for the exact limits of each frequency band, the cut-off between alpha and beta at 13 Hz is based on our preliminary experiment, and the gamma band stopping at 50 Hz is based on [15]. These bands were selected because this range consists of dominant brain activities that can recognize a person regardless of brain task [15], [16]. The frequency bands are later tested both combined and separately to identify the effective region for better efficiency [14].
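The band limits above, and the idea of testing bands separately or combined, can be sketched as a small lookup. The dictionary keys, the helper name `band_mask`, and the half-open interval convention (lower edge included, upper edge excluded) are assumptions made for illustration; the paper does not state how bins on a band boundary were assigned.

```python
import numpy as np

# Band edges in Hz as listed in the text; the key names are illustrative.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "low_beta": (13.0, 20.0),
    "high_beta": (20.0, 30.0),
    "gamma": (30.0, 50.0),
}


def band_mask(freqs, names):
    """Boolean mask over `freqs` covering the union of the named bands."""
    mask = np.zeros(freqs.shape, dtype=bool)
    for name in names:
        lo, hi = BANDS[name]
        # Assumption: half-open interval [lo, hi) so adjacent bands do not overlap.
        mask |= (freqs >= lo) & (freqs < hi)
    return mask


# 1 Hz frequency bins from 0 Hz to the Nyquist frequency (250 Hz at 500 Hz sampling).
freqs = np.arange(0.0, 251.0, 1.0)

low_beta = band_mask(freqs, ["low_beta"])                  # a separate band
alpha_low_beta = band_mask(freqs, ["alpha", "low_beta"])   # a combined band
print(freqs[low_beta])
print(int(alpha_low_beta.sum()))
```

A combined specification such as α+low-β is then just the union of the individual masks, so the same feature extractor can be reused for every band combination.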
Thus, the power spectra of the processed EEG signals for each trial, each electrode, and each subject were estimated using Welch's method with a Fast Fourier Transform (FFT) length of 500 (set to the EEG device sampling rate, which yields a resolution of 1 Hz per power spectral frequency bin) and no overlap between segments. Next, the logarithm of the power spectra was computed.
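A minimal sketch of this feature extraction step, using `scipy.signal.welch` rather than the MNE routine used in the study. The window (SciPy's default Hann), the base-10 logarithm, and the function name `log_psd_features` are assumptions, since the text specifies only the FFT length and the absence of overlap.

```python
import numpy as np
from scipy.signal import welch

FS = 500  # sampling rate in Hz (from the text)


def log_psd_features(trials, fmin, fmax, fs=FS):
    """Log-PSD features for trials of shape (n_trials, n_channels, n_samples).

    Returns (features, band_freqs), where features has shape
    (n_trials, n_channels * n_bins) with all electrodes concatenated.
    """
    # nperseg = fs gives 1 Hz bins; noverlap = 0 means segments do not overlap.
    freqs, psd = welch(trials, fs=fs, nperseg=fs, noverlap=0, axis=-1)
    keep = (freqs >= fmin) & (freqs < fmax)   # assumption: half-open band
    log_psd = np.log10(psd[..., keep])        # assumption: base-10 logarithm
    return log_psd.reshape(trials.shape[0], -1), freqs[keep]


# Synthetic stand-in for four 10 s trials from 20 electrodes (hypothetical data).
rng = np.random.default_rng(0)
trials = rng.standard_normal((4, 20, 5000))

features, band_freqs = log_psd_features(trials, 13.0, 20.0)  # low-Beta band
print(features.shape)
print(band_freqs)
```

For the low-Beta band this gives 7 bins per electrode, so each trial becomes a 140-dimensional feature vector once the 20 electrodes are concatenated.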

D. Authentication Approach
Mahalanobis distance, introduced in [23], is a simple multivariate metric that measures the distance between a point and a distribution. It has been shown to be efficient for continuous authentication with brain signals in [7]. The Mahalanobis distance is defined as

$$D = \sqrt{(x - \mu)^{T} \, \Sigma^{-1} \, (x - \mu)}$$

where $D$ is the Mahalanobis distance, $x$ is the vector of observations (test data trial), and $\mu$ and $\Sigma^{-1}$ are the mean vector and the inverse covariance matrix of the claimed subject, respectively. For each registered subject $j$, the mean vector $\mu_j$ and inverse covariance matrix $\Sigma_j^{-1}$ were first computed from the training set. This information is stored in memory and used to compute the distance to the testing trials. Note that the covariance matrix must be positive definite, so that its inverse exists and the quadratic form under the square root is non-negative. To authenticate a person, the distance $D$ from a test trial $x$ of subject $i$ to the cluster of the $j$-th subject is computed. This procedure is iterated over every testing trial and every subject. The calculated distance indicates individual variability, expressed as the intra-individual distance (when $i = j$) and the inter-individual distance (when $i \neq j$). A person is authenticated if $D \le \tau$, where $\tau$ is a pre-specified threshold.
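The template generation and distance check can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the data are synthetic, and the small ridge term added to the covariance is an assumption introduced here to keep the matrix positive definite when there are fewer training trials than features (the paper does not say how this case was handled).

```python
import numpy as np


def fit_template(train_features, ridge=1e-3):
    """Mean vector and inverse covariance of one subject's training trials.

    train_features has shape (n_trials, n_features).
    """
    mu = train_features.mean(axis=0)
    cov = np.cov(train_features, rowvar=False)
    # Assumption: ridge regularization keeps the covariance positive definite.
    cov = cov + ridge * np.eye(cov.shape[0])
    return mu, np.linalg.inv(cov)


def mahalanobis(x, mu, cov_inv):
    """Mahalanobis distance from trial x to the cluster (mu, cov_inv)."""
    diff = x - mu
    return float(np.sqrt(diff @ cov_inv @ diff))


def authenticate(x, template, threshold):
    """Accept the claim when the distance does not exceed the threshold."""
    mu, cov_inv = template
    return mahalanobis(x, mu, cov_inv) <= threshold


# Synthetic 8-dimensional features (hypothetical data, not EEG).
rng = np.random.default_rng(0)
client_train = rng.normal(0.0, 1.0, size=(24, 8))   # 24 training trials
client_test = rng.normal(0.0, 1.0, size=8)          # same subject
imposter_test = rng.normal(10.0, 1.0, size=8)       # different subject

template = fit_template(client_train)
d_client = mahalanobis(client_test, *template)      # intra-individual distance
d_imposter = mahalanobis(imposter_test, *template)  # inter-individual distance
threshold = (d_client + d_imposter) / 2.0           # illustrative threshold only
print(authenticate(client_test, template, threshold))
print(authenticate(imposter_test, template, threshold))
```

In practice the threshold would be chosen from the distribution of intra- and inter-individual distances rather than from a single pair of trials as in this toy example.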

E. Performance Evaluation
To assess the performance of the proposed EEG authentication scheme, we employed several widely used evaluation metrics: the Equal Error Rate (EER), False Acceptance Rate (FAR), and False Rejection Rate (FRR). Since authentication is a binary-class problem (e.g., true/false or accept/reject), it produces two (2) types of error: FAR, when an imposter is accepted (false class classified as true), and FRR, when a client is rejected (true class classified as false). The EER is the point at which FAR = FRR in the threshold frequency distribution graph; the threshold at the EER is often taken as the optimal decision threshold for rejecting imposters.
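The three metrics can be illustrated with a simple threshold sweep over distance scores. The function names and the toy distances are hypothetical, and picking the threshold where |FAR - FRR| is smallest is one common way to approximate the EER on a finite sample, not necessarily the procedure used in the study.

```python
import numpy as np


def far_frr(genuine, imposter, threshold):
    """FAR and FRR when a claim is accepted for distance <= threshold."""
    far = float(np.mean(imposter <= threshold))  # imposters wrongly accepted
    frr = float(np.mean(genuine > threshold))    # clients wrongly rejected
    return far, frr


def equal_error_rate(genuine, imposter):
    """Sweep the observed distances as thresholds; return (eer, threshold)."""
    best = None
    for t in np.sort(np.concatenate([genuine, imposter])):
        far, frr = far_frr(genuine, imposter, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, float(t))
    return best[1], best[2]


# Toy intra-individual (genuine) and inter-individual (imposter) distances.
genuine = np.array([1.0, 1.5, 2.0, 2.5, 6.0])
imposter = np.array([2.2, 5.0, 5.5, 7.0, 8.0])

eer, threshold = equal_error_rate(genuine, imposter)
print(eer, threshold)
```

For this toy data the sweep finds FAR = FRR = 0.2 at a threshold of 2.5: one of five imposter distances falls below the threshold and one of five genuine distances falls above it.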

IV. RESULTS AND DISCUSSION
The calculated intra- and inter-individual distances are denoted by blue and red indicators, respectively, in Fig. 5. The blue cluster is the collection of intra-individual (self-to-self) distances, which should be treated as clients, as opposed to the red cluster, which consists of inter-individual (self-to-others) distances that represent imposters. Sliding a threshold along the horizontal axis in Fig. 5 yields the FAR and FRR. Combining the results at each threshold produces the curves illustrated in Fig. 6, where the EER is the intersection point of the two (2) curves. Fig. 7 shows the EER results for the different frequency band specifications. Ten (10) different combinations of frequency regions were tested, separately and combined, to identify the effective band. The results reveal the band selection for multimodal EEG stimuli regardless of task sensitivity. A lower EER value indicates better authentication performance because the false classification rate is lower. The results show that the proposed CA approach is effective. Overall, each frequency band specification has a different verification performance. The combined frequency regions from the literature (α+low-β [6], δ+θ+α+β [16], θ+α+β+γ [15]) can authenticate a person, but the results show that there are separate regions that authenticate individuals more effectively.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 2, 2020, p. 557, www.ijacsa.thesai.org
The authentication performance ranks, from best to worst, as low-β, high-β, β, α+low-β, α, γ, δ+θ+α+β, θ+α+β+γ, θ, and δ. Generally, features in the β region provide good verification results. Specifically, it is clear from Fig. 7 that the low-β band achieves the best verification performance and authenticates subjects well for random multimodal EEG tasks regardless of task sensitivity.
This result is similar to [6], where a simulated driving scenario was the EEG task. That study compared results between α, low-β, and α+low-β, and features in the low-β band provided the best authentication performance. In this study, the best verification performance achieved is shown in Fig. 6. Besides, when we look only at the separate frequency sub-bands (δ, θ, α, β, and γ), the verification results improve from the lower to the higher frequency regions, except for γ. The rising trend can be associated with the brain activity in the relevant frequency regions as discussed in Section II.A, where the informative frequency band moves to the higher region as the human mental state changes from deepest relaxation to highly attentive. The computer operating tasks in this study require attention, thinking, decision making, cognitive response, and simple motor movement. We therefore expected the γ band to rank higher, but the evidence shows that its performance ranked after the β and α bands. This can nevertheless be explained: the tasks performed during data collection were based on user preferences (voluntary actions), so the subjects were comfortable and relaxed while engaging in the experiment, rather than being in a highly stressful and unfamiliar situation that would demand greater attention. Another reason why higher frequency bands authenticate subjects better is that the EEG multimodal stimuli involved attentive tasks. Thus, the lower bands (δ and θ) do not give better results than the higher regions (α, β, and γ).

V. CONCLUSION
This study was motivated by the need for a flexible EEG recording approach for continuous person authentication that caters to task sensitivity for multimodal stimuli. During the data collection experiment, EEG signals were recorded while the subject operated a computer and performed random computer tasks such as mouse scrolling, mouse clicking, keyboard typing, browsing (reading), video watching, music listening, playing computer games, and any other computer-related tasks, based on user preferences. The continuous EEG signals obtained, containing inseparable and mismatched cognitive tasks in the data trials during training and testing, were able to authenticate persons successfully. We therefore suggest that the low-β band has better separation ability than the other frequency band specifications, achieving the lowest EER of 7.29% regardless of task sensitivity for multimodal EEG tasks.
Based on the results, frequencies in the middle region were more effective than those in the lower and higher regions. This is because the multimodal stimulus task requires the subject's attention. This evidence was also anticipated because the tasks performed during data collection were based on the subjects' voluntary actions: with less stress and a more pleasant experience, the effective frequency region lies in the middle part rather than in the higher region. Thus, future research may investigate the effect of voluntary and involuntary actions on the effective frequency region.