Classification of Common and Uncommon Tones by P300 Feature Extraction and Identification of Accurate P300 Wave by Machine Learning Algorithms

Abstract: An event-related potential (ERP) is a measure of the brain's response to a specific sensory, cognitive, or motor event. One common ERP technique used in cognition research is the oddball paradigm, in which the brain's responses to common and uncommon stimuli are compared. The neurologic response to the oddball paradigm produces a P300 ERP, which is one of the major visual/auditory sensory ERP components. The purpose of this study is to classify ERP responses to common and uncommon tones by extracting the P300 feature from ERP epochs and to identify the accurate shape of the P300 wave. An OpenBCI system is used to record the ERP data. P300 features are extracted using EEGLAB, a MATLAB toolbox. Finally, several machine learning models are used to identify the accurate shape of a P300 wave and then to classify common and uncommon auditory tones. For stimuli classification, all of the algorithms evaluated performed efficiently and built a consistent model with 93.75% to 99.1% evaluation accuracy. For P300 shape detection, the NN model showed the best performance, with 94.95% accuracy. These findings have the potential to add useful machine learning-based methods to the clinical application of ERPs.


I. INTRODUCTION
Electroencephalography (EEG) is a non-invasive monitoring method that tracks and records the neural activity of the brain. The time-locked activity in EEG is known as an Event-Related Potential (ERP) [1]. ERP research has provided significant insights into many neurologic functions, including cognition and affect, and into clinical conditions such as schizophrenia [2]. ERP analysis can help to identify sleep disorders and changes in behavior, to diagnose and monitor seizure disorders, and has even been used to evaluate brain activity after a severe head injury or before heart or liver transplant surgery [3]. A much-studied ERP component is the P300, which is formed as a component of recognition when the brain responds to a series of stimuli that includes a common (frequent) stimulus and an uncommon (infrequent) stimulus. The P300 ERP is characterized by a large positive peak occurring approximately 300 ms to 600 ms after stimulus onset and is found prominently over the parietal region [4]. In addition, one of the major applications of ERP technology uses the P300 wave to implement a Brain-Computer Interface (BCI) that can assist incapacitated people by offering various ways of communicating with the external world. For example, the P300 has been used to communicate with devices, send mobile messages, play games, and more, as described in [5,6]. In our work, we used auditory stimuli, which are also suitable for individuals who cannot receive or react to visual stimuli.
In this study, a passive paradigm was used to elicit the P300. Here, subjects only concentrate on the target stimuli, without making responses [7], and have to ignore the common stimuli. Two audio tones, at 1000 Hz and 2000 Hz, were designed as the common and uncommon stimuli, respectively. The duration of each stimulus was 180 ms, and the interval between two consecutive tones was 3500 ms (Figure 1). In this study, we detect the target and non-target ERPs with the oddball paradigm and extract the features of the P300 (P3) component. For uncommon stimuli, the ERP peak is higher than for common stimuli (Figure 2). The features are power (P), energy (E), mean of the amplitude, wavelength, and the number of events. Using these features, we classify the auditory stimuli with machine learning techniques.
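The tone schedule described above can be sketched as follows. This is only an illustration, not the firmware used in the study: the function name, the target probability of 0.2, the random seed, and the reading of the 3500 ms interval as the gap between the end of one tone and the start of the next are all assumptions.

```python
import random

TONE_MS = 180        # tone duration, from the text
GAP_MS = 3500        # interval between consecutive tones, from the text
COMMON_HZ = 1000     # frequent (non-target) tone
UNCOMMON_HZ = 2000   # infrequent (target) tone


def oddball_schedule(n_tones, p_uncommon=0.2, seed=0):
    """Return a list of (onset_ms, frequency_hz) pairs for one session.

    p_uncommon and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    schedule = []
    onset = 0
    for _ in range(n_tones):
        freq = UNCOMMON_HZ if rng.random() < p_uncommon else COMMON_HZ
        schedule.append((onset, freq))
        # Assumption: the interval runs from tone offset to the next onset.
        onset += TONE_MS + GAP_MS
    return schedule


schedule = oddball_schedule(40)
n_targets = sum(1 for _, freq in schedule if freq == UNCOMMON_HZ)
print(len(schedule), "tones, of which", n_targets, "are targets")
```

Fixing the seed makes a session reproducible, which helps when the event markers recorded with the EEG need to be checked against the intended sequence.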
Classification of common and uncommon tones is an important step toward using ERPs in practical cognitive research. Typical signal classification includes filtering, artifact removal, extraction of data epochs, and many other steps. All these steps make ERPs suitable for use by machine learning [8,9]. Various types of machine learning algorithms have been applied to the classification of ERPs. In our study, six types of machine learning algorithms were used to classify tones and to identify the accurate P300 shape: Neural Networks (NNs), the k-Nearest Neighbors algorithm (k-NN), the Decision Tree algorithm (DT), the Random Forest classifier (RF), Logistic Regression (LR), and Support Vector Machines (SVMs). All the models performed efficiently. For tone classification, the accuracy is between 93.75% and 99.10%, and RF performed best. For "accurate P300 plot" identification, NN performed best, with 94.95%. This paper therefore trains and tests different types of machine learning methods to classify common and uncommon tones by extracting the P300 feature from ERP epochs and then to identify the accurate shape of the P300 wave.
The paper is organized as follows. Section II reviews the literature. Section III briefly describes the data set and the recording procedure. EEG signal pre-processing is described in Section IV. The machine learning implementation, with tone classification and P300 plot identification, is described in Sections V and VI. Section VII provides the conclusion and suggested future work.

II. LITERATURE REVIEW
In [10], Amin classified EEG signals using a pattern recognition approach with classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB). The SVM classifier yielded 99.11% accuracy for the approximation coefficients (A5) of low frequencies ranging from 0 to 3.90 Hz. For the detail coefficients (D5) derived from the 3.90-7.81 Hz sub-band, accuracy rates were 98.57% and 98.39% for SVM and KNN, respectively. Accuracy rates for the MLP and NB classifiers were comparable, at 97.11-89.63% and 91.60-81.07% for the A5 and D5 coefficients, respectively. The proposed approach was also applied to a public dataset for the classification of two cognitive tasks and achieved comparable results, i.e., 93.33% accuracy with KNN.
In [11], Joshi classified the P300 using LSTM and deep learning, proposing a neural network model based on Convolutional Long Short-Term Memory (ConvLSTM) for single-trial P300 classification. The proposed method outperformed previous CNN-based approaches on raw EEG signals. The approaches were evaluated on the publicly available data set II of BCI Competition III. Another dataset was recorded locally, using audio beeps as stimuli, to validate these approaches.
In [12], Cecotti presented a method for the detection of P300 waves based on a Convolutional Neural Network (CNN). The topology of the network is adapted to the detection of P300 waves in the time domain. Seven classifiers based on the CNN are proposed: four single classifiers with different feature sets and three multi-classifiers. These models were tested and compared on data set II of the third BCI competition.
In [13], Alomari proposed an automated computer platform to classify Electroencephalography (EEG) signals associated with left- and right-hand movements using a hybrid system that combines advanced feature extraction techniques and machine learning algorithms. The datasets were fed into two machine learning algorithms: Neural Networks (NNs) and Support Vector Machines (SVMs). The research showed that this method of feature extraction holds some promise for the classification of various pairs of motor movements, which can be used in a BCI context to mentally control a computer or machine.

A. Experimental Setup
The data set was collected from 28 subjects (both male and female) aged 18 to 43 years. None of the participants had a history of neurological or psychiatric conditions, and all were healthy, with no hearing or visual impairment. The 8-channel EEG signals were recorded according to the international 10-20 system (excluding some electrodes) along the surface of the scalp in the OpenBCI GUI. The Ultracortex Mark IV headset (Fig. 4; left) was used to record brain activity. A PIC24 microcontroller and a macromedia board were used to generate the tones. The program was written in C in the MPLAB X IDE. GPIO pins 1 and 15 were used for detecting deviant and target stimuli. The OpenBCI Cyton board has 5 digital input-output (IO) pins that can be read: D11, D12, D13, D17, and D18. We connected D17 and D18 to detect the events from the microcontroller. A standard noise-free soundbox was used to play the tones. The experimental setup is shown in Fig. 3. Subjects were seated in a comfortable chair and instructed to try to avoid fatigue. They were also asked to keep their eyes closed and to minimize muscle movement during the recording. The subject heard a series of tones with two different pitches. One tone is known as "Frequent or non-target (1000 Hz)" and the other as "Infrequent or target (2000 Hz)". The subject needed to concentrate on the tones and mentally count the uncommon tones. After completing one session, the subject relaxed for 5-10 minutes and then started the second session. Each recording lasted 3 to 4 minutes, and each subject was tested 2-3 times. There were 6 to 12 target tones and 24 to 50 non-target tones.

A. Channel Used
The OpenBCI Cyton board has 8 channels for measuring EEG, and with a Daisy board it can be extended to 16 channels. For our auditory EEG experiment, we used the 8-channel Cyton board. The channels are Fp1, Fp2, C3, C4, P7, P8, O1, and O2. The channel positions are shown on the head plot in Figure 4 (right). The reference and ground electrodes were placed on the two ears.

B. Filtering
EEG signals are strongly affected by contamination, mainly bidirectional, and are noisy and non-stationary. For example, EEG recordings are contaminated by eye movements and blinks (originating mostly from the frontal and lateral frontal areas) [14]. Filtering is therefore required to remove unnecessary information from the raw signal. The signals were sampled at 250 Hz. EEGLAB, an interactive MATLAB toolbox, was used for filtering and for all other offline calculations. We used a band-pass filter from 0.5 to 30 Hz to remove the DC offset and to minimize artifacts at epoch boundaries. We also applied a notch filter at 60 Hz.
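To make the effect of these filter settings concrete, the sketch below applies a 0.5-30 Hz band-pass and a 60 Hz notch to synthetic signals sampled at 250 Hz. It does not reproduce EEGLAB's filter design: as a swapped-in technique, it cascades second-order (biquad) sections using the widely published Audio EQ Cookbook formulas, and the function names, the Butterworth Q, and the notch Q of 30 are assumptions made for illustration.

```python
import math


def biquad_coeffs(kind, f0, fs, q):
    """Second-order filter coefficients (Audio EQ Cookbook formulas)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    if kind == "lowpass":
        b = [(1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2]
    elif kind == "highpass":
        b = [(1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2]
    elif kind == "notch":
        b = [1.0, -2 * cos_w0, 1.0]
    else:
        raise ValueError(kind)
    a = [1 + alpha, -2 * cos_w0, 1 - alpha]
    return b, a


def biquad_filter(x, b, a):
    """Apply one biquad section (direct form I) to a list of samples."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def clean_eeg(x, fs=250.0):
    """High-pass at 0.5 Hz, low-pass at 30 Hz, then notch at 60 Hz."""
    butter_q = 1 / math.sqrt(2)  # Q of a 2nd-order Butterworth section
    stages = [("highpass", 0.5, butter_q),
              ("lowpass", 30.0, butter_q),
              ("notch", 60.0, 30.0)]  # notch Q is an assumption
    for kind, f0, q in stages:
        b, a = biquad_coeffs(kind, f0, fs, q)
        x = biquad_filter(x, b, a)
    return x


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


fs = 250.0
t = [n / fs for n in range(2500)]                        # 10 s of samples
in_band = [math.sin(2 * math.pi * 10 * s) for s in t]    # 10 Hz test tone
mains = [math.sin(2 * math.pi * 60 * s) for s in t]      # 60 Hz line noise
gain_10hz = rms(clean_eeg(in_band)[-500:]) / rms(in_band[-500:])
gain_60hz = rms(clean_eeg(mains)[-500:]) / rms(mains[-500:])
print("gain at 10 Hz:", gain_10hz)
print("gain at 60 Hz:", gain_60hz)
```

The gains are measured on the last two seconds only, so that the start-up transient of each filter stage has died away.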

C. Artifact Rejection and Epoch Extraction
For the correction of eye blinks and horizontal eye movements, independent component analysis (ICA) was conducted offline in EEGLAB, using the RUNICA routine. One to four eye blinks were marked per participant. In addition, trials in which the peak-to-peak voltage exceeded 400 mV in any channel were omitted.
The signals were sampled at 250 Hz offline. After filtering and artifact rejection (AR), the continuous EEG data were segmented into 2000 ms epochs. Each epoch started 200 ms before and ended 1.8 s after stimulus onset.
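The epoching and peak-to-peak rejection steps can be sketched as follows. This is an illustration under stated assumptions, not the EEGLAB procedure used in the study: the function names and the synthetic two-channel recording are hypothetical, and the threshold is taken as 400 in whatever voltage unit the recording uses.

```python
FS = 250                      # sampling rate in Hz, from the text
PRE_MS, POST_MS = 200, 1800   # epoch window around stimulus onset, from the text
PTP_LIMIT = 400.0             # rejection threshold, in the recording's voltage unit


def extract_epochs(channels, onsets, fs=FS):
    """Cut fixed-length epochs from continuous data.

    channels: list of per-channel sample lists (all the same length).
    onsets:   stimulus onsets as sample indices.
    Returns a list of epochs; each epoch is a list of per-channel segments.
    """
    pre = PRE_MS * fs // 1000
    post = POST_MS * fs // 1000
    n_samples = len(channels[0])
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > n_samples:
            continue  # skip epochs that run past the edges of the recording
        epochs.append([ch[start:stop] for ch in channels])
    return epochs


def reject_by_peak_to_peak(epochs, limit=PTP_LIMIT):
    """Keep an epoch only if every channel stays within the limit."""
    return [ep for ep in epochs
            if all(max(seg) - min(seg) <= limit for seg in ep)]


# Synthetic recording: two flat channels, one of them with a large artifact.
ch0 = [0.0] * 3000
ch1 = [0.0] * 3000
ch1[1600] = 1000.0
epochs = extract_epochs([ch0, ch1], onsets=[500, 1500, 2900])
kept = reject_by_peak_to_peak(epochs)
print(len(epochs), "epochs extracted,", len(kept), "kept")
```

At 250 Hz, the 2000 ms window corresponds to 500 samples per channel, 50 before onset and 450 after it.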

A. Features Extraction
As described in Section III, the data were recorded in CSV format, and feature vectors were then calculated in MATLAB for each of the resulting rare and frequent epochs. The EEG signals are in the microvolt (uV) range, and we extracted the mean of the amplitude (uV), power (uW, microwatt), energy (nJ, nanojoule), wavelength (mV), and the number of odd events. For each subject, there were 5 input feature vectors and two target matrices (rare and frequent tone). The constructed features were represented in a numerical format suitable for use with machine learning algorithms. Every column of the feature matrices was normalized to the range 0 to 1.
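A minimal sketch of per-epoch feature extraction and column-wise normalization is given below. The paper does not state the formulas it used, so the definitions here are assumptions: energy is taken as the sum of squared samples, power as the mean squared amplitude, and "wavelength" is read as waveform length (the sum of absolute sample-to-sample differences). The event-count feature is omitted, and all names are hypothetical.

```python
def epoch_features(x):
    """Return [mean amplitude, power, energy, waveform length] for one epoch."""
    n = len(x)
    mean_amp = sum(x) / n
    energy = sum(v * v for v in x)      # assumed: sum of squared samples
    power = energy / n                  # assumed: mean squared amplitude
    # Assumed reading of "wavelength": cumulative absolute change.
    wave_len = sum(abs(b - a) for a, b in zip(x, x[1:]))
    return [mean_amp, power, energy, wave_len]


def min_max_normalize(rows):
    """Scale every column of a feature matrix to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


# Three toy epochs (amplitudes in arbitrary units).
toy_epochs = [[0.0, 1.0, 3.0, 2.0],
              [0.0, 2.0, 6.0, 4.0],
              [1.0, 1.0, 1.0, 1.0]]
features = [epoch_features(e) for e in toy_epochs]
normalized = min_max_normalize(features)
for row in normalized:
    print(row)
```

Normalizing per column keeps features with large numeric ranges, such as energy, from dominating distance-based classifiers like k-NN.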

B. Machine-Learning Algorithms
In this work, we classify our auditory data set with six types of machine learning algorithms: the K-Nearest Neighbors algorithm (KNN), Neural Network (NN), Decision Tree algorithm (DT), Random Forest classifier (RF), Support Vector Machines (SVMs), and Logistic Regression (LR). Detailed descriptions of these learning algorithms can be found in the references cited below. Jupyter notebooks, also known as IPython notebooks, were used for training and testing all of the machine learning classification models. Brief descriptions of the models are given below:

1) Neural Network (NN):
A neural network is a series of algorithms used to recognize the underlying relationships in a set of data, in a way that mimics how the human brain operates. All the learning takes place in the input, hidden, and output layers. A hidden layer contains many weights (neurons) [15,16]. Layers are connected through an activation function; a loss function is used to estimate the performance of the learning phase, and an optimizer is used to improve learning (Fig. 5; left).
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 10, 2020

2) K-Nearest Neighbors Algorithm (k-NN):
The k-NN algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems. It has only one parameter: the number of neighbors (k) included in the majority vote that decides the predicted class. In Fig. 5 (right), the test sample (red star) should be classified as either a yellow circle or a purple circle. If k = 3, it is assigned to Class B (as there are two purple circles among the three nearest neighbors), and if k = 6, it is assigned to Class A (as there are four yellow circles among the six nearest neighbors) [17].

3) Support Vector Machines (SVMs):
An SVM fits the provided data and returns a "best fit" hyperplane that divides, or categorizes, the data. New feature vectors are then fed to the classifier to obtain the predicted class. It is a supervised learning model (Fig. 6; left). M is a regularization parameter that controls the trade-off between achieving a low training error and a low testing error, that is, the ability of the classifier to generalize to unseen data. In SVM, the hinge loss is the loss function used for training the classifier [18].
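Two of the ideas above, the k-NN majority vote and the SVM hinge loss, can be sketched in a few lines. The point coordinates below are hypothetical and were chosen only to reproduce the k = 3 versus k = 6 outcome described for the figure; the function names are likewise assumptions, and the hinge loss is written for labels in {-1, +1}.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def hinge_loss(y_true, score):
    """Hinge loss for a label in {-1, +1} and a signed decision score."""
    return max(0.0, 1.0 - y_true * score)


# Hypothetical layout: two Class B points close by, four Class A points around.
train = [((1.0, 0.0), "B"), ((0.0, 1.5), "A"), ((-2.0, 0.0), "B"),
         ((3.0, 0.0), "A"), ((0.0, -3.5), "A"), ((-4.0, 0.0), "A")]
query = (0.0, 0.0)
print("k = 3:", knn_predict(train, query, 3))
print("k = 6:", knn_predict(train, query, 6))
print("hinge loss, correct side with margin:", hinge_loss(+1, 2.0))
print("hinge loss, wrong side:", hinge_loss(-1, 0.5))
```

The example shows why k matters: the same query changes class when the vote is widened from three neighbors to six. The hinge loss is zero only for samples that lie on the correct side of the hyperplane with a margin of at least one.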

4) Logistic Regression (LR):
Logistic regression (LR) is a statistical model that uses a logistic function to model a binary dependent variable. In LR, a threshold value is specified that determines at what value the data are assigned to one class versus the other (Fig. 6; right). It is best suited to binary classification but can also be applied to classification problems with more than two groups [19].
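The role of the logistic function and the threshold can be illustrated with a short sketch. The weights, bias, and feature values below are hypothetical and are not fitted to the study's data; the 0.5 threshold is a common default and is an assumption here.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias, threshold=0.5):
    """Return (probability of class 1, predicted class)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= threshold)


weights = [2.0, -1.0]   # hypothetical, for illustration only
bias = 0.0
for sample in ([1.0, 0.5], [-1.0, 0.5]):
    p, label = predict(sample, weights, bias)
    print(sample, "-> probability", round(p, 3), "class", label)
```

Raising the threshold makes the model more conservative about assigning class 1, which is the trade-off traced out by an ROC curve.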

5) Random Forest (RF):
Random forest is a supervised learning algorithm that can be used for both classification and regression, although it is mostly used for classification problems. A random forest creates decision trees on data samples, obtains a prediction from each of them, and finally selects the best solution by voting (Fig. 7; right). It is an ensemble method that performs better than a single decision tree because it reduces over-fitting by averaging the results. The trees choose their splits using the Gini impurity as the loss function. The training loss is also often called the "objective function", whereas the validation loss is the function used to evaluate the performance of the trained model on unseen data [20].
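The two building blocks mentioned above, the Gini impurity and the vote across trees, can be sketched as follows. The function names and the toy labels are assumptions for illustration; a real random forest also involves bootstrap sampling and random feature selection, which are not shown.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def forest_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


pure_node = ["common"] * 4
mixed_node = ["common", "common", "uncommon", "uncommon"]
print("pure node:", gini_impurity(pure_node))
print("mixed node:", gini_impurity(mixed_node))
print("vote:", forest_vote(["uncommon", "common", "uncommon"]))
```

A split is attractive when it produces child nodes with lower impurity than the parent, and the vote smooths out the errors of individual trees.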

C. Performance Analysis
In the NN, the ReLU (rectified linear unit) activation is used for the hidden layer; this function returns zero for all negative values. The softmax activation function is used for the output layer that defines the target. We used sparse categorical cross-entropy as the loss function, which measures the dissimilarity between the distribution of observed class labels and the predicted probabilities of class membership. We used Adam as the optimizer; this algorithm can handle sparse gradients on noisy problems. For k-NN, we used the parameter range k = (1, 31), and for SVM, the range (1, 100). For DT, the sample split range was (2, 30). For all the algorithms, the train-to-test ratio of the data samples was 8:2.
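The activation and loss functions named above have simple closed forms, sketched here in plain Python. This is an illustration of the standard definitions, not the study's training code; the function names are assumptions, and the loss is shown for a single sample with an integer class label.

```python
import math


def relu(values):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(true_class, probs):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[true_class])


hidden = relu([-1.5, 0.0, 2.0])
probs = softmax([2.0, -1.0])          # two classes, e.g. common vs. uncommon
loss = sparse_categorical_crossentropy(0, probs)
print("ReLU output:", hidden)
print("class probabilities:", probs)
print("loss when class 0 is correct:", loss)
```

The loss approaches zero as the probability assigned to the correct class approaches one, and it grows without bound as that probability approaches zero.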

1) Performance Analysis of Random EEG and Auditory Stimuli: Table I gives the classification accuracy for random EEG (not auditory) versus auditory stimuli for common and uncommon tones. Here, we separated the random EEG from the common and uncommon events. All the algorithms performed efficiently. The accuracy of the Random Forest (RF) algorithm was 99.03%, and the other algorithms showed 97.91%, except for Logistic Regression (LR). Overall, we can conclude that all the models were trained successfully and that the test accuracy was also remarkable.
2) Performance Analysis of Common and Uncommon Auditory Stimuli: Table II gives the classification accuracy of auditory stimuli for common and uncommon tones. All the models performed with excellent accuracy, from 93.75% to 99.10%.

D. ROC Curve and AUC
A ROC curve (receiver operating characteristic curve) is a graph that shows the performance of a classification model at all classification thresholds. The curve plots two parameters: the True Positive Rate (TPR) on the y-axis and the False Positive Rate (FPR) on the x-axis. As a baseline, a random classifier is expected to give points lying along the diagonal (FPR = TPR). The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test with the threshold values (0.5, 1, and 1). AUC stands for "Area under the ROC Curve"; that is, AUC measures the entire two-dimensional area underneath the ROC curve. In general, an AUC of 0.5 suggests no discrimination, 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding [22,23].
As described in Section I, the P300 response occurs at around 300 ms in the oddball paradigm, regardless of the type of stimulus presented: visual, tactile, auditory, olfactory, gustatory, etc. Because of this general invariance with respect to stimulus type, the P300 component is understood to reflect a higher cognitive response to unexpected and/or cognitively salient stimuli. To detect a well-shaped P300, we also need to detect other components, such as N100 (N1), P200 (P2), and N200 (N2). In addition, the P300 (P3) must show a positive peak at around 300 ms.
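The positive-peak criterion for the P300 can also be checked numerically on an averaged epoch. The sketch below is an illustration and not the image-based procedure used in this study: it assumes a 250 Hz epoch that starts 200 ms before stimulus onset, searches the 300-600 ms window, and uses a synthetic waveform with hypothetical amplitudes and latencies.

```python
import math

FS = 250        # sampling rate in Hz
PRE_MS = 200    # epoch starts 200 ms before stimulus onset


def p300_peak(erp, window_ms=(300, 600), fs=FS, pre_ms=PRE_MS):
    """Return (latency_ms, amplitude) of the largest value in the window."""
    start = (pre_ms + window_ms[0]) * fs // 1000
    stop = (pre_ms + window_ms[1]) * fs // 1000
    segment = erp[start:stop + 1]
    idx = max(range(len(segment)), key=segment.__getitem__)
    latency_ms = (start + idx) * 1000 / fs - pre_ms
    return latency_ms, segment[idx]


def synthetic_erp(n_samples=500):
    """Toy ERP: a negative bump near 100 ms and a positive bump near 360 ms."""
    erp = []
    for i in range(n_samples):
        t_ms = i * 1000 / FS - PRE_MS
        n1 = -4.0 * math.exp(-((t_ms - 100.0) / 30.0) ** 2)
        p3 = 8.0 * math.exp(-((t_ms - 360.0) / 50.0) ** 2)
        erp.append(n1 + p3)
    return erp


latency, amplitude = p300_peak(synthetic_erp())
has_positive_peak = amplitude > 0
print("peak latency (ms):", latency, "amplitude:", amplitude)
```

A check of this kind could complement visual inspection, since it reports the latency and sign of the candidate peak directly.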
We used EEGLAB, a MATLAB toolbox, to plot the P300. To extract a P300 plot, we needed to create 6-8 files in EEGLAB before the plot could be drawn. For classification, we then used a Python Jupyter notebook, which is user friendly and fast. We grouped all the images into two classes, "P300" and "no-P300". The plots that matched Fig. 10 were classified as "P300", and the remaining plots were classified as "no-P300".

A. Performance Analysis
We trained and tested the image classifiers using Neural Networks (NNs), the K-Nearest Neighbors algorithm (KNN), the Decision Tree algorithm (DT), the Random Forest classifier (RF), and Support Vector Machines (SVMs). Among all the models, NN performed best, with an evaluation accuracy of 94.95%. RF and DT also achieved good accuracy (83.94% and 76.78%, respectively). Table III gives the image classification accuracy (%). Fig. 11 shows the ROC curves of the DT, RF, k-NN, SVM, and NN classifiers for the detection of the "accurate P300 wave", with area under the curve (AUC) values of 0.75, 0.79, 0.72, 0.50, and 0.99, respectively. These results indicate that the NN (Neural Network) method shows outstanding performance among all the methods. The SVM classifier's performance is not satisfactory, whereas the performance of the other three methods is acceptable.
Fig. 11. Receiver Operating Characteristic for Detection of Accurate P300 Plot. (An AUC of 0.5 Suggests no Discrimination (i.e., Ability to Diagnose Patients with and without the Disease or Condition based on the Test), 0.7 to 0.8 is considered Acceptable, 0.8 to 0.9 is considered Excellent, and more than 0.9 is considered Outstanding.)
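The AUC values reported above are areas under ROC curves. A minimal sketch of how such a curve and its area can be computed from classifier scores is given below; the labels and scores are made-up values for illustration, the function names are assumptions, and the area is obtained with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), one per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up example: 1 = "P300", 0 = "no-P300".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = auc(roc_points(labels, scores))
print("AUC:", example_auc)
```

A classifier that assigns the same score to every sample produces only the two end points of the diagonal and an AUC of 0.5, which corresponds to the "no discrimination" case.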

VII. CONCLUSION
The study explores the classification of common and uncommon tones in ERP signals for auditory stimuli and the identification of the P300 wave in its accurate shape, based on P300 feature extraction. The study trained and tested various types of machine learning models. Our experiment has three phases. First, we trained and tested models to separate auditory EEG signals from random EEGs [Table I]. All the models performed efficiently (89.58% to 99.03%). NN and DT showed a maximum accuracy of 99.03%. Second, we classified common and uncommon tones [Table II]. There, the classification accuracy ranged from 93.75% to 99.1%; LR showed 96.78% and RF showed 99.1%. Finally, we identified the accurate P300 plots among distracted or non-P300 plots. Here the models differed in accuracy: NN performed best with 94.95%, followed by RF with 83.92%, DT with 76.78%, k-NN with 76.2%, and SVM with 66.08%. Overall, the audio-based P300 classification model showed outstanding performance and is comparable to today's foremost BCI research. Our experimental methods are simple but consistent, and they showed strong performance for research in neuroscience and machine learning.
In the future, classification experiments in virtual reality could be conducted with external audio-visual distractions to investigate their effect on various ERP components. Such work may make a substantial contribution to ERP research.