ECG and EEG Pattern Classifications and Dimensionality Reduction with Laplacian Eigenmaps

In this paper, we investigate the effect of dimensionality reduction using Laplacian Eigenmaps (LE) on several classes of electroencephalogram (EEG) and electrocardiographic (ECG) signals. Classification results based on a boosting method for EEG signals exhibiting the P300 wave and on the k-nearest neighbour method for ECG signals belonging to 8 classes are computed and compared. For EEG signals, whose original space is 128-dimensional, the difference between the classification rates in the original space and in the space reduced with LE is relatively small, only several percent (at most 10% for the 3-dimensional space). This means that, for classification purposes, the dimensionality of EEG signals can be reduced without significantly affecting the global and local arrangement of the data. Moreover, for EEG signals that are collected at high frequencies, dimensionality reduction can serve as a first stage of data preprocessing. For ECG signals, segmented with and without centring of the R wave, there is a slight decrease in the classification rate at small dimensions. For an initial dimensionality of 301, the signals can be reduced to dimension 30 without significantly affecting the classification rate. Below this dimension the classification rate decreases, but the results remain very good even for very small dimensions, such as 3. Overall, the classification results in the reduced space are remarkably close to those obtained in the initial spaces, even for small dimensions.
Keywords: Laplacian Eigenmaps; dimensionality reduction; biosignals; electrocardiographic signal (ECG); electroencephalogram (EEG)


I. INTRODUCTION
Manifold learning is a class of methods aimed at revealing low-dimensional manifolds embedded in a high-dimensional ambient space. The concept is closely related to dimensionality reduction through the assumption that high-dimensional data are expected to "live" in a (much) lower dimensional space or, in the nonlinear case, on a (much) lower dimensional manifold. In other words, when linear manifold learning does not yield a good low-dimensional representation of high-dimensional data, the data may lie on or close to a nonlinear manifold, so that more powerful nonlinear dimensionality reduction that preserves the local structure of the input data can be applied. If the data lie on a low-dimensional nonlinear manifold, it has been shown that usual methods adjust automatically, and better learning rates may be obtained even if little is known about the form of the manifold [1][2][3][4]. However, even when the data are known to lie on a nonlinear manifold, there are circumstances in which the algorithms fail to recover the manifold [5]. Starting from these considerations regarding the nature of signals, manifolds and supervised learning, we ask whether, for a class of real data, the size of the signals can be reduced and whether a supervised classifier obtains similar results in the original data space and in the reduced space [6].
In recent years, manifold learning methods have grown explosively [17][18][19]. From the point of view of preserving the geometry, manifold learning methods can be classified into two broad categories:
a) Methods that preserve the local geometric structure: locally linear embedding (LLE) [7], Laplacian eigenmaps (LE) [1], manifold charting (MC) [8], Hessian locally linear embedding (HLLE) [9].
b) Methods that preserve the global characteristics: isometric mapping (ISOMAP) [10], diffusion map [11].
The LE algorithm was initially applied to real signals in the medical field. In 2007 it was tested, without a thorough analysis, by Gramfortin and Clerc [12] on MRI images and signed EEG. In 2017, Lashgari and Demircan [13] used the LE algorithm in electromyography (EMG) signal classification problems.
For medical signals such as ECG and EEG, Erem et al. [14] presented in 2016 the Laplacian Eigenmaps machine learning algorithm combined with ideas from dynamical systems in order to analyze emerging dynamic behaviours.
The method chosen in this paper for dimensionality reduction of electroencephalogram (EEG) and electrocardiographic (ECG) signals is the Laplacian Eigenmap [1]. The outcomes reported here extend our previous results published in [15-16], where the performance of the LE algorithm was tested only on ECG time signals and where a comparative analysis between the LE and LPP (Locality Preserving Projections) algorithms was carried out. Here we propose a more rigorous analysis of the results obtained with LE for both ECG and EEG signals. These two classes of signals were chosen because they are the most widely used 1D signals in the field of biosignal processing.
To evaluate the effect of dimensionality reduction in both cases, EEG and ECG, we compare the classification rates obtained on the original data with those obtained on the EEG and ECG segments after various degrees of dimensionality reduction with Laplacian Eigenmaps (LE). Specifically, we calculate the classification rate in the initial space and the classification rate in the reduced space. If the two classification rates are close, it means that close neighbours remain close, i.e. the geometry, or at least the local geometry, is preserved. For each signal type we choose a specific classification problem on which we have previously worked and obtained good results. We then reduce the dimensionality of the signals and, using the same classifier, compare the classification rates obtained in the initial space with those obtained in the reduced space.
In Section II the theoretical part of the Laplacian Eigenmaps algorithm is presented. In Section III we present the segmentation method and the classifier chosen for EEG signals (EEG signals acquired by Hoffmann and collaborators in their laboratory and the gradient boosting classifier) and for ECG signals (MIT-BIH Arrhythmia database, segmentation with / without R wave centring, and a KNN classifier with Euclidean distance and the nearest neighbour membership decision).

II. LAPLACIAN EIGENMAPS
The target of the LE algorithm is to find a low-dimensional data representation that conserves the local geometry of the data. This preservation of the geometry is based on the distances between pairs of near neighbours on the manifold.
The LE algorithm associates a weighted graph with the data. The weights are calculated from the distances between neighbours and are then used in a cost function whose minimization yields a mapping from the initial data to a low-dimensional space [1][13][14].
The reason for computing the weights from the neighbourhoods is that, in the low-dimensional representation, the distance between a data point and its first nearest neighbour contributes more to the cost function than the distance between the data point and its second or any farther neighbour. The minimization of the cost function is posed as an eigenproblem [6].
The LE algorithm [1] constructs a neighbourhood graph G in which every data point $x_i$ is connected by an edge to its k nearest neighbours. In our case, for all points $x_i$ and $x_j$ in G that are connected by an edge within a neighbourhood $N_i$, a weight $w_{ij}$ is computed using the Gaussian kernel function of the distance $\lVert x_i - x_j \rVert$, where $\sigma$ is a constant called the heat kernel parameter; this leads to a sparse, symmetric adjacency matrix W. It is desired that points $x_i$, $x_j$ that are close in the initial space be mapped to points $y_i$, $y_j$ that remain close in the low-dimensional space. This can be achieved by minimizing the cost function $\phi(Y) = \sum_{i,j} \lVert y_i - y_j \rVert^2 w_{ij}$, in which large weights $w_{ij}$ correspond to small distances between the high-dimensional data points $x_i$ and $x_j$. Hence, the difference between their low-dimensional representations $y_i$ and $y_j$ contributes strongly to the cost function. As a result, points that are close in the high-dimensional space are placed as close as possible in the low-dimensional space [1][2].
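For reference, a common way of writing the Gaussian (heat) kernel weights mentioned above is given below. The exact scaling of the exponent is an assumption: conventions differ between $\sigma$, $\sigma^2$ and $2\sigma^2$ in the denominator. The symmetrisation rule (an edge if either point is among the k nearest neighbours of the other) is likewise one standard choice rather than a detail taken from the text.

```latex
w_{ij} =
\begin{cases}
\exp\!\left(-\dfrac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right), & \text{if } x_j \in N_i \text{ or } x_i \in N_j,\\[4pt]
0, & \text{otherwise.}
\end{cases}
```

With this form, $w_{ij}$ is close to 1 for very near neighbours and decays towards 0 as the distance grows relative to $\sigma$.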
The last stage of the LE algorithm is the computation of the eigenvalues and eigenvectors of the generalized eigenvalue problem
$L f = \lambda D f$    (1)
where $D = (d_{ij})$ is an (n×n) diagonal matrix with elements $d_{ii} = \sum_j w_{ij}$, and the matrix L is computed from the matrices W and D, namely $L = D - W$ is the Laplacian matrix, which is symmetric and positive semidefinite. The matrix L can be thought of as an operator on functions defined on the vertices of G.
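The link between the cost function and this eigenproblem can be made explicit with a short derivation, following the standard argument in [1]. It assumes a symmetric W with row sums $d_{ii} = \sum_j w_{ij}$, writes Y for the n×m matrix whose rows are the embedded points $y_i$, and uses the usual normalisation constraint $Y^{T} D Y = I$, which is an assumption not stated in the text above.

```latex
\begin{aligned}
\sum_{i,j} w_{ij}\,\lVert y_i - y_j\rVert^{2}
  &= \sum_{i,j} w_{ij}\left(\lVert y_i\rVert^{2} + \lVert y_j\rVert^{2} - 2\,y_i^{T} y_j\right) \\
  &= 2\sum_{i} d_{ii}\,\lVert y_i\rVert^{2} - 2\sum_{i,j} w_{ij}\,y_i^{T} y_j \\
  &= 2\,\operatorname{tr}\!\left(Y^{T}(D - W)\,Y\right)
   = 2\,\operatorname{tr}\!\left(Y^{T} L\,Y\right), \\[4pt]
\min_{Y^{T} D Y = I} \operatorname{tr}\!\left(Y^{T} L\,Y\right)
  &\;\Longrightarrow\; L f = \lambda D f .
\end{aligned}
```

Under this constraint, the columns of the minimising Y are the generalized eigenvectors with the smallest eigenvalues, once the trivial constant eigenvector is set aside.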
The mapping to the low-dimensional space is obtained by discarding the eigenvector $f_0$ corresponding to eigenvalue 0 and using the next m eigenvectors, corresponding to the next m eigenvalues. The embedding in an m-dimensional Euclidean space is
$x_i \mapsto (f_1(i), \ldots, f_m(i))$
where $f_0, \ldots, f_{k-1}$ are the solutions of equation (1), in ascending order of their eigenvalues [1].
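The steps of this section can be put together in a short NumPy sketch. It is an illustration under stated assumptions, not the implementation used for the experiments: the function name `laplacian_eigenmaps` is hypothetical, the kernel uses a $2\sigma^2$ scaling, the graph is symmetrised by taking the union of the neighbourhoods, a dense eigen-solver is used, and the graph is assumed to be connected so that only $f_0$ has eigenvalue 0. The defaults k = 9 and sigma = 5 are the values quoted later for the ECG experiments.

```python
import numpy as np


def laplacian_eigenmaps(X, m, k=9, sigma=5.0):
    """Embed the rows of X (n x d) into m dimensions with a basic LE sketch.

    Assumes k is smaller than n and that the neighbourhood graph is connected.
    Returns the (n x m) embedding and the m eigenvalues that go with it.
    """
    n = X.shape[0]

    # Squared Euclidean distances between all pairs of points.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Indices of the k nearest neighbours of each point, excluding the point itself.
    d2_off = d2.copy()
    np.fill_diagonal(d2_off, np.inf)
    nn = np.argsort(d2_off, axis=1)[:, :k]

    # Gaussian (heat) kernel weights on the k-NN edges (assumed 2*sigma^2 scaling).
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))
    # Symmetrise: keep an edge if either point is a neighbour of the other.
    W = np.maximum(W, W.T)

    # Degree vector (diagonal of D) and graph Laplacian L = D - W.
    d = W.sum(axis=1)
    L = np.diag(d) - W

    # Solve L f = lambda D f through the symmetric matrix D^{-1/2} L D^{-1/2}:
    # if L_sym u = lambda u, then f = D^{-1/2} u solves the generalized problem.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    F = d_inv_sqrt[:, None] * U

    # Drop f_0 (eigenvalue 0) and keep the next m eigenvectors as coordinates.
    return F[:, 1:m + 1], eigvals[1:m + 1]
```

Calling `laplacian_eigenmaps(X, m=3)` on an array `X` with one signal per row would return a 3-column array that can be plotted directly or passed to a classifier. Note that this sketch only embeds the points it was given; it has no mapping for new data.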

III. EXPERIMENTAL RESULTS
In what follows we present several classification results for EEG and ECG signals, regarded only as a measure of how well the spatial geometry on the manifold is conserved and not of the quality of the classifier. In other words, we use the classification rate as a measure of geometry preservation, i.e. we determine how much the classification rate decreases when the space dimension is reduced with the LE algorithm.

A. EEG Signals
Our starting point is the set of results obtained in [20], in which we used EEG signals to verify that neighbourhoods are preserved in the space reduced with compressed sensing (CS). Using the same test data, we check whether the data reduced in dimensionality with LE keep their neighbours.
In [20] we used a compressed sensing algorithm to reduce the EEG data size. The common point of [20] with the present paper is that the same EEG data (in fact the same EEG database) and the same classifier, namely gradient boosting, are used to test the methods. The difference between the two papers lies in the method used to reduce the dimensionality of the data.
To test the method we used EEG signals acquired by Hoffmann and collaborators in their laboratory; a reduced database is available on the internet at [21]. The database includes EEG signals collected on 32 channels, grouped into 942 vectors for classification, each lasting 1 s. The gradient boosting classifier from [22] was used. It should be noted that the software used was developed by the authors of [22] as a machine learning method that builds a powerful algorithm from several weak classifiers.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 3, 2020, www.ijacsa.thesai.org
In the above work, the authors described a simple and powerful method for detecting the P300 from single EEG trials, which was used to build a P300-based spelling device for BCI. To compute from training data a function that detects P300s from single EEG trials, boosting was used to maximize stepwise the Bernoulli log-likelihood of a logistic regression model.
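The idea of stepwise maximization of the Bernoulli log-likelihood can be illustrated with a small sketch. This is not the software from [22]: the function names `fit_boosting` and `decision_function` are hypothetical, the weak learner (a least-squares line on a single feature) is chosen here only for brevity, labels are assumed to be coded as 0 and 1, and the value 0.05 is used as a shrinkage step purely for illustration.

```python
import numpy as np


def fit_boosting(X, y, n_iter=200, step=0.05):
    """Stagewise fit of F(x) so that p = sigmoid(F) models P(y = 1 | x).

    X is (n x d), y holds 0/1 labels. Returns a list of weak learners,
    each stored as (feature index, slope, intercept).
    """
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    ss = np.sum(Xc ** 2, axis=0) + 1e-12  # per-feature sum of squares
    F = np.zeros(X.shape[0])
    model = []
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-F))
        r = y - p  # gradient of the Bernoulli log-likelihood with respect to F
        # Least-squares slope of r on every single feature (with intercept).
        slopes = Xc.T @ (r - r.mean()) / ss
        # Pick the feature whose one-dimensional fit explains r best.
        j = int(np.argmax(slopes ** 2 * ss))
        a = slopes[j]
        b = r.mean() - a * x_mean[j]
        # Small step in the direction of the fitted weak learner.
        F += step * (a * X[:, j] + b)
        model.append((j, a, b))
    return model


def decision_function(model, X, step=0.05):
    """Evaluate the boosted score F(x); F > 0 means p > 0.5."""
    F = np.zeros(X.shape[0])
    for j, a, b in model:
        F += step * (a * X[:, j] + b)
    return F
```

In this sketch the number of iterations plays the role of M: stopping earlier gives a smoother model, which is why it is natural to select it by cross-validation.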
We kept the configuration parameters of the gradient boosting method the same as in [22]. Thus the maximal number of iterations is Mmax = 200, the best M was selected in a 30×10 cross-validation loop, and a further parameter was set to 0.05 (same setting as in [22]). The results are presented in Table I and, in more detail, in Fig. 1 and Fig. 2. Fig. 1 shows three EEG signals in the 128-dimensional space and their mapping onto the space spanned by the first 30 eigenvectors. With the reduction of the space dimension the signal waveforms change, but the relative distances are preserved, as illustrated in Fig. 3.
Table I shows a very small difference in classification between the original EEGs in the 128-dimensional space and the EEGs in the 30-dimensional space: the classification rate decreases by only 1 or 2 percent. The decrease of the classification rate from the 30-dimensional to the 15-dimensional space is also about 1%. Fig. 2 shows the accuracy obtained after the cross-validation loop for configurations with 23, 8 and 4 channels, for the original EEGs and for the 3-, 5- and 15-dimensional spaces obtained with LE. As can be seen, the gradient boosting algorithm converges to an optimal solution. The difference between the classification rates in the original space, which is 128-dimensional, and in the space reduced with LE is relatively small, only several percent (at most 10% for the 3-dimensional space). This result confirms that the global data structure is preserved and that classification can be performed in the low-dimensional space with results very close to those in the initial space. This holds regardless of the channel configuration.
Remarkably, the classification rate decreases very little with the reduction of the signal space, and its evolution with the number of iterations follows the same trend for all space dimensions. Another interesting result is that the sigma parameter of the Gaussian kernel has almost no influence on the classification rate, as shown in Table II: the classification rate is only slightly affected by changes in sigma, the maximum difference being 3%. To give an intuitive picture of the data, Fig. 3 presents two examples of EEG signal data in reduced spaces with 3 and 2 dimensions obtained with the LE algorithm.

B. ECG Signals
In the case of ECG signals, the starting point is the set of results presented in [15], where Laplacian Eigenmaps (LE) and Locality Preserving Projections (LPP) are analyzed and compared as methods for reducing the dimensionality of the signal space. In [15] it was found that LE offers better results for small dimensions. In this paper, we analyze whether centring the R wave brings significant improvements for very low-dimensional spaces (such as 2D and 3D).
For ECG signals, we used 44 ECG records from the MIT-BIH Arrhythmia database. The ECG signals were acquired at a sampling frequency of 360 Hz, with 11 bits/sample [23]. In addition to the ECG signals, the database also comprises annotation files with the index of the R wave and the class of each ECG beat. In the database, 8 major classes were identified, of which 7 are classes of pathological beats.
We used two different methods of segmenting ECG signals, namely:
• Segmentation with re-sampling (301 samples per signal)
• Segmentation with re-sampling as above and R waves centred.
a) Segmentation with re-sampling: A cardiac beat begins in the middle of the RR interval and ends in the middle of the next RR interval.
b) Segmentation with re-sampling and R waves centred: For the second segmentation method, to increase the classification rate we used the method reported in [24], which starts from ECG signals for which the position of the R wave has been exactly determined. As before, a cardiac beat begins in the middle of the RR interval and ends in the middle of the next RR interval; in the cardiac beats thus obtained, the R wave is positioned in the middle by resampling the waveforms on both sides of R. In this way, patterns with the R wave centred are obtained. All cardiac patterns are again of size 301, with the R wave positioned on the 150th sample.
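The two segmentation schemes can be sketched as follows. This is a minimal illustration, not the procedure of [24]: the function names `resample` and `segment_beat` are hypothetical, linear interpolation is assumed as the resampling method, and "the 150th sample" is read here as zero-based index 150, i.e. the exact middle of a 301-sample pattern.

```python
import numpy as np


def resample(x, n_out):
    """Resample a 1-D array to n_out samples by linear interpolation."""
    old = np.linspace(0.0, 1.0, num=len(x))
    new = np.linspace(0.0, 1.0, num=n_out)
    return np.interp(new, old, x)


def segment_beat(ecg, r_prev, r, r_next, n_out=301, centre_r=False):
    """Cut one beat around the R peak at index r and resample it to n_out samples.

    The beat runs from the middle of the previous RR interval to the middle
    of the next one. With centre_r=True the two sides of R are resampled
    separately so that R lands on the middle sample.
    """
    start = (r_prev + r) // 2
    stop = (r + r_next) // 2
    if not centre_r:
        # a) plain resampling of the whole beat
        return resample(ecg[start:stop + 1], n_out)
    # b) resample each side of R on its own; both pieces include the R sample
    half = n_out // 2
    left = resample(ecg[start:r + 1], half + 1)       # ends on R
    right = resample(ecg[r:stop + 1], n_out - half)   # starts on R
    # Drop the duplicated R sample from the right-hand piece.
    return np.concatenate([left, right[1:]])
```

For example, given an array of annotated R-peak indices, `segment_beat(ecg, R[i - 1], R[i], R[i + 1], centre_r=True)` would produce one 301-sample pattern per interior beat.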
The database thus constructed contains 5608 patterns, each class having 700 such patterns (7 pathological and 1 normal). The results are presented in Table III and Fig. 4.
For classification, we used the KNN classifier with Euclidean distance, the membership decision being based on the nearest neighbour. Table III shows a small difference between the classification rate of the original cardiac patterns in the 301-dimensional space and that of the cardiac patterns in the 30-dimensional space: the classification rate decreases by approximately 3 percent. The decrease of the classification rate from the 30-dimensional to the 15-dimensional space is only 1%. These proportions are similar whether or not the R wave is centred.
In Fig. 5, we present the classification rate vs. space dimension for LE (for sigma = 5 and neighbourhood k = 9) for ECG segments without R wave centring (blue) and with R wave centring (red). It can be observed that the classification rate is slightly lower without R wave centring, both for the original signals (in the 301-dimensional space) and in the reduced space. Thus, for the original signals a classification rate of 90.36% is obtained without R wave centring, compared to 92.5% for segmentation with the R wave centred. Under these conditions, for the initial ECG signals the classification rates for the 8 classes differ by about 2%, this small difference being the result of the R-wave centring.
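The nearest-neighbour decision rule with Euclidean distance, and the classification rate used throughout as the comparison measure, can be sketched in a few lines. The function names are hypothetical, and the split into training and test patterns is an assumption, since the text does not describe it.

```python
import numpy as np


def nearest_neighbour_predict(X_train, y_train, X_test):
    """Assign to each test pattern the label of its nearest training pattern."""
    sq_train = np.sum(X_train ** 2, axis=1)
    sq_test = np.sum(X_test ** 2, axis=1)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = sq_test[:, None] + sq_train[None, :] - 2.0 * (X_test @ X_train.T)
    return y_train[np.argmin(d2, axis=1)]


def classification_rate(y_true, y_pred):
    """Percentage of patterns whose predicted class matches the true class."""
    return 100.0 * np.mean(y_true == y_pred)
```

Running the same two functions once on the 301-dimensional patterns and once on their low-dimensional coordinates gives the pair of rates that is compared in this section.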
Because LE offers very good results for very small space dimensions, the method can be used to represent data in 2D or 3D and thus give a visual idea of the spatial distribution of the data in classes. This visualization can be very useful for understanding the spatial arrangement of the data, which can sometimes be very twisted; the choice of the classifier, or of some of its parameters, is related to this spatial arrangement.
In Fig. 6 (a and b) the 3D mapping of the ECG signal data is shown (full view and zoom on the central zone).
For EEG signals it has been found that the gradient boosting algorithm converges to an optimal solution. The difference between the classification rates in the original space, which is 128-dimensional, and in the space reduced with LE is relatively small, only several percent (at most 10% for the 3-dimensional space). This means that, for classification purposes, the dimensionality of EEG signals can be reduced without significantly affecting the global and local arrangement of the data. Moreover, for EEG signals that are collected at high frequencies, dimensionality reduction can serve as a first stage of data pre-processing. Another observation for EEG signals is that the classification rate decreases very little with the reduction of the signal space and that its evolution with the number of iterations follows the same trend for all space dimensions. It is also observed that the sigma parameter of the Gaussian kernel has almost no influence on the classification rate.
For ECG signals, segmented with and without centring of the R wave, there is a slight decrease in the classification rate at small dimensions. For an initial dimensionality of 301, the signals can be reduced to dimension 30 without significantly affecting the classification rate. Below this dimension the classification rate decreases, but the results remain very good even for very small dimensions, such as 3 (the classification rate decreases from 92.33% for the initial ECG signals of dimensionality 301 to 89.32% for dimensionality 3).
At present, we are not aware of similar studies applying the LE algorithm to both EEG and ECG time signals; thus a comparison of our results with those of other authors using this algorithm is not possible. However, Table IV presents results reported in [25] with PCA, LDA, KPCA, Isomap and LE for ECG signals only. It can be observed in Table IV that our results with LE in spaces of reduced dimensionality are similar, with the caveat that they were not obtained on the same database.

V. CONCLUSIONS
The remarkable result reported in this paper is that dimensionality reduction for EEG and ECG signals using LE does not significantly affect the classification rate, even for rather small dimensions. This shows not only that the neighbourhoods are preserved by LE but also that the signals are notably robust with respect to classification when mapped onto low-dimensional manifolds. It also makes it possible to obtain an intuitive image of the spatial distribution of the data in the 2D or 3D case, where the data can be plotted.
In the future, we aim to use the advantages offered by the LE algorithm for classification problems and to find solutions for new data (i.e. so that a full recalculation is not needed whenever new data arrive).