fMRI Data Analysis Using Dempster-Shafer Method with Estimating Voxel Selectivity by Belief Measure

In the functional Magnetic Resonance Imaging (fMRI) data analysis, detecting the activated voxels is a challenging research problem where the existing methods have shown some limits. We propose a new method wherein brain mapping is done based on Dempster-Shafer theory of evidence (DS) that is a useful method in uncertain representation analysis. Dempster-Shafer allows finding the activated regions by checking the activated voxels in fMRI data. The activated brain areas related to a given stimulus are detected by using a belief measure as a metric for evaluating activated voxels. To test the performance of the proposed method, artificial and real auditory data have been employed. The comparison of the introduced method with the t-test and GLM method has clearly shown that the proposed method can provide a higher correct detection of activated voxels. Keywords—Dempster-Shafer theory; fMRI; GLM; t-test; HRF; OTSU method


INTRODUCTION
Understanding the way the brain works depends on studying the functional Magnetic Resonance Imaging (fMRI).The study of fMRI time series is related to the activity of neurons in response to an input stimulus during an experiment.However, it would be difficult to notice this activation because it occurs in milliseconds, and it is influenced by noise.To overcome this problem, a contrast method, known as Blood Oxygen Level Dependent, denoted as BOLD for short, is proposed by Ogawa et al. [1] given that the metabolic process increases blood flow and volume in the activated areas during brain activity that are characterized by hemodynamic response functions (HRF).In other words, a local increase in blood flow leads to a different amount of oxygen consumption.The increase in local blood flow permits to map the changes in the areas where oxygen concentrates on the entire brain.
Given this fact, several attempts have been made to gain a better precise classification of voxels regarding activation.Mainly, fMRI data analysis approaches can be classified into two main groups.The first one is the hypothesis-based methods, such as the General Linear Model (GLM) [2].The other one is the data-driven technique including clustering, Principal Component Analysis (PCA) and Independent Component Analysis (ICA).This latter has proven to provide a practical method for the exploratory analysis of temporal [3] and spatial [4] fMRI data and it is becoming among the useful methods over the last decade [5].On another plan, the dimensionality reduction methods have shown their ability in fMRI data analysis [6] since they have been used to reduce some voxels surrounding the brain.Thus, to enhance the detection of activated voxels, it is recommended to perform dimensionality reduction before applying one of the methods mentioned above.
The hypothesis-based approaches have an essential role to play in the analysis of fMRI data because of their complex spatial and temporal correlation structure [3,4].Recently, many developed approaches have relied on the general linear model (GLM).The latter has become the famous classical modeldriven analysis and an excellent technique for analyzing fMRI data.In fact, GLM intends to spot functionally active brain regions and to characterize both functional anatomy and changes resulting from certain diseases [7,8] because it needs prior knowledge and assumes a canonical hemodynamic response function (HRF) to model a BOLD signal.The main limitation that GLM method suffers from is that it ignores correlation between voxels in both time and space dimensions that are present in fMRI times series [9].
Clustering fMRI time series has emerged as a possible way to detect brain activity.It has been used as an exploratory data analysis technique in fMRI time series.In this case, several approaches based on c-means have emerged [10] such as spatiotemporal clustering analysis of fMRI data [11] and Fuzzy clustering analysis (FCA) [12].Therefore, hierarchical clustering analysis (HCA) [13] has been gained its place with its ability to produce connectivity map in fMRI data.
Recently, clustering time series [14,15] has newly concerned a considrable amount of research.They have shown their efficiency because they form partitions based on similarity of voxel values in the fMRI time series where each partition is represented by the cluster centroid that is sufficient for the analysis and investigation of fMRI time series [16].
Therefore, clustering techniques can be used to separate activated voxels from non-activated ones.However, methods based on clustering for fMRI data analysis suffer from limitations and relevant problems that need being addressed.Their main drawback lies on their reliability where it is established only with the number of iterations and repeated runs [17].
Accordingly, several methods based on probability theory of Bayesian spatiotemporal model approaches that incorporate spatial correlation among brain responses have been recently found as successful applications in the analysis of fMRI data [18] [19].An alternative method has been recently proposed by Bowman et al. [20] Gaussian Markov random field priors on the regression coefficients of a general linear model.Another research has investigated spatiotemporal modalities by hierarchical Bayesian approaches and incorporated the estimation of the HRF [21].Another work to parameterize the HRF with voxel-varying parameters used Gaussian Markov a random field prior on the activation characteristic parameters of the voxels has been introduced by Quirós et al. [22].All these works assumed an independent structure for the error terms in their models.
In this study, a new approach based on Dempster-Shafer theory of evidence is suggested to improve extracting brain activity.Particularly, the emphasis of this work has been placed on developing a new analytical framework that can provide better detection of the brain regions that show signs of neuronal activity following a stimulus.Also, it aims to infer the association of spatially remote voxels that exhibit fMRI time series with similar characteristics.In the formulation that we adopt, the stimulus pattern is convolved with a hemodynamic response function (HRF).A voxel-dependent shape parameter characterizes the delay between the onset of the stimulus and the arrival of blood to the activated brain regions.Clustering the time course responses via a Dempster-Shafer process is a further feature of the proposed approach.
The rest of the paper is organized as follows: In section 2, a brief simplified description of the hemodynamic response model is given.Then, section three describes the basics of the Dempster-Shafer theory.Afterward, we put the proposed method in details.The conducted experiments on simulation and real data are clearly discussed with the findings of the research in section five.At last, a conclusion is given.

II. HEMODYNAMIC RESPONSE FUNCTION
It is worth to note that BOLD fMRI does not measure the activity of neurons directly.Instead, it measures the metabolic demands of active neurons.Given this fact, to gain a better understanding to neural activity is waiting for further study.Thus, we need to model the evoked fMRI response realized by the so-called hemodynamic response function that is a nonlinear function.In other words, we have to model the BOLD response into an impulse input.The box-car standard, the Gaussians and the canonical model proposed in [2,23] are some of the several HRF models that have been developed.They have an essential role to play in characterizing the onset of the stimulus.This work focuses on the study of the canonical HRF model.As presented in "Fig.1", this model is divided into two parts.The first part describes the Peak whereas the second one is employed to model the Undershoot.A good model for the canonical HRF is obtained by the function whose peak is situated between 4-6 seconds [24].The relationship between the stimulus and BOLD response, denoted by y(t), is typically modeled as the convolution of the stimulus function with an impulse response( HRF) as presented in the following equation: Where h(t), y(t) and s(t) denote HRF , the result and the unprocessed fMRI signal respectively The convolution result is known as epochs in SPM (Statistical Parametric Mapping) [9].The canonical HRF performs well in many experiments.However, some activated voxels are ignored because the real HRF varies in different people and in different brain regions of the same person as well [25].To address this problem, this work introduces a new framework based on the Basic Belief Assignment probability as will be closely illustrated.

III. THE DEMPSTER-SHAFER THEORY OF EVIDENCE
In the following, we introduce the fundamentals of the Dempster-Shafer (DS) theory of belief function that has been proven to be an efficient tool in representing uncertain knowledge.This theory has paved the way for many researchers to study various aspects related to uncertainty and lack of knowledge and has shown its ability to solve real problems [26].In fact, Dempster-Shafer theory can be considered as a generalization of the probability theory [27].
The references [28] [29], [30], [31] provide further information about this theory.In what follows closely, a brief introduction to the basic notions of the theory of evidence is given.
Let θ = {θ 1 , θ 2 , … ., θ k } be a finite set of possible hypotheses.This set is referred to as the frame of discernment, and its power set is denoted by 2 θ where: A key point of the evidence theory is known as Basic Belief Assignment (BBA) [19].It is defined as: A basic belief assignment m is a function that assigns a value in [0, 1] to every subset Ai of ʘ and satisfies the following: The BBA (m) is associated with the belief function, denoted by bel ().The definition of belief function is given as in [19].A belief function assigns a value in [0, 1] to every nonempty subset D of ʘ.It is called degree of belief in D and is defined by

Undershoot Brief Stimulus Peak
The function, pl (.), associated with the BBA m(.) is a function that assigns a value in [0, 1] to every nonempty subset D of ʘ.It is called "degree of plausibility in D" and is defined by Ai∩D≠⊘ (4) Furthermore, a BBA can also be viewed as determining a set of probability distributions over ʘ so that bel(A) ≤ P(A) ≤ pl(A).It can be easily seen that these two measures are related to each other as follows: Therefore, one needs to know only one of the three values of m, bel, or pl to derive the two other ones, where A � stands for the negation of a hypothesis A shown in "Fig.2" Fig. 2. Basics measures of Dempster-Shafer Theory of Evidence [19] IV.PROPOSED METHOD

A. Overview
Figure "Fig.3" illustrates the proposed scheme for fMRI data analysis and detection of activated area that is composed of five stages: i) data preprocessing and dimensionality reduction ii) HRF modeling iii) convolution with fMRI signal, iv) computation of the m(), the belief function bel() for each voxel and v) separation of the activated voxels from nonactivated ones using threshold by using OTSU thresholding method because it permits to get a threshold automatically .

B. Data preprocessing
Prior to analysis, fMRI data goes through a series of preprocessing steps to identify and remove the artifacts and to validate model assumptions as well.First, the fMRI slices have been spatially realigned.However, spatial smoothing may cause unforeseen changes to occur into the data.Thus, spatial smoothing has been avoided to ensure better performance.Then, the mean value has been subtracted from each of the time series and the variance has been normalized to a unit.The previous steps were realized via SPM tools [9].

C. Modeling HRF by Dempster-Shafer method
We model a peak and a subsequent undershoot of canonical hemodynamic response function by DS method using the sum of two gamma functions known by the density of probability function, as described above.
The modeling process of the HRF function has been performed as follows: HRF function has been partitioned into two hypotheses(θ i , θ j ).The hypothesis θ i corresponds to both detecting neural activation and determining a peak (on activation) while θ j is assigned for modeling undershoots (off activation).Each hypothesis is a sum of degrees of beliefs.In particular, the focus of this work lies on the first hypothesis.This latter is divided into two parts A and D, where A stands for degrees of belief included in D. (D-A) denotes the uncertainty part."Fig.4" illustrates the proposed model.At first, we localize the interval of stimulus.In the example where the Time response (Tr=4), the peak is in the first interval [4..8] seconds [34].To find a second stimulus in this example, 16sc have been added.In the second step, we determine the next interval and so on.The same process has been repeated till the end of fMRI signals.Finding these intervals is the focal aim

D. computing the basic belief assignments and the belief measure
After the convolution process, the m() of each time (second) in fMRI times series must be first computed in order to compute the belief and the plausibility measures.Thus, the formula that consists in a transformation of each fMRI signal into a density probability function is described below.The used integral has the form: In the above equation, the global surface is denoted as α.So, m() is calculated as follows : A vector of probability have been obtained where the sum of mass bribability function (m()) is 1 as mentioned above in section 3. To compute the belief and plausibility measures, the formulation described in ( 3) and ( 4) has been employed.

E. Separating the activated voxels from the non-activated ones
To extract activated voxels, the belief measures have been employed in this stage.Each voxel of fMRI time series is presented by bel() value.At first, the histogram of belief measures has been used and an appropriate threshold denoted as λ has been chosen.The OTSU method [33] was employed to choose (λ) threshold.It permits extracting an automatic threshold that minimizes the weighted within-class variance σ w 2 (t).This turns out to be the same as maximizing the between-class variance σ B 2 (t).The algorithm is as follows: Step 1 compute the histogram of bel() measure and the probability at each i level of histogram Step 2 initialize the µ i (0) and q i (0) Step 3 Browse all possible thresholds t =1 to n • Update µ i (t) and q i (t) Step 4 λ = max ( σ B 2 (t) where the weighted within-class variance is: And the between-class variance is: The total variance is: where the class probabilities are estimated as: And the class means are given by: The individual class variances are: And [0,n-1] is the range of intensity levels of the histogram.

F. evaluation metrics and proposed algorithm
This subsection describes the metrics of evaluating the proposed approach and the proposed algorithm based on DS theory.

1) Evaluation metrics
The threshold λ has been used to compute two metrics, the true and false activation rate.These two terms need to be defined herein: True activation rate (TAR) stands for the ratio between the number of time series correctly identified as activated and the total of truly activated time series.And the other one is false activation rate (FAR) referring to the ratio between the number of time series incorrectly identified as activated and the total number of truly non-activated time series.Also, these two ratios serve to analyze the performance of the proposed approach and to establish a comparison with the previous conducted studies like the GLM.It has been noticed in the presented work that the voxels with belief measure more than or equal to λ have been considered to be true active voxels.And the voxels that are less than the selected threshold have been considered as false active voxels.This process leads to obtain the activated regions.

2) Algorithm DS fMRI analysis
To sum it up the proposed algorithm is illustrated as follows:  6) and ( 7 This section describes fMRI data that have been used in the conducted experiments .Both artificial and real fMRI data have been employed to determine the identically activated areas.To test the performance of the presented approach, a comparison of the obtained results with the GLM and t-test results has been performed.It is worth mentioning that the tests have been conducted on the same benchmark.This comparison has been done by using the true activation rate and false activation rate as defined above.However, we illustrate plots of true and false activation rates at different belief thresholds.

A. Artificial data
This section describes a form of artificial data used by Francois et al. [11].In general, fMRI signal is a stochastic process.So, a synthetic three-dimensional fMRI dataset (64, 64, 64) has been generated.The number of slices is 64 and each signal is generated by the following formula: The above function is a complex signal where A(t) stands for the amplitude.Let M be the levels of activation and let φ be the Gaussian random delay distributed with zero mean and unit variance.Let ω be the frequency of the signal and selected to be π/10 because the fMRI signal is relatively weak.The amplitude on such a basis is defined by a sinusoidal function as follows: We consider ʘ(t) = π/4 where the real and imaginary channels play a symmetric role and n c (t) are the complex Gaussian white noise centered with unit variance.The phase of this signal is not used, and we only consider the magnitude: We generate a set of signals in order to build sequences of fMRI time series as shown in "Fig.6".After applying the proposed approach on this artificial data, the obtained results are presented in "Fig.7" where brown areas stand for the activated voxels.8" presents some results using the simulated data described above.These results contain the TAR and FAR measures that have been obtained with bel() threshold.The ttest statistic method has been used in order to compare the introduced method.Basically, t-test statistic method have been used to compute the TAR and FAR metrics at each p-value between 0.001 and 0.05.At first, the fMRI time series are divided in two groups i.e. determine the fMRI time series called on activation denoted as (XON) and the fMRI time series called off activation denoted as (XOFF).To determine these both groups for all fMRI signals the box-car hemodynamic response function have been employed as kernel.The plots in "Fig.9", present the TAR and FAR obtained results with t-test method Fig. 8.The plots show the false activation rate and the positive activation rate usig belief threshold These results show that when the p-value is smaller, we get nearer to the activated areas and the number of false activation rate increases.Where p-value is near to 0.05, more precision is obtained in activation rate with less false activation rate.Table1 presents the mean of TAR and FAR for the proposed method and t-test method.Accordingly, these results obviously show the ability of the presented approach to detect true and false activation rate better than t-test method.

B. Real fMRI Dataset
This section reports the result of proposed method tested on a real fMRI dataset that concern an auditory stimulus.This data was collected by Geriant Rees et al. and are available in http://www.fil.ion.ucl.ac.uk/spm/data.These whole brain BOLD/EPI images were acquired on a modified 2T SIEMENS MAGNETOM Vision system.Each acquisition is composed of 64 contiguous slices (64x64x64 3mm x 3mm x 3mm voxels) where any acquisition occurs in 6.05s, with the scan to scan repeat time (TR) set arbitrarily to 7s.So that, 96 acquisitions were made (TR=7s), in blocks of 6, giving 42s blocks.Starting with rest, the condition for successive blocks alternated between rest and auditory stimulation that was bi-syllabic words presented binaurally at a rate of 60 per minute.In this experiment, the authors mentioned that the functional data starts at acquisition 4.
After modeling the HRF by Dempster-Shafer method (DS), a basic belief assignment, denoted as m(vi), has been calculated for all subset   of θ (where vi stands for ith voxel).We have noticed that all fMRI signals have a similar pace with a difference in values of m() which plays a primordial role in computing belief measures.This latter is used to characterize the voxel activity.Then, computing the belief measures enables to obtain the results at (TR = 7) which is in [0.2702, 0.3338].To separate the activated voxels from non-activated voxels, the histogram described in Figure "Fig.10" has been used and the threshold of belief measures (λ) by OTSU method has been selected automatically.To study the influence of the belief measure that presents the key parameters of proposed method, on the whole performance, we generate and plot activation regions (number of voxels ) at different values of belief thresholds more than λ obtained by OTSU method.In addition, we measure TAR and FAR values given by belief threshold.Figure "Fig.11" shows the number of true and false activation rate as a belief threshold.This experiment shows that the proposed method can identify more TAR with less FAR when bel() near to λ.And the number of true and false activation rate tends be lower when bel() threshold between 0.292 and 0.3

C. Comparison with GLM method
This section provides a comparison of the GLM results with the obtained results of the introduced framework.Firstly, the GLM results realized by SPM tools assumes that the fMRI time series correspond to the realization of an identically independent stochastic process and divides data into two groups, obtained during on (activation) and off (no activation) periods.This separation is done by p-value (0.05) and (0.001).The GLM results have shown the different projection (axial, coronal and saggittal) as well as the design matrix (see Fig. 12).Another important feature that distinguishes the GLM method is the long time needed for completing the job.In contrast to GLM, the method based on Dempster-Shafer theory is not difficult to understand and it is easy to implement but it needs to have prior knowledge about the experiment conditions.Fig. 12.The obtained result with GLM method However SPM tools provide the metrics true activation rate and False Discovry Rate (FDR) that plays an important role as well as false activation rate.In other words, it is a proportion of activated voxels that are false positives [34].However, we proceed to use p-value between 0.001 and 0.05 by using SPM tools that provides the results of FDR and the number of voxels detected activated in the regions used to compute the true activation rate measures.Figure "Fig.13", shows the result of this experiment.To sum it up, table 2 presents the average of TAR and FAR measures by the presented method and GLM method This experiment shows that the proposed method based on DS theory outperforms the GLM method in identifying more true activation rate with low false activation rate.Figure "Fig.14" presents some slices that show the activated areas by both method GLM and the proposed method This paper introduces a new analysis based on Dempster-Shafer theory (DS) that better separates activated voxels from fMRI time series by using basic belief assignment functions.The proposed approach aims to extract activated areas from fMRI data sets.Mainly, background is required about the hemodynamic response model at the beginning.The introduced method has been validated on a real auditory fMRI dataset as well as on an artificial dataset and its performance has been compared with GLM method.The obtained results have clearly shown the ability of belief measures to yield a better clustering of activated voxels.From the outcome of this investigation, it is possible to conclude that the proposed framework can be employed in most fMRI data analysis methods.Also, the findings suggest that the theory of evidence can serve to understand the nature of data and to obtain relevant results that can be used and interpreted by neuroscientists.The future work aims to use DS theory in analyzing fMRI-EEG data fusion to take advantage of both modalities in order to better study the brain activity.

Fig. 3 .Fig. 4 .
Fig. 3. Flow chart of the proposed model this conception."Fig.5"illustrates the projection of the proposed model with fMRI signal to extract all intervals.

Fig. 5 .
Fig. 5. the projection of the introduced model with fMRI data (where θ corresponds to a finite set of possible hypotheses)

Fig. 7 .
Fig. 7.The activate voxels with artificial data by the proposed approach Figure "Fig.8"presents some results using the simulated data described above.These results contain the TAR and FAR measures that have been obtained with bel() threshold.The ttest statistic method has been used in order to compare the introduced method.Basically, t-test statistic method have been used to compute the TAR and FAR metrics at each p-value between 0.001 and 0.05.At first, the fMRI time series are divided in two groups i.e. determine the fMRI time series called on activation denoted as (XON) and the fMRI time series called off activation denoted as (XOFF).To determine these both groups for all fMRI signals the box-car hemodynamic response function have been employed as kernel.The plots in "Fig.9",present the TAR and FAR obtained results with t-test method

Fig. 9 .
Fig. 9. true and false activation rates by t-test method

Fig. 11 .
Fig. 11.false and true activation rate obtained by DS method

1 Fig. 13 .
Fig. 13.false discovery rate and true activation rate obtained by GLM method

Fig. 14 .
Fig. 14. some slices illustrate the activated regions: a) the result is generated by GLM method; b)the result is generated DS method

TABLE I .
AVERAGE OF TRUE ACTIVATION RATE AND FALSE ACTIVATION RATE OF THE PROPOSED METHOD AND T-TEST METHOD TO THEARTIFICIAL DATA SET

TABLE II .
AVERAGE OF TRUE ACTIVATION RATE AND FALSE ACTIVATION RATE OBTAINED BY APPLYING THE PROPOSED METHOD AND GLM TO THE REAL DATA SET