Deep Learning Algorithm for Classification of Cerebral Palsy from Functional Magnetic Resonance Imaging (fMRI)

Cerebral palsy is a neurological disorder that may arise from prenatal, perinatal or postnatal causes and that impairs motor functioning, and often mental well-being, in children. Depending on the location of the brain injury and its effect on muscle tone, cerebral palsy is classified into subgroups such as spastic and non-spastic. Each type presents different symptoms, so therapy planning and rehabilitation are decided according to the factors involved in each type. This calls for a suitable technique to classify the type of palsy at an early stage so that therapy can be planned effectively. Functional MRI of the neonatal brain helps in imaging and classification of cerebral palsy. Deep neural networks are a subset of machine learning that is widely used in image classification. In this work, a deep convolutional network with a modified AlexNet architecture is applied to functional magnetic resonance brain images of infants to classify the type of cerebral palsy, which helps the physician plan rehabilitation that improves the lifestyle of the affected children.
Keywords: cerebral palsy; deep neural network; functional magnetic resonance image


I. INTRODUCTION
Cerebral palsy is a neurological disorder caused by non-progressive brain injury or by malformation due to underdevelopment of the brain in preterm infants. It primarily affects motor actions and muscle coordination. The World Health Organization (WHO) reports that an estimated 10% of the total population is affected by some type of cerebral palsy and that 3.8% of the Indian population suffers from this neurological disorder [1]. Diagnosis of cerebral palsy is highly challenging in infants. The fMR image of an infant affected by cerebral palsy is shown in Fig. 1. Cerebral palsy may be diagnosed when a delay in motor or cognitive responses is observed during the growing phase of the child. However, a diagnosis made after the critical period has elapsed is not effective for rehabilitation. Early diagnosis at the infant stage helps physicians plan oculomotor rehabilitation, which supports neuroplasticity and eventually improves the lifestyle of the affected child [2].
The functioning of the brain can be observed through neuroimaging techniques such as Functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET). Functional magnetic resonance imaging records brain activity through the changes that occur with variation in blood flow.
fMRI is also widely used to study the reorganization of neural connectivity after early brain injuries and is regarded as a vital tool for investigating neuroplasticity. Recent research has shown that, although fMRI is widely used in adults, several challenges arise when it is used in children, especially infants, including lack of head stability during scanning and large anatomical variations in whole-brain mapping. These increase the complexity of analyzing and classifying fMRI data [3].
Existing methods for classifying cerebral palsy mostly use cerebral magnetic resonance imaging and depend largely on the severity grades of periventricular leukomalacia (PVL) anomalies. MRI-based classification mainly focuses on irregular ventricular dilation, widening of the interhemispheric fissure and long-lasting hemorrhage. Some researchers have also attempted to classify cerebral palsy by motor abnormality, namely diplegia, dyskinesia, spastic, etc. The results were promising and the classification accuracy was good when the experiments were carried out on children between the ages of 2 and 6. Very few approaches are available to classify palsy before the age of two. The study also noted that diagnosing and classifying cerebral palsy at earlier stages is challenging because gross motor functions cannot be analyzed for infants in this age group [4]. The study concluded that, out of 100 affected children, 12% were hypotonic, 81% spastic, 2% mixed CP and 5% dystonic. The mean age at presentation was 2 years and 2 months, and the male-to-female ratio was 1:2. S. Surender et al. (2016) examined the impact of cerebral palsy on the health-related quality of life (HRQOL) of children and their families and its relationship with gross motor dysfunction [7,8].
The rehabilitation council of Indian Academy of CP has acknowledged that there is positive growth in the medical interventions provided to CP affected patients. However, more scientific research and studies based on systematic documentation and validated reference materials in India may help the rehabilitation of the rural population.
However, the literature shows that classifying the type of cerebral palsy from fMRI analysis is challenging: it depends on an understanding of the functional connectivity of the brain and requires further investigation of the anatomical details relevant to the type of palsy [9,10]. These practical challenges require a robust method that analyses the fMRI obtained from infants, irrespective of the nonlinear and complex changes in the neurons, and classifies it accurately to support therapy and rehabilitation planning at an early stage.

III. METHODOLOGY
This research attempts to diagnose cerebral palsy at an early stage, particularly in the age group of six to sixteen months, and to classify the types of cerebral palsy [11]. This helps in planning better rehabilitation by overcoming the existing challenges in correlating gross motor responses with fMRI analysis. The overall flow of the proposed method is shown in Fig. 2.

A. Overview
Functional magnetic resonance imaging measures brain activity through the associated changes in blood flow. This non-invasive imaging technique can detect even small changes associated with neuronal activity. The imaging is based on the Blood Oxygen Level Dependent (BOLD) signal of the brain cells and is widely used in medical diagnosis owing to its high spatial resolution in the activated brain regions and their visibility with respect to neighboring cells. Repeated stimuli also help to eliminate imaging noise better than in normal MR images, and fMR image quality is further increased by using a spin echo pulse in accordance with the magnetic field strength. The fMR images of the children affected by cerebral palsy are acquired. The BOLD signal is highly complex and non-linear because of transient changes in the neurons and vascular structures. The images are preprocessed to remove random and other noise introduced by image acquisition and by the subjects themselves, who belong to a neonatal group. In addition, the amplitude of the thermal noise increases with the strength of the magnetic field. Fuzzy adaptive median filters are used to eliminate the noise from these test images [12]. Fuzzy adaptive filters reduce additive, salt-and-pepper or impulse noise while preserving image details better than adaptive median filters.

B. Fuzzy Adaptive Filtering
Fuzzy adaptive filtering is a modified form of median filtering that improves the visibility of an image by strengthening the smoothing effect while removing noise, compared with traditional filtering techniques. The method examines each pixel in the input image and replaces a noise-affected pixel with the median value, adapting flexibly to the intensity of the local noise. The intensity differences are measured using a sliding window function, and the mean square error is calculated from them. The noise-free image is then fed into the training phase of the deep learning network, which is built from convolution, pooling and stacked layers; the architecture of the proposed method is shown in Fig. 3.
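The filtering step described above can be sketched in plain Python. The sketch is a simplified adaptive median filter, not the fuzzy variant used in this paper: the fuzzy membership weighting is omitted, and the function names (`adaptive_median_filter`, `mean_square_error`), the maximum window size and the impulse test are assumptions made for illustration. The mean square error is taken in its standard form, the average of squared pixel differences between two images of equal size.

```python
from statistics import median


def window_values(img, r, c, half):
    """Return the pixel values in the square window of radius `half` around (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, r - half), min(rows, r + half + 1))
        for j in range(max(0, c - half), min(cols, c + half + 1))
    ]


def adaptive_median_filter(img, max_half=3):
    """Replace impulse-like pixels with a local median, growing the window when the median is unreliable."""
    out = [row[:] for row in img]
    for r in range(len(img)):
        for c in range(len(img[0])):
            for half in range(1, max_half + 1):
                vals = window_values(img, r, c, half)
                lo, med, hi = min(vals), median(vals), max(vals)
                if lo < med < hi:
                    # The median is not an extreme value, so it can be trusted.
                    # Keep the pixel unless it is itself an extreme of the window.
                    if not (lo < img[r][c] < hi):
                        out[r][c] = med
                    break
            else:
                # No window gave a reliable median: use the median of the largest window.
                out[r][c] = med
    return out


def mean_square_error(a, b):
    """Average squared difference between two images of equal size."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


# Toy example: a flat 5 x 5 image with one impulse in the centre.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 255
restored = adaptive_median_filter(noisy)
```

Comparing `mean_square_error(noisy, clean)` with `mean_square_error(restored, clean)` gives a simple measure of how much noise the filter removed on a test image whose clean version is known.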

C. Deep Learning
Computer-aided diagnosis has played a vital role in medical imaging since 1990 for the detection of microcalcifications, lung nodules, pulmonary embolism and mitotic cells. In recent years, computer-aided diagnosis has adopted convolutional neural networks (CNN) for classification of the pancreas, brain tumors and other medical applications. A major advantage of CNN is the ability to transfer the knowledge learned in pre-trained layers to a new task. This transfer learning ability can be used in two ways in medical imaging. In the first, the pre-trained CNN generates the features required for the training phase. In the second, the pre-trained CNN is used for image classification by replacing its fully connected layers with a logistic layer and training only the newly added layers. LeNet, AlexNet and GoogLeNet are three major CNNs that are widely implemented in image analysis, specifically in medical image classification. AlexNet was introduced in 2012 by Alex Krizhevsky et al. Its distinctive feature is that non-linearity is introduced into the network through ReLU, which speeds up training compared with saturating non-linear functions such as the hyperbolic tangent and sigmoid. This paper uses a modified AlexNet architecture in which the different stages of information processing are implemented in multiple hierarchical structures to improve classification accuracy. Another important advantage of AlexNet is that it helps overcome the overfitting problem [21]. The modified AlexNet has five convolution layers followed by a pooling layer. The convolution layers detect local features throughout the input image. Local structures are detected by connecting each node to a subset of spatially connected neurons.
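The training-speed advantage attributed to ReLU above is usually explained by its gradient, which does not shrink for large positive inputs, whereas the gradients of the sigmoid and hyperbolic tangent approach zero. A minimal sketch of the three derivatives follows; the function names are illustrative and are not taken from the paper.

```python
import math


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of the hyperbolic tangent: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


# Gradients at a small and a large input; the saturating functions collapse at x = 10.
for x in (0.5, 10.0):
    print(x, relu_grad(x), sigmoid_grad(x), tanh_grad(x))
```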
The same image pattern is searched for in each input image channel through three connection weights, called kernels, that are shared between the nodes in the convolution layer. The number of kernels depends on the number of parameters to be detected from the input layer. A hierarchical set of image features is obtained by adding pooling layers after the convolution layers. The max pooling layer reduces the feature map size by selecting features in overlapping or non-overlapping neighborhoods and retaining only the maximum responses. This also improves translation invariance. It is followed by the regression or softmax layer that generates the expected output. The backpropagation (BPN) algorithm, which effectively minimizes the cost function, is used to train the CNN. The proposed method consists of five convolution layers followed by two fully connected layers. The kernel size of the first convolution layer is 12, so that each unit of the feature map is mapped to a 12 x 12 neighborhood, and this layer normalizes the local response, convolving at every fourth pixel (a stride of 4). This layer produces 96 feature maps and is followed by a pooling layer with a kernel size of 6 and a stride of 2. This is followed by convolution layer 2, with a kernel size of 6 and a stride of 2. The pooled feature maps of each layer are thus convolved in the subsequent convolution layer and fed into the two fully connected layers for the rectified linear operation. The modified AlexNet yields 4096 features for each image, as tabulated in Table I. Deep learning is a discipline of machine learning that has recently been used for object recognition in large volumes of data. It is implemented by increasing the number of layers in an artificial neural network, with each layer designed to extract a particular feature that improves image classification [13,14].
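The kernel sizes and strides quoted above determine the spatial size of each feature map through the standard formula for an unpadded convolution or pooling layer, floor((n - k) / s) + 1. The sketch below applies it to the first three stages described in the text. The 227 x 227 input size and the absence of padding are assumptions for illustration only; the paper does not state them.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel size, stride) for the first three stages described in the text.
STAGES = [("conv1", 12, 4), ("pool1", 6, 2), ("conv2", 6, 2)]


def trace_sizes(input_size, stages):
    """Return the feature map size after each stage, in order."""
    sizes = []
    n = input_size
    for name, kernel, stride in stages:
        n = out_size(n, kernel, stride)
        sizes.append((name, n))
    return sizes


# Hypothetical 227 x 227 input (an assumption, not a value from the paper).
print(trace_sizes(227, STAGES))
```

Changing the assumed input size or adding padding changes every downstream size, which is why these values need to be fixed before the fully connected layers can be dimensioned.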
After preprocessing, the images are split into subclasses for calculating the gradient of the images in each data set; this reduces the volume of data, and parallel processing reduces the computation time [15,16].
Convolutional neural networks (CNN) are widely used deep feed-forward networks for image classification. A CNN is similar to a conventional feed-forward neural network; the main difference is the connection pattern between adjacent layers. A conventional feed-forward network connects all nodes between adjacent layers, which introduces a large number of parameters and leads to problems such as overfitting and slow training, whereas a CNN connects each node to only a few nodes of the previous layer. The CNN achieves this through two kinds of layers: convolution and pooling layers. The former uses only a portion of the previous layer as input, preferably of size 3 x 3 or 5 x 5, but analyzes this limited input deeply to gain maximum abstraction of the required features. The latter reduces the matrix size from the previous layers and hence minimizes the number of parameters in the entire network [17], [18]. This increases the speed of computation and also avoids the overfitting problems seen in fully connected feed-forward networks.
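The saving in parameters that comes from local connections and shared weights can be made concrete with a small count. The sketch below compares a fully connected layer with a convolution layer; the layer sizes (a 64 x 64 single-channel input, 16 kernels of size 3 x 3) are hypothetical and chosen only for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a convolution layer with square kernels shared across positions."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Hypothetical sizes: a 64 x 64 grayscale image mapped to a layer of the same size,
# versus 16 feature maps produced by 3 x 3 kernels.
pixels = 64 * 64
print(dense_params(pixels, pixels))
print(conv_params(3, 1, 16))
```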

1) Convolution layer:
The features identified to represent the input images are extracted in the convolution layer, which consists of neurons arranged to form feature maps. These feature maps are connected to neighboring neurons in the preceding layer through predetermined weights [19], [14]. These weights are used to form new feature maps by convolving them with the input and passing the result through a non-linear activation function. Although all neurons within a feature map share the same weights, different feature maps have different weights, so different features are extracted at each level. The output of the n-th feature map is denoted by Y_n and is computed as follows,
$$Y_n = f(W_n * x)$$
where x is the input, W_n is the convolution window for the n-th feature map, * denotes convolution, and f(.) is the non-linear activation function that extracts the non-linear features of the given input image.
2) Pooling layer:
This layer reduces the spatial resolution of each feature map and thereby increases the spatial invariance to input distortions. With an average pooling function, the average value of all the input neurons in a relatively small neighborhood window is propagated to the next layer. With the maximum pooling function, the maximum value is propagated instead, by selecting the largest element of the receptive field as follows,
$$Y_{nij} = \max_{(p,q) \in R_{ij}} x_{npq}$$
where Y_{nij} is the output of the n-th feature map at location (i,j), x_{npq} is the element at (p,q), and R_{ij} is the pooling region, that is, the receptive field around location (i,j).
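The feature map computation described above, a convolution of the input x with the window W_n followed by the activation f, can be sketched in plain Python. This is a minimal single-channel illustration with stride 1 and no padding; the function names and the choice of ReLU as f are assumptions. As in most deep learning frameworks, the kernel is applied without flipping (cross-correlation).

```python
def relu(v):
    """Rectified linear activation."""
    return max(0.0, v)


def conv2d_valid(x, w, activation=relu):
    """Slide kernel w over image x (stride 1, no padding) and apply the activation to each sum."""
    kh, kw = len(w), len(w[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            activation(
                sum(w[a][b] * x[i + a][j + b] for a in range(kh) for b in range(kw))
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy input and two toy kernels: each kernel yields its own feature map.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
pick_bottom_right = [[0, 0], [0, 1]]
diagonal_difference = [[1, 0], [0, -1]]
map_a = conv2d_valid(image, pick_bottom_right)
map_b = conv2d_valid(image, diagonal_difference)
```

The two kernels give different feature maps from the same input, which is the sense in which different feature maps carry different weights.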
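Max pooling as described above can be sketched as follows. The sketch handles a single feature map, takes the window size and stride as parameters, and supports both non-overlapping (stride equal to the window size) and overlapping (smaller stride) neighborhoods. The function name and the toy values are illustrative.

```python
def max_pool2d(x, size=2, stride=2):
    """Propagate the largest element of each size-by-size window of feature map x."""
    out_h = (len(x) - size) // stride + 1
    out_w = (len(x[0]) - size) // stride + 1
    return [
        [
            max(
                x[i * stride + p][j * stride + q]
                for p in range(size)
                for q in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
non_overlapping = max_pool2d(feature_map, size=2, stride=2)
overlapping = max_pool2d(feature_map, size=2, stride=1)
```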

3) Stacked layers:
When more features must be extracted for image classification, several convolution and pooling layers are stacked over one another. The softmax operator is used because it is widely used for classification problems trained with the Back Propagation Training (BPT) algorithm. The output of the final stack is a vector function f(x) that expresses the confidence in classifying the input x into each class, and it is obtained by summing the class scores of each layer. The class scores are not integers but floating point values that are generally unbounded, whereas the final softmax output is a multidimensional vector whose elements are bounded between 0 and 1. The softmax function assigns the largest share of the distribution to the maximum score and distributes the remaining share among the other elements. This property makes the method well suited to interpreting images in classification problems.
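The behaviour of the softmax operator described above can be illustrated with a short sketch that maps unbounded class scores to values between 0 and 1 that sum to one. Subtracting the maximum score before exponentiation is a common numerical safeguard and is an implementation choice here, not something stated in the paper; the scores are made up.

```python
import math


def softmax(scores):
    """Convert unbounded class scores into probabilities that sum to one."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up class scores: the largest score receives the largest share.
probs = softmax([2.0, 1.0, -3.0])
```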

IV. RESULTS AND DISCUSSION
The fMRI data of infants are collected from the OpenNeuro, neuroimaging data, StarPlus fMRI data, CRCNS and OASIS brain databases and analyzed by synthesizing the images. The drift component, seasonal component and noise are removed by subjecting the images to fuzzy adaptive filtering.

A. Testing and Training
The data set obtained from the online sources is used in the training phase. Although machine learning does not in general require such specific categorization, this research aims at the classification of medical images, so the dataset is categorized explicitly [20][21][22][23][24]. The images are categorized by the type of palsy and the age of the infants. Sample images of infants with various types of palsy are shown in Fig. 4.

B. Pre-training
The experiments were carried out using TensorFlow, an open source framework for building and training multilayer neural networks. The weights of the convolution layers were trained on the dataset available on the website, and a screenshot is shown in Fig. 5. TensorFlow is used to write the code for the deep learning algorithm.
Four hundred fMRI images were used for training and one hundred and fifty for testing. The output nodes of the CNN are converted into class probabilities by the softmax function. The loss function measures the error between the predicted class and the actual class. The major challenge in training a CNN on medical images is the limited availability of labelled data sets. Very few fMRI datasets are available for research, namely neuroimaging data, StarPlus fMRI data, CRCNS data, etc.
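The paper does not name the loss function. A common choice when the output layer is a softmax is the categorical cross-entropy, which is assumed in the sketch below; the function names and the small constant `eps` that guards against log(0) are illustrative.

```python
import math


def cross_entropy(probs, true_index, eps=1e-12):
    """Negative log of the probability assigned to the true class."""
    return -math.log(max(probs[true_index], eps))


def mean_cross_entropy(batch_probs, true_indices):
    """Average cross-entropy over a batch of predictions."""
    losses = [cross_entropy(p, t) for p, t in zip(batch_probs, true_indices)]
    return sum(losses) / len(losses)


# The loss is small when the true class receives high probability and large otherwise.
confident_right = cross_entropy([0.9, 0.05, 0.05], 0)
confident_wrong = cross_entropy([0.9, 0.05, 0.05], 1)
```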
This challenge is overcome by transfer learning: the weights used in training on a smaller dataset are derived from a large dataset, on the assumption that the required image features are shared between the two data sets. The final output layers, however, were trained with real-time fMRI images obtained from the cerebral palsy society, as indicated in Table II. The training of the network was repeated on the available dataset until the loss function was minimized by adjusting the initial weights [23], [25]. The training progress is measured continuously by plotting the training loss against the iterations. The number of images in each iteration is set to twenty. The accuracy on each data set is also examined and plotted beside the loss function, as indicated in Fig. 6.
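The idea of keeping transferred weights fixed and training only the final layer can be shown with a deliberately small toy, written in plain Python rather than TensorFlow. It is not the network used in this paper: `frozen_features` is a hypothetical stand-in for the pre-trained convolution layers, the head is a single logistic unit for two classes, and the data, learning rate and epoch count are made up.

```python
import math


def frozen_features(x):
    """Stand-in for pre-trained layers; nothing in this function is updated during training."""
    return [x[0] + x[1], x[0] - x[1], 1.0]


def predict(head, feats):
    """Logistic output of the trainable head for one feature vector."""
    z = sum(w * f for w, f in zip(head, feats))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=50, lr=0.5):
    """Train only the head weights by stochastic gradient descent; return weights and per-epoch loss."""
    head = [0.0, 0.0, 0.0]
    feats = [frozen_features(x) for x in samples]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for f, y in zip(feats, labels):
            p = predict(head, f)
            total += -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
            # Gradient of the logistic loss with respect to the head weights.
            head = [w - lr * (p - y) * fi for w, fi in zip(head, f)]
        losses.append(total / len(feats))
    return head, losses


# Made-up two-class data: label 1 when the first value exceeds the second.
samples = [(1, 0), (0, 1), (2, 0), (0, 2)]
labels = [1, 0, 1, 0]
head, losses = train_head(samples, labels)
```

Plotting `losses` against the epoch index gives the kind of loss curve the text describes, here for the toy head only.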

C. Comparative Result Analysis
The experiment included training and testing phases, with each set running for 150 epochs. The accuracy tended to increase with each training set and the losses decreased correspondingly. Classification of fMRI differs from classification of MR images and is more complex because of its three-dimensional time series nature. The training session focused on achieving high accuracy and eliminating false data with the best chosen hyperparameters. The CNN with the three-layer model and twice the length size converged with the highest accuracy of 66.8%. The performance of this algorithm can be analyzed through the confusion matrix shown in Table III. The image data set is categorized into five image groups of twenty images each. The confusion matrix identifies the number of images correctly classified by type of cerebral palsy.
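How a confusion matrix and the overall accuracy are obtained from predicted and actual labels can be sketched as follows. The labels in the example are made up for illustration and are not the results reported in Table III; rows index the actual class and columns the predicted class.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Count images by (actual class, predicted class)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of images on the diagonal, i.e. classified as their actual class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels for three classes (not data from the paper).
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(actual, predicted, 3)
```

Off-diagonal entries show which types are confused with one another, which the single accuracy figure does not reveal.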