A Learning-based Correlated Graph Model for Spinal Cord Injury Prediction from Magnetic Resonance Spinal Images

Machine learning is a promising new area in epidemiological research on spine surgery. It comprises a family of algorithms that identify patterns in data. Compared with traditional regression techniques, machine learning requires less a priori knowledge of predictors and copes better with very large datasets. Recent research has made significant progress toward using machine learning more effectively in spinal cord injury (SCI), where its algorithms are employed to analyze both traumatic and non-traumatic injuries. Non-traumatic spinal cord injuries often reflect degenerative spine conditions that cause spinal cord compression, such as degenerative cervical myelopathy. This article proposes a novel correlated graph model (CGM) that adopts correlated learning to predict various outcomes reported for traumatic and non-traumatic SCI. In the studies reviewed, machine learning is used for several purposes, including imaging analysis and prediction from epidemiological datasets. We discuss how clinical predictive models based on machine learning compare with traditional statistical prediction models. Finally, we outline the steps that must be taken for machine learning to become a more prevalent statistical analysis method in SCI.


I. INTRODUCTION
The spinal cord (SC) is the key conduit for motor and sensory impulses between the brain and the peripheral nervous system. Together, the SC and the brain make up the central nervous system. The SC has a tubular structure consisting of grey matter (the cell bodies of neurons) and white matter, which includes the spinal tracts [1]. SCI is caused by damage to the spinal tracts that carry information, and it results in damage to the motor and nervous systems [2]. Patients may experience paralysis or have their organs cease working properly due to an SCI. SCI patients can be evaluated more precisely thanks to the motor and sensory scores defined by the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI). The American Spinal Injury Association (ASIA) created these scores, which have since been revised multiple times [3]. Because they are connected with functional status, they are crucial for determining an SCI patient's prognosis in a therapeutic rehabilitation program [4]. However, clinical evaluation based on ISNCSCI scores has limitations for the reliable diagnosis of SCIs: it relies on the patient's input, which is subjective and becomes ambiguous when there is concurrent damage to other organs [5].
Conventional MRI is frequently employed for the diagnosis of SCI. MRI is a medical imaging technique that creates detailed macroscopic images of organs and tissues. It uses high-contrast greyscale images to discriminate between hard and soft tissues [6]. Diffusion Tensor Imaging (DTI) is a more recent technique that employs echo-planar MRI data. It tracks the movement of water molecules through SC and brain tissue, which is shaped by the tissue's architecture and structure [7]. DTI is used to study pathological conditions and disorders such as multiple sclerosis, hypertensive encephalopathy, and brain tumours. It gives quantitative data on the magnitude and orientation of diffusion in three-dimensional (3D) space for each tissue; the directional dependence of diffusion is called diffusion anisotropy. The diffusion tensor can be pictured as a set of diffusion ellipsoids [8]. Each diffusion ellipsoid's orientation is specified by a group of vectors, called eigenvectors, and each eigenvector is paired with an eigenvalue that gives the magnitude of diffusion along that direction. DTI allows diffusion anisotropy to be expressed as fractional anisotropy (FA) [9]. FA, which ranges from 0 to 1, is frequently used to determine the degree of fibre integrity since it is sensitive to the number of directionally aligned fibres per voxel. The FA value measures the anisotropy of water diffusion, with a higher FA value indicating a higher degree of anisotropy [10]. In this research, we provide a new machine learning-based SC analysis technique. Specialists play a central role in diagnosing SCI [11]. Currently, decisions are made from human specialists' analysis of FA values and DTI images. If we can provide them with more factual data, they will be able to diagnose more precisely. Classification systems are used in machine learning to make predictions or diagnose problems [12] - [13]. However, classification tasks call for training data.
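The FA measure described above can be illustrated with a short sketch. It uses the standard definition of fractional anisotropy in terms of the three eigenvalues of the diffusion tensor; the function name and the example eigenvalues are illustrative assumptions and are not taken from this study.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three diffusion-tensor eigenvalues.

    Uses the standard definition
    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2)).
    """
    mean = (l1 + l2 + l3) / 3.0
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0.0:
        # No diffusion measured: FA is undefined, so 0 is returned here
        # as a convention chosen for this sketch.
        return 0.0
    return math.sqrt(1.5 * num / den)

# Hypothetical eigenvalues: equal in all directions (isotropic) versus
# diffusion along a single direction (fully anisotropic).
fa_isotropic = fractional_anisotropy(1.0, 1.0, 1.0)
fa_single_axis = fractional_anisotropy(1.0, 0.0, 0.0)
```

Equal eigenvalues give an FA of 0, while diffusion confined to one axis gives an FA of 1, which matches the 0 to 1 scale described in the text.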
We create a training dataset for our method using four FA values from slice images of patients and healthy controls. The base dataset is then expanded to a higher-dimensional dataset with 15 features, from which the target dataset is abstracted to increase classification precision [14]. Prediction accuracy on the generated dataset is higher than 90%. Our two contributions are the use of a classification technique to predict SCIs and the creation of a training set from DTI images. The data associated with a single person amount to more than 200 MB [15], and this volume of raw data is a major obstacle to computer-aided diagnosis. The main research challenge is the difficulty of predicting spinal cord injury accurately from a limited number of samples. Some existing machine learning approaches fail to achieve good accuracy because of the small sample size, but this can be addressed with advanced learning approaches. This motivates us to adopt a novel learning approach that predicts spinal cord injury accurately and improves accuracy with the available samples. In our approach, we take the raw DTI data and extract meaningful numerical information that we subsequently use for diagnosis. Our method can be used in any field in which DTI is used for diagnosis. This work aims to validate the efficiency of the proposed CGM and explores its prediction ability through the construction of SCI-based functional connectivity. Here, the behavioural relationship between the injury regions is analyzed using an available online dataset. The proposed CGM constructs the connectivity pattern among the injured regions to predict their differences from other regions. The experimental outcomes demonstrate that the proposed CGM-based prediction model significantly outperforms the competing approaches and is more reliable and robust in its predictions.
www.ijacsa.thesai.org
The work is organized as follows: Section II offers a comprehensive analysis of prevailing approaches; Section III gives a detailed analysis of the proposed graph-based model and correlation analysis. In Section IV, the numerical analysis of the proposed model is provided, and the results are discussed. The summary is provided in Section V.

II. RELATED WORKS
Machine learning is a broad field that applies computational models to many real-world situations. Its primary objective is the development of algorithms that learn from the information in a database. It can be employed for tasks such as recognition, diagnostics, planning, robot control, and prediction. Machine learning can also be used to analyze neuroimaging data and predict tissue toxicity [16]. Both tasks rely on pattern recognition, which requires the identification of numerous important variables. Researchers now work with enormous volumes of data that must be managed, analyzed, and used. Large amounts of data may conceal significant linkages and correlations, which machine learning can uncover. It also enhances the effectiveness of system and machine design. Medical image analysis, lesion segmentation, and computer-aided diagnosis have become key application areas for machine learning.
Classification is a major task in biological studies, and machine learning is central to it. Given a well-characterized dataset, machine learning can predict the class of unknown samples. It can distinguish two or more disparate items, group related objects, or separate different objects. During classification, objects are categorized based on their characteristic features, and each item is given a class label indicating the category to which it belongs ("patient" or "normal") [17]. A model is fitted on labelled training data and then applied to unlabelled test data. The test dataset contains unidentified samples whose class labels must be established, and it is used to evaluate the performance of the model learned from the training dataset. Two popular classification methods are k-nearest neighbours (k-NN) and support vector machines (SVM). k-NN is an instance-based classifier that operates in feature space and bases each classification decision on the nearest data points.
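As a minimal sketch of the k-NN idea described above, the following function labels a query sample by majority vote among its nearest training samples. The function name, the Euclidean distance, the value of k, and the toy feature values are assumptions made for illustration; they are not the classifier configuration used in this work.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training sample.
    dists = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k closest samples (stable sort keeps ties in input order).
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = {}
    for idx in nearest:
        label = train_y[idx]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical two-feature samples with the class labels used in the text.
features = [[0.20, 0.20], [0.25, 0.30], [0.60, 0.70], [0.65, 0.60]]
labels = ["patient", "patient", "normal", "normal"]
predicted = knn_predict(features, labels, [0.22, 0.25], k=3)
```

With an odd k and two classes the vote cannot tie, which is one common reason for choosing k = 3 in small examples.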
Feature selection removes redundant and irrelevant features while preserving classification accuracy. It is widely applied before classification in statistical machine learning and yields a more powerful and stable predictor. In addition to handling noisy data, feature selection helps manage exceedingly large datasets, and it makes classification faster and more efficient by reducing the dimensionality of the data. Examples of feature selection methods include Clearness-Based Feature Selection (CBFS), Feature Selection based on a Distance Discriminant (FSDD), R-value-based Feature Selection (RFS), and ReliefF. The R-value [18] is a statistic that measures the region of overlap between classes in a feature. The FSDD method is based on identifying features that promote separability between classes while maximizing the proximity of samples within the same class. ReliefF is one of the most effective feature selection methods. Its concept is to estimate feature weights iteratively based on how well they distinguish between nearby examples. CBFS is a feature selection technique based on the "CScore" metric, which measures how many samples are located in the correct class region. An alternative feature selection approach is based on the Lasso. This method establishes a scoring system to determine the "quality" of each feature. Several samples are drawn from the training data, high-relevance feature orderings are chosen for each sample, and finally the highly relevant features are combined. This study uses feature selection to assess each feature's discriminative power and to identify the most distinctive feature subset [19] - [20].
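The iterative weighting idea behind ReliefF can be sketched with the simpler two-class Relief algorithm, which ReliefF generalizes. This is a simplified stand-in rather than ReliefF itself or the selection procedure used in this study: it uses a single nearest hit and miss, assumes every class has at least two samples, and the function name, iteration count, and toy data are illustrative assumptions.

```python
import numpy as np

def relief_weights(x, y, n_iter=100, seed=0):
    """Basic two-class Relief: reward features that differ on the nearest
    sample of the other class (miss) and agree on the nearest sample of the
    same class (hit). Assumes each class has at least two samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = x.shape
    span = x.max(axis=0) - x.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant features
    rng = np.random.default_rng(seed)
    weights = np.zeros(n_features)
    for _ in range(n_iter):
        i = rng.integers(n_samples)
        diffs = np.abs(x - x[i]) / span      # range-normalized differences
        dist = diffs.sum(axis=1)             # Manhattan distance to sample i
        dist[i] = np.inf                     # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        weights += diffs[miss] - diffs[hit]
    return weights / n_iter

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
toy_x = [[0.00, 0.3], [0.10, 0.9], [0.05, 0.5],
         [1.00, 0.4], [0.90, 0.8], [0.95, 0.6]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_weights = relief_weights(toy_x, toy_y)
```

Features with larger weights are the ones that best separate neighbouring samples of different classes, so ranking by weight gives a simple selection criterion.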
T1- and T2-weighted imaging in the sagittal and axial planes has been used to assess SCI. Clinical evaluation was performed to gauge neurological damage, its severity was measured using MRI, and the level of signal change was linked to the clinical outcomes [21]. We apply a classification technique to distinguish patient image slices automatically. The system produces results quickly and accurately and integrates algorithms easily. Classification accuracy is a key indicator of prediction quality, and many researchers have tried to increase it through improvements to the algorithm or the dataset. FA values are obtained from DTI to help locate the affected area. Even though FA values can confirm SCI, human experts rely heavily on the personal knowledge they have gained from earlier assessments of T1- or T2-weighted images. An automated system that identifies SCI, assesses the condition of the affected area, and offers pertinent data would therefore be beneficial [22].
Since there is currently no cure for SCI, individuals with motor-related injuries have little chance of sustained recovery of voluntary movement over the long term (more than a year after the injury) [23]. Based on recent reports of effective partial functional recovery, there is growing evidence that neuro-modulation may be viable for chronic and persistent SCI [24]. Even with these positive case studies and series, many problems remain to be solved before a conclusive clinical trial. Technological advances in implantable neuro-modulation platforms have increased the ability to customize spatial and temporal neuro-modulation for each patient. However, this large, high-dimensional flexibility necessitates effective algorithmic optimization tailored to each patient's unique pathology and underlying physiology.
Depending on the stimulation conditions, research using animal models has shown widely varying responses in voluntary movement. Further evidence of this variance was found in human studies, necessitating spatiotemporal adjustment of eSCS customized to patient-specific characteristics [25]. Finding the optimal parameters is crucial because the heterogeneity of SCIs may cause the observed variation in response to the stimulation parameters. Because there are trillions of potential configurations for a 16-lead paddle, a reliable system for choosing the best settings must be created for electrical stimulation therapy. Some studies are currently available on eSCS optimization techniques for eliciting volitional movement after cSCI. The majority of this research uses animal models, and different teams use different optimization strategies. To identify the spinal circuits and fibres recruited by eSCS, one study developed a computer model that merged a 3D finite element approach with a rat spinal cord model. The ideal parameters for standing and walking rats were predicted using different electrode configurations. Bayesian optimization was used to choose bipolar stimuli that elicit specific muscle responses, evaluated by electromyography (EMG), in spinally transected rats [26]. Another animal experiment focused on enhancing stepping by adjusting the stimulus intensity and the time between pulses. In some studies, staff determined the ideal frequency and intensity subjectively, and these parameters were then re-evaluated using kinematic data, EMG, and various stimulation pulse intervals. To find the best parameter combinations, one study used 3D kinematic data recorders and quantitative gait characteristics [27] - [28].
Even less research has been done on optimizing stimulation in humans than in animal models. The investigations that exist have tested various optimization goals and techniques. In one, a map of each participant's motor neuron activation was produced to identify where the spinal cord was engaged during particular muscle movements, and the best electrode combinations were identified through simulations with a computerized model [29]. EMG mapping data have also been used for this selection [30]. The limitations of current optimization approaches make it evident that an all-encompassing strategy is needed for choosing the best model for SCI prediction. The major research gap is the lack of a proper methodology for feature selection and classification, particularly with small datasets, which may lead to poor prediction outcomes. Therefore, this research concentrates on modelling an efficient approach for prediction.

III. METHODOLOGY
This section gives a detailed explanation of the proposed model for soft tissue prediction using learning concepts. We first give a brief overview of the terminology, including definitions of graphs and graph signals. Next, we build functional connectivity (FC) patterns from the spinal images using our correlated graph model (CGM). Support vector regression (SVR) is then used to determine how the FC patterns relate to the corresponding behavioural measure; refer to the framework in Fig. 1. Finally, we use simulated data to validate the proposed CGM.

A. Graph Model
Consider the functional connectivity network (FCN) to be an undirected, connected, weighted graph $G = (V, E, W)$, where $V$ is a set of $N$ nodes (ROIs), $E$ is the set of edges, and $W = [w_{ij}] \in \mathbb{R}^{N \times N}$ is a weighted adjacency matrix that is symmetric (and frequently sparse), with $w_{ij}$ representing the degree of similarity between nodes $i$ and $j$. Let $D$ be the degree matrix, that is, the diagonal matrix whose $i$-th diagonal element is $d_i = \sum_j w_{ij}$. The matrix $L = D - W$ is referred to as the Laplacian matrix. As a result, we use either the Laplacian matrix $L$ or the weighted adjacency matrix $W$, each of which uniquely describes the underlying graph, to define the pattern quantitatively.
Let $x \in \mathbb{R}^{N}$ be a signal on the graph, with $x_i$ the value at node $i$. Its overall variation with respect to the Laplacian matrix $L$ is given in Eq. (2):

$$x^{\top} L x = \frac{1}{2} \sum_{i,j} w_{ij} \, (x_i - x_j)^2 \qquad (2)$$

This quantity measures the smoothness of a graph signal, that is, the size of its variation across the graph. Given that nodes with high edge weights are strongly coupled, it makes sense that when $w_{ij}$ is large, the gap between $x_i$ and $x_j$ will be narrow for a smooth signal. As a result, many machine learning methods, such as graph regularization and transductive learning, have successfully used this notion of graph smoothness. From the perspective of graph signal processing, the eigenvectors of $L$ provide a Fourier transform for graph signals, with the eigenvalues acting as graph frequencies that define different degrees of graph signal smoothness. Unlike approaches that estimate the pattern from the magnitude-squared spectral coherence between the time courses of paired ROIs and then examine brain signals at different graph frequencies, the pattern in this work is estimated by our CGM.
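The graph quantities in this subsection can be made concrete with a small numerical sketch. It assumes the standard combinatorial Laplacian (degree matrix minus weighted adjacency matrix), the quadratic form as the smoothness measure, and the Laplacian eigenvectors as the graph Fourier basis; the function names and the three-node example graph are illustrative assumptions.

```python
import numpy as np

def laplacian(w):
    """Combinatorial Laplacian: degree matrix minus adjacency matrix."""
    w = np.asarray(w, dtype=float)
    return np.diag(w.sum(axis=1)) - w

def smoothness(x, lap):
    """Quadratic form x^T L x; small values mean x varies little
    across strongly connected nodes."""
    x = np.asarray(x, dtype=float)
    return float(x @ lap @ x)

def graph_fourier(x, lap):
    """Graph frequencies (eigenvalues) and the signal's coefficients
    in the Laplacian eigenvector basis."""
    eigvals, eigvecs = np.linalg.eigh(lap)
    return eigvals, eigvecs.T @ np.asarray(x, dtype=float)

# Hypothetical three-node path graph with edge weights 1 and 2.
w = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
lap = laplacian(w)
value = smoothness([1.0, 2.0, 4.0], lap)   # 1*(1-2)^2 + 2*(2-4)^2
freqs, coeffs = graph_fourier([1.0, 2.0, 4.0], lap)
```

A constant signal has zero variation on any graph, and the smoothness equals the sum of each graph frequency times its squared Fourier coefficient, which is why low-frequency signals are the smooth ones.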

B. Correlation Analysis
Consider $K$ subjects. A Pearson correlation matrix $C^{(k)} = [c^{(k)}_{ij}]$ is computed for each subject $k$, and each between-ROI correlation coefficient is then correlated with the behavioural measure across subjects:

$$r_{ij} = \operatorname{corr}(c_{ij}, y) \qquad (4)$$

where $r_{ij}$ refers to the correlation coefficient across all subjects, $c_{ij} = [c^{(1)}_{ij}, \dots, c^{(K)}_{ij}]$ is the between-ROI correlation coefficient, and $y = [y_1, \dots, y_K]$ is the behavioural measure. Then, using a certain threshold $\tau$, we choose from the p-value matrix $P = [p_{ij}]$ the correlation coefficients that are significantly associated with the behavioural measure. Thus, the matrix $M$ is expressed as:

$$M_{ij} = \begin{cases} 1, & p_{ij} < \tau \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

With the label information of the subjects, we can drastically reduce the number of redundant or irrelevant features by employing the matrix $M$ as a guide for learning the pattern. To estimate the pattern, the CGM adds this guidance to the conventional graph learning technique. In the resulting objective, $\tilde{L}$ refers to the learned graph Laplacian matrix (FC pattern), $M$ is computed with Eq. (5), $\beta$ refers to the positive regularization parameter, and $\lVert \cdot \rVert$ and $\circ$ denote a matrix norm and the element-wise product, respectively. The CGM reduces to the conventional graph learning methodology when the threshold admits every connection or there is no correlation guidance, as in Eq. (6), where the matrix $M$ then imposes no restriction.
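The correlation-guided template described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the p-values are approximated with the Fisher z-transform and a normal distribution (an exact t-test could be substituted), and the function name, the default threshold of 0.05, and the synthetic inputs are hypothetical.

```python
import math
import numpy as np

def correlation_template(conn, behaviour, p_threshold=0.05):
    """Zero-one template from subject-wise connectivity and a behaviour score.

    conn: array of shape (subjects, N, N) holding between-ROI correlations.
    behaviour: array of shape (subjects,) with the behavioural measure.
    Requires more than three subjects for the Fisher approximation.
    """
    conn = np.asarray(conn, dtype=float)
    y = np.asarray(behaviour, dtype=float)
    n_subjects = conn.shape[0]
    y_std = (y - y.mean()) / y.std()
    centred = conn - conn.mean(axis=0)
    sd = conn.std(axis=0)
    sd[sd == 0] = np.inf  # constant entries (e.g. the diagonal) get r = 0
    # Pearson correlation of every ROI pair with the behaviour, across subjects.
    r = (centred * y_std[:, None, None]).mean(axis=0) / sd
    r = np.clip(r, -0.999999, 0.999999)
    # Approximate two-sided p-value via the Fisher z-transform (assumption).
    z = np.arctanh(r) * math.sqrt(n_subjects - 3)
    p = np.vectorize(math.erfc)(np.abs(z) / math.sqrt(2.0))
    mask = (p < p_threshold).astype(int)
    np.fill_diagonal(mask, 0)
    return r, p, mask

# Hypothetical data: 20 subjects, 3 ROIs; only the (0, 1) connection
# tracks the behaviour score.
scores = np.arange(20, dtype=float)
stack = np.zeros((20, 3, 3))
stack[:, 0, 1] = 0.04 * scores
stack[:, 1, 0] = 0.04 * scores
stack[:, 0, 2] = 0.1 * (-1.0) ** np.arange(20)
stack[:, 2, 0] = stack[:, 0, 2]
r_mat, p_mat, template = correlation_template(stack, scores)
```

Lowering `p_threshold` keeps fewer connections in the template, so the threshold directly controls how selective the guidance is.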
It is important to note that the first term of the objective matches the learned graph to the observed spinal cord injury image by minimizing the variation of the observations on the learned graph. The second term helps to further eliminate redundant features by regulating the sparsity of the learned Laplacian: the graph becomes denser as $\beta$ grows, and vice versa. Additionally, the first constraint is imposed as a normalization that safeguards against trivial solutions, while the second and third constraints guarantee that the learned matrix is a legitimate positive semidefinite Laplacian matrix. Cross-validation is used to identify the hyper-parameter $\beta$.
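For orientation, one objective that is consistent with the description above is written out below. It is a reconstruction under assumptions and should not be read as the authors' exact formulation: it follows the standard smooth-graph-learning problem (a trace term for signal variation, a norm penalty weighted by $\beta$, a trace normalization, and Laplacian validity constraints), and the way the template $M$ enters through the element-wise product, the choice of the Frobenius norm, and the data matrix $X \in \mathbb{R}^{N \times T}$ of ROI time courses are all assumed.

```latex
\begin{aligned}
\min_{L} \quad & \operatorname{tr}\!\left(X^{\top} L X\right)
                 + \beta \,\lVert M \circ L \rVert_F^{2} \\
\text{s.t.} \quad & \operatorname{tr}(L) = N, \\
                  & L_{ij} = L_{ji} \le 0 \quad (i \ne j), \\
                  & L \mathbf{1} = \mathbf{0}.
\end{aligned}
```

In this form the trace term is the total variation of the observations on the graph, the trace constraint fixes the overall scale so that $L = 0$ is excluded, and the last two constraints make $L$ a valid Laplacian, which is positive semidefinite by construction.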
To create subject-specific patterns that share a common template matrix $M$, the CGM objective accounts for both the link between the patterns and the behavioural measure and the graph organization between ROIs. Specifically, after obtaining the learned Laplacian matrices by convex optimization, we use the corresponding graph weight matrices of the individual subjects to represent their patterns. The FC features of each individual are given by the vector formed from the upper-triangular part of the symmetric weight matrix.
From these results, the CGM successfully recovers patterns that have more power to discriminate between people, as they are intrinsically relevant to the targeted behavioural measure, suggesting superior performance in behavioural prediction. To verify the efficacy of the suggested CGM, we therefore perform a regression analysis, using linear SVR with default settings, to determine how the generated patterns relate to the behavioural measure. Fig. 1 shows the framework used to train the prediction model, including how subjects are split into training and test sets. After the FC patterns of the test subjects are generated, they are fed into the predictive model trained on the training set to produce predicted behavioural measures. Although there are many other regression models, the focus of this study is the effectiveness of the proposed CGM for estimating the optimal pattern rather than the choice of the best regression model.
We verified the CGM on simulated data from a random weighted network. The random graph was created in two steps. First, the graph's structure was generated by connecting each pair of nodes with a fixed probability and giving each resulting edge a uniformly distributed random weight. Then, the edge weights were multiplied element-wise by a random zero-one template matrix $M$ to produce the graph weight matrix. Graph signals are then produced as linear combinations of the first four eigenvectors of the graph Laplacian matrix $L$:

$$x = \sum_{i=1}^{4} a_i u_i$$

where $u_i$ refers to an eigenvector of $L$ and $a_i$ is a uniform random variable. We inferred the graph structure from these graph signals for different values of $\beta$ and identified the value that produced the highest level of performance, as determined by the Normalized Mutual Information (NMI) between the learned graph Laplacian matrix and the ground truth. The learned graph Laplacian matrices are shown in Fig. 2. To emphasize the significance of supplying the CGM with an appropriate zero-one template matrix, we also used a separate zero-one template matrix $\hat{M}$, which had 40 randomly generated elements. Table I compares the NMI performance of the CGM with $M$ and $\hat{M}$.
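The simulation procedure can be sketched in a few lines. Because the node count, connection probability, weight range, and coefficient range are not recoverable from the text, every default value below is a placeholder assumption, as are the function and parameter names; "first four eigenvectors" is taken to mean those with the four smallest eigenvalues, and the template multiplication step is omitted for brevity.

```python
import numpy as np

def simulate_graph_signals(n_nodes=20, edge_prob=0.3, n_signals=100, seed=0):
    """Random weighted graph plus smooth signals built from its first four
    Laplacian eigenvectors. All default values are placeholders."""
    rng = np.random.default_rng(seed)
    # Step 1: connect each node pair with probability edge_prob.
    structure = np.triu(rng.random((n_nodes, n_nodes)) < edge_prob, k=1)
    # Give each selected edge a uniformly distributed random weight.
    upper = np.triu(rng.uniform(0.1, 1.0, (n_nodes, n_nodes)), k=1) * structure
    w = upper + upper.T
    lap = np.diag(w.sum(axis=1)) - w
    # Step 2: signals are random linear combinations of the four
    # eigenvectors with the smallest eigenvalues (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(lap)
    coeffs = rng.uniform(-1.0, 1.0, (4, n_signals))
    signals = eigvecs[:, :4] @ coeffs
    return w, lap, signals

weights, lap, signals = simulate_graph_signals()
```

Signals generated this way are smooth on the graph by construction, which is what makes them suitable ground truth for testing whether a graph learning method can recover the underlying structure.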

IV. EXPERIMENTAL SETTINGS
In our experiments, we used resting-state SCI data to predict various behavioural measures. SVR (with LIBSVM) was used to predict individuals' behaviour from the FC patterns derived from the CGM, evaluated with 5-fold cross-validation (CV). The entire group of patients was randomly divided into five disjoint subsets of roughly equal size; each subset in turn served as the test set while the other four formed the training set. This process was repeated 20 times to reduce the effect of sampling bias on the CV, and the prediction performance averaged over all 20 repetitions of the 5-fold CV was reported. The hyper-parameter $\beta$ was found by grid search using an inner CV on the training set. The prediction performance was evaluated using the correlation between the predicted and observed behavioural measures among subjects in the test set and the Root Mean Square Error (RMSE). The template matrix was generated at a given significance level. To find the optimal threshold, we examined five distinct p-value thresholds.
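The evaluation protocol above (20 repetitions of 5-fold cross-validation, scored by correlation and RMSE) can be sketched as follows. To keep the example self-contained, a ridge regressor is used as a stand-in for the SVR fitted with LIBSVM, so the numbers it produces are not comparable to the reported results; the function names, the ridge penalty, and the synthetic data are assumptions, and the inner grid search is omitted.

```python
import numpy as np

def ridge_fit_predict(x_train, y_train, x_test, alpha=1.0):
    """Stand-in regressor (ridge), used here in place of SVR."""
    mu_x, mu_y = x_train.mean(axis=0), y_train.mean()
    xc = x_train - mu_x
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    coef = np.linalg.solve(gram, xc.T @ (y_train - mu_y))
    return (x_test - mu_x) @ coef + mu_y

def repeated_kfold_scores(x, y, n_splits=5, n_repeats=20, seed=0):
    """Mean test-set correlation and RMSE over repeated k-fold CV."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    corrs, rmses = [], []
    for _ in range(n_repeats):
        # Fresh random partition into disjoint, roughly equal folds.
        folds = np.array_split(rng.permutation(len(y)), n_splits)
        for i, test_idx in enumerate(folds):
            train_idx = np.concatenate(
                [fold for j, fold in enumerate(folds) if j != i])
            pred = ridge_fit_predict(x[train_idx], y[train_idx], x[test_idx])
            corrs.append(np.corrcoef(pred, y[test_idx])[0, 1])
            rmses.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(corrs)), float(np.mean(rmses))

# Synthetic features and target, for illustration only.
rng = np.random.default_rng(1)
features = rng.normal(size=(60, 3))
target = features @ np.array([1.0, 2.0, -1.0]) + 0.01 * rng.normal(size=60)
mean_corr, mean_rmse = repeated_kfold_scores(features, target)
```

Repeating the random partition and averaging reduces the dependence of the reported scores on any single split, which is the purpose of the 20 repetitions described in the text.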

A. Dataset
MRI is significant because it discriminates soft tissues well and can capture heterogeneity and tumour changes. The publicly available online Cancer Imaging Archive dataset is used in this research. The dataset comprises CT/PET/MRI scans of 51 patients. Another dataset with 21 patients is also considered. Of these patients, 11 have pathologically verified liposarcomas, which arise in soft tissue, and 10 have leiomyosarcomas, which affect muscle cells. The cohort is composed of 9 females and 12 males, with a duration of 31 months. The ground truth distinguishing the histopathological subtypes is definite. Tumours are localized in the pelvis, biceps, and thigh. Three different types of MRI are utilized for training: T1-weighted, T2-weighted fat-saturated (T2FS), and short tau inversion recovery (STIR). The T1 sequences are acquired in the axial plane, while STIR and T2FS are acquired in diverse orientations (coronal, sagittal, and axial). The slice thickness is 5.5 mm for T1 and 5 mm for T2-weighted fat-saturated scans. The in-plane resolution is given per sequence for the T2FS, T1, and STIR scans.

B. Result Analysis
Note that in the proposed CGM, the graph model is used to evaluate and identify the functional connections to be assessed. A significance test is then performed on the degree to which the between-ROI correlation coefficients of these chosen connections are associated with the participant-wide behavioural measure of interest, as in Fig. 3. For each behavioural measure, the mean SCI functional network patterns among patients are shown for each technique separately. The correlation-based patterns are visibly denser than the FC patterns based on graph learning, and the proposed model generates patterns that are substantially less dense than both. The CGM-based patterns show greater variability across functional connections, reinforcing connections of high functional strength and suppressing weak ones. This highlights the usefulness of the CGM, which combines pair-wise correlation and graph learning. Table II and Table III report the prediction performance of the proposed model for each behavioural measure. The proposed CGM achieved the best prediction performance on both evaluation metrics. By contrasting these outcomes, we demonstrate the efficacy and efficiency of the proposed CGM. To extract more discriminative patterns for building brain-behaviour linkages, it may be advantageous to merge the temporal correlation among ROIs with the graph structure across ROIs (GL), as the CGM does.
We also examined how the parameters of the proposed CGM affected the accuracy of behavioural prediction. First, the p-value threshold used to construct the template matrix affects the overall results of the CGM for each behavioural measure. The threshold determines how many functional connections are learned in the CGM: with a lower threshold, the graph learning stage accepts fewer connections, indicating a more rigorous connection selection process. Except for WRAT prediction, the choice of threshold affects prediction performance. Second, to investigate the CGM's sensitivity to the regularization parameter, we computed the prediction results using a fixed, carefully chosen p-value threshold for creating the template matrix and 20 repetitions of 5-fold cross-validation with various hyper-parameter values. Fig. 4 presents the results. The results varied with the value of the regularization parameter, and the best-performing value was identified in each case. Finally, using the FC patterns produced by the proposed CGM, we examined the most discriminative functional connections of potential biological significance associated with these three behavioural measures. The identified connections may serve as imaging biomarkers that account for individual variation in the behavioural assessments. SVR is a supervised learning technique that assigns distinct weights to distinct features so as to fit the response values in the training set as closely as possible. In this investigation, we emphasized the connections to which the trained SVR gave more weight.
Fig. 5 and Fig. 6 depict the performance comparison of the proposed model.

C. Discussion and Analysis
In this study, we presented the CGM, which builds more discriminative FC patterns by considering both the graph structure between ROIs and the relationship between ROIs over time, in order to establish brain-behaviour correlations. The CGM was then used to predict three behavioural measures individually from resting-state SCI data in the publicly available PNC datasets. In prediction performance, this method outperformed competing pattern estimation techniques. The CGM thus offers a potentially reliable and efficient way to estimate FC patterns for examining the connections between the brain and behaviour.
Fig. 6. Accuracy comparison.
Pattern generation methods based on deep learning have drawn more attention recently, largely because of their very high prediction performance. Deep learning-based patterns are learned layer by layer from the SCI time courses and typically capture more complicated hidden information. On the other hand, deep learning techniques inevitably involve a large number of parameters. Substantial training datasets are frequently required to fit the weights and biases of the various layers, along with expensive computing power to optimize these parameters. In contrast, our proposed technique can produce discriminative patterns with fewer parameters. For small samples, this reduces over-fitting and improves generalization capacity and effectiveness. Another goal of this investigation was to gain an anatomical understanding of which functional connections account for individual variation in the relevant behavioural measure.

V. CONCLUSION
In this study, we introduced the CGM, a novel technique for creating spinal cord injury FC patterns. The proposed CGM combines two widely used FC pattern analyses, graph learning and Pearson's correlation, so that both the graph structure across ROIs and the temporal relationship between ROIs are considered. As a result, the proposed CGM shows considerable promise for improving the predictive ability of the generated FC patterns, for establishing SC injury correlations, and for gaining insight into the biological processes involved in the behavioural measures of interest. We assessed the effectiveness and efficiency of the proposed CGM by independently predicting three behavioural variables using publicly available data. The experimental findings support the superiority of the proposed CGM over other FC pattern estimation techniques, which has broad implications for brain network analysis. In future, this study can be extended by adopting a novel optimization approach to attain globally optimal outcomes in terms of accuracy and prediction. In addition, adopting a pre-trained model could reduce the time complexity effectively.