Predict Students’ Academic Performance based on their Assessment Grades and Online Activity Data

The ability to predict students' academic performance is critical for any educational institution that aims to improve its students' learning process and achievement. Although the problem of predicting students' performance has been studied widely, it remains a challenging and complex issue for educational institutions because many different features affect students' learning and achievement in courses. Moreover, the use of web-based learning systems in education provides opportunities to study how students learn and which learning behaviors lead them to success. The main objective of this research was to investigate the impact of assessment grades and online activity data in the Learning Management System (LMS) on students' academic performance. The study is based on classification, one of the most commonly used data mining techniques for prediction. Five classification algorithms were applied: decision tree, random forest, sequential minimal optimization, multilayer perceptron, and logistic regression. Experimental results revealed that assessment grades are the most important features affecting students' academic performance. Moreover, prediction models that included assessment grades alone or in combination with activity data performed better than models based on activity data alone. Also, the random forest algorithm performed well for predicting students' academic performance, followed by the decision tree.
Keywords: Predict student performance; learning management system; data mining; educational data mining; classification model


I. INTRODUCTION
Educational data mining (EDM) is an emerging field in data mining that aims to transform the data accumulated in educational systems into information that helps educational institutions make informed decisions [1]. EDM applies data mining tools and techniques in the education field to analyze student performance, predict outcomes in order to help students at risk of academic failure, and provide feedback to faculties and instructors for improving student outcomes and the learning process [2]. Most previous works have demonstrated the effectiveness of data mining in addressing various educational issues. Student performance prediction is one of the most important issues studied with data mining techniques.
Moreover, the growing use of the internet in education has produced a new context called web-based learning, or the learning management system (LMS). An LMS is a web-based application for managing online learning. It allows an educational institution to manage students, monitor their participation, and track their progress via the system [3]. An LMS can provide accurate insight into students' online activity and learning behavior because all data related to students' actions and events are monitored and recorded [4]. These data can be used to analyze students' learning behavior and create prediction models for their performance.
Predicting student performance is a crucial issue for every educational institution that aims to improve students' performance and their learning process. Based on the prediction output, an educational institution can support the students identified as low-performing. Although predicting students' performance is widely studied, it is still a challenging and complex process because students' performance is influenced by different features, such as demographic, social, academic, economic, and other environmental features [5,6]. Knowing these features helps to control their impact on student performance.
The main objective of this research is to investigate the impact of assessment and activity features from the LMS on students' academic performance. The study is based on classification, one of the most common data mining techniques for prediction. Five classification algorithms are applied to predict students' performance: decision tree (J48), random forest (RF), sequential minimal optimization (SMO), multilayer perceptron (MLP), and logistic regression (Logistic).
The rest of this paper is structured in six sections. Section 2 presents a review of the related work. Section 3 introduces the concepts and definitions related to this research. Section 4 explains the methodology followed to predict students' academic performance and identify the important features that affect it. Section 5 discusses the experimental results in relation to previous works. Section 6 presents the conclusion and limitations of this study. Section 7 provides insights about future work.

II. LITERATURE REVIEW
In recent years, many researchers have studied features extracted from the Learning Management System (LMS) and whether they can be used as predictors of students' academic performance. In [7], the researcher investigated the important behavioral indicators from the LMS for predicting student outcomes in online courses and considered indicators that reflect regular study to be important features for that purpose. Other researchers investigated the impact of students' participation in an online discussion forum on their academic performance [8,9]. In [8], the authors used qualitative, quantitative, and social network forum indicators to predict student performance, while in [9] students' performance was predicted based on participation in the discussion forum and their academic records (e.g., assignments, quizzes, and exams).
www.ijacsa.thesai.org
Moreover, the impact of students' online activity on their academic performance has been studied in different forms. In [10], the researcher looked into four features of student activity on Moodle: the number of viewed files, exchanged messages, completed quizzes, and content items created by the student. In [12], the performance of 22 students was predicted based on their academic records and the time each student spent on Moodle. In [11], however, the researchers considered online assessment data as indicators of student activity; they examined activity data from the LMS in the form of assessments and exams to improve student engagement in blended learning. Another study predicted the performance of students based on enrollment data and activity on the LMS [6]. Its authors considered the heterogeneity of different student subgroups to predict performance based on important enrollment data (e.g., gender and attendance type) and the level of online activity.
Other studies looked into different feature sets, rather than all features in the dataset, to predict students' academic performance. In [13], the authors investigated the influence of feature sets such as course features, student features, behavioral features, and past performance in the course on students' performance. Also, the authors of [14,15] examined the impact of demographic, academic, and behavioral feature sets, as well as extra features related to parents' participation in the learning process and student absence days. Furthermore, other researchers proposed the use of sub-groups (or sub-datasets) to construct effective prediction models. In [6], the students' dataset was divided into sub-datasets using enrollment and activity data to predict academic performance. Moreover, sub-datasets are used in [16] to predict student dropout at academic institutions using enrollment, first-term, and second-term data.
Many works employed feature selection algorithms to create an effective classification model by excluding irrelevant and redundant features from the dataset [9,3,17,18]. Feature selection algorithms can be divided into two basic groups: filter methods and wrapper methods. Different feature selection algorithms have been applied and compared in past works. In [19], a comparative study was conducted using filter-based methods to evaluate the performance of the classification algorithm before and after applying feature selection. Moreover, in [20] the performance of different filter-based methods was compared for predicting students' performance in the final exam. Also, in [21] researchers evaluated and compared the performance of different filter and wrapper methods on a dataset gathered for predicting students' grades in the final examination.
Classification is one of the most common data mining techniques for prediction. It is a supervised learning process that predicts the class label of the target variable for a given dataset. In a classification model, the dataset is partitioned into two sets: a training set used for the learning process and a test set used to evaluate the resulting classifier. Several classification algorithms have been used in previous works to predict students' academic performance [22]. In this research, five classification algorithms are used for predicting students' academic performance: Decision Tree (J48) [23], Random Forest (RF) [24,2,12], Sequential Minimal Optimization (SMO) [13], Multilayer Perceptron (MLP) [25], and Logistic Regression (Logistic) [26]. These algorithms were chosen because of their effectiveness in previous works on predicting students' performance.
Decision Tree (J48) is widely used for classification. J48 uses the C4.5 algorithm to construct a decision tree. The model has a tree structure consisting of three types of nodes: root, internal, and leaf nodes. The method recursively partitions the training set into subsets using the best features, selected by a merit criterion, until a termination condition is reached; termination occurs when all instances in a subset belong to the same class label [27]. Random Forest (RF) constructs multiple decision trees instead of a single decision tree. The trees are built from different samples and features selected randomly from the dataset to form the forest. RF obtains a prediction from each decision tree and selects the final prediction by a voting process [28,29].
Sequential Minimal Optimization (SMO) is an optimization technique for training a support vector machine (SVM) [6]. The SVM performs classification by finding the linear hyperplane that best separates the classes. It can also deal with non-linear classification problems by using the "kernel" technique, which maps a low-dimensional data space to a higher-dimensional one in which the data can be separated [30]. Logistic regression (Logistic) is used in classification problems for prediction based on the concept of probability. This algorithm differs from linear regression in that it uses the "logistic function" instead of a linear function to map predicted values to probabilities. The probability of a dependent variable that has a binary value is predicted from a set of independent variables [30,29]. Multilayer Perceptron (MLP) is a multilayer network of interconnected neurons. The neurons are arranged in three kinds of layers: input, hidden, and output layers. MLP uses the "sigmoid function" in the hidden and output layers to predict probabilities [30]. The algorithm learns during training by iteratively adjusting the weights using backpropagation until it attains sufficiently good output [31].
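The logistic (sigmoid) mapping mentioned above can be illustrated with a minimal sketch. The weights, bias, and feature values are hypothetical and are not fitted to the study's data; the sketch only shows how a linear score is turned into a probability.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real-valued score to a probability in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Rearranged form avoids overflow in exp() for large negative scores.
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one instance."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical weights for two features (illustration only, not fitted).
p = predict_proba([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
```

A probability at or above a chosen threshold (commonly 0.5) would then be mapped to the positive class.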
Previous works have studied the impact of different features on students' academic performance, but few have focused on the impact of assessment and activity data together. Moreover, most previous works used the whole dataset to construct the prediction models. Such comprehensive models may not be useful for identifying the effect of different features on student performance. This work contributes by investigating the impact of assessment and activity data jointly and separately. Sub-datasets are used to create prediction sub-models instead of the whole dataset; sub-datasets were previously used in [6]. This work differs from [6] by studying other features related to students' assessment data and their online activity in the form of course access and mobile course access measurements. Additionally, feature selection using two different methods is applied to identify the most important features that affect students' academic performance. Finally, the performance of the created prediction models is evaluated and compared.

III. CONCEPTS AND DEFINITIONS

A. Imbalanced Class Distribution
Imbalanced class distribution occurs in a classification problem when the number of instances in one class is significantly lower than the number of instances in the other class. The class with the small number of instances is called the minority class, while the class with the large number of instances is known as the majority class. Machine learning algorithms perform best when the classes in the dataset are almost balanced. Hence, applying machine learning algorithms to an imbalanced dataset biases the results toward the majority class [30]. Several solutions have been proposed in previous works to handle imbalance in the dataset [32]. This research looked into feature selection and sampling algorithms for solving the imbalance problem.

B. Feature Selection
Feature selection is considered one of the most important data pre-processing steps and has been used frequently in previous works to identify the relevant features as a subset of the original features in the dataset [3]. The subset produced by feature selection allows classifiers to reach optimal performance and can be a good solution for imbalanced class distribution [32,33]. This research looked into two methods of feature selection: filter and wrapper methods. The filter method uses a ranking technique to rank the features; the highly ranked features are passed to the classifier, while the other features are excluded from the dataset [3]. The wrapper method, in contrast, selects a subset of features by using an induction algorithm as a "black box" to search for a good subset [20]. Each candidate subset is scored by the estimated accuracy of the induction algorithm, obtained with an accuracy estimation technique.
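To make the filter idea concrete, the following minimal sketch ranks features by information gain. It assumes the features have already been discretized into categories; the feature names and toy values are hypothetical, and this is not WEKA's implementation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(columns, labels):
    """Return (feature, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data with hypothetical feature names (not the study's dataset).
labels = ["high", "high", "low", "low"]
columns = {
    "assignments_band": ["pass", "pass", "fail", "fail"],
    "logins_band": ["many", "few", "many", "few"],
}
ranked = rank_features(columns, labels)
```

In this toy example the first feature separates the classes perfectly and is ranked first, while the second carries no information about the class. A filter method would keep only the top-ranked features.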

C. Sampling
Sampling (or resampling) is a technique that artificially resamples the dataset to balance the number of instances in the classes [34]. It is considered a data pre-processing step and can be achieved in two ways: under-sampling the majority class and over-sampling the minority class.
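The two ways of balancing described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the WEKA filters used in the study; the function names, the `label_of` accessor, and the toy rows are hypothetical.

```python
import random

def _group_by_class(rows, label_of):
    """Collect rows into lists keyed by their class label."""
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    return groups

def random_oversample(rows, label_of, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, label_of)
    target = max(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

def random_undersample(rows, label_of, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, label_of)
    target = min(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced

# Toy imbalanced data: 8 "high" rows and 2 "low" rows (hypothetical).
rows = [(i, "high") for i in range(8)] + [(i, "low") for i in range(8, 10)]
over = random_oversample(rows, label_of=lambda r: r[1])
under = random_undersample(rows, label_of=lambda r: r[1])
```

Over-sampling keeps all original rows at the cost of duplicates, whereas under-sampling discards majority-class rows, which can matter for a small dataset.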

D. Environment
All algorithms used in this research are implemented using the Waikato Environment for Knowledge Analysis (WEKA), which was developed at the University of Waikato in New Zealand [31]. WEKA is a Java-based software tool that provides several algorithms for machine learning and data mining applications.

E. Performance Evaluation Measures
This research used different evaluation measures that have been used in the literature to evaluate and compare the performance of classification models. These measures, given in (1) to (7), are:
• Accuracy [35]: the most common measure used to evaluate the performance of classifiers; it is the ratio of correctly classified instances to the total number of instances.
• Precision [36]: evaluates model exactness. It is the ratio of true positive instances to all instances classified as positive by the classifier.
• Recall [36]: evaluates model completeness. It is the ratio of true positive instances to all instances that are actually positive.
• F-measure [36]: combines precision and recall into a single value, their harmonic mean. It is commonly used by researchers to compare the performance of different classifiers.
• Area under ROC curve (AUC) [35]: evaluates the capability of a classification model to distinguish between classes. Its value summarizes the tradeoff between the true positive rate (TPR) and the false positive rate (FPR) for a given classification model.
• Kappa value [6,29]: measures the accuracy of the classifier compared to the accuracy expected from a random classifier.
• Root Mean Squared Error (RMSE) [17]: compares prediction errors by evaluating the difference between the actual value and the predicted value.
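As a worked illustration of the measures listed above, the following sketch computes them from the four cells of a binary confusion matrix. The counts are made up, the function names are hypothetical, and AUC is omitted because it requires ranked prediction scores rather than counts. The sketch assumes at least one predicted and one actual positive, and that RMSE is taken over predicted probabilities of the positive class; tools such as WEKA may report class-weighted averages instead.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F-measure, and Cohen's kappa from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
    }

def rmse(actual, predicted):
    """Root mean squared error between actual values (0/1) and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up counts for illustration only.
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
error = rmse([1, 0, 1, 1], [1, 0, 0, 1])
```

With these counts, accuracy is 0.85 while kappa is 0.70, which shows how kappa discounts the agreement that a random classifier would reach by chance.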

IV. RESEARCH METHODOLOGY
To predict students' academic performance, the methodology proposed in this research follows five main phases: data collection, data pre-processing, sub-dataset generation, application of classification algorithms, and evaluation (see Fig. 1).

A. Dataset
Student data used in this research were obtained from the Deanship of E-Learning and Distance Education at King Abdulaziz University. The data comprise 241 records of undergraduate students, gathered from six different courses delivered from 2017 to 2019 in the Department of Information Systems, Faculty of Computing and Information Technology. Students' data include assessment grades and activity data on Blackboard. All students' data were extracted from the Learning Management System (LMS) into several Excel files: one file for students' activities on Blackboard and 26 files dedicated to the assessment grades data. During the data collection process, the file that contains students' activity and their IDs was merged with the files that include student IDs and the corresponding assessment grades. Then, the data were cleaned by deleting fields that have few entries and zero values. After that, data transformation was performed. Data transformation is a critical step that converts the data from the format of the source file to the format of the destination file. In this case, the resulting Excel file was converted into CSV format and then into ARFF format to be compatible with the WEKA software tool.
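The CSV-to-ARFF step can be sketched as below. This is a minimal illustration, assuming every non-class column is numeric and attribute names contain no spaces; the column names and relation name are hypothetical. WEKA can also load CSV files itself, so the authors may have used its own converters instead.

```python
import csv
import io

def csv_to_arff(csv_text, relation, class_column):
    """Convert CSV text to ARFF text: numeric attributes plus one nominal class."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    lines = ["@relation " + relation, ""]
    for column in columns:
        if column == class_column:
            # Nominal attribute: list the distinct class labels in braces.
            values = sorted({row[column] for row in rows})
            lines.append("@attribute " + column + " {" + ",".join(values) + "}")
        else:
            lines.append("@attribute " + column + " numeric")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(row[column] for column in columns))
    return "\n".join(lines) + "\n"

# Two made-up student records with hypothetical column names.
sample = "assignments,final_exam,performance\n9,35,high\n4,12,low\n"
arff = csv_to_arff(sample, relation="students", class_column="performance")
```

The resulting text starts with the `@relation` line, declares one `@attribute` per column, and lists the records after `@data`.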
The features extracted from the LMS include students' assessment grades and measurements of their online activity on Blackboard. These features are categorized into three major groups: assessment grades, course access measurements, and mobile course access measurements. The description of these features and their types is shown in Table I.
At King Abdulaziz University (KAU), student performance is assessed using the course grading system. In this system, each course carries a total of 100 marks distributed among the midterm exams, the final exam, and course work (e.g., quizzes, assignments, projects, and lab work). The final mark earned by a student in the course corresponds to a letter grade [37]. Hence, in this classification problem, students are classified into low-performing students, who earned grades D+, D, or F, and high-performing students, who earned grades A+, A, B+, B, C+, or C in the course.
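The class labelling described above amounts to a simple lookup, sketched here. The grade groupings come from the text; the label strings "low" and "high" and the function name are illustrative assumptions.

```python
LOW_GRADES = {"D+", "D", "F"}
HIGH_GRADES = {"A+", "A", "B+", "B", "C+", "C"}

def performance_class(letter_grade: str) -> str:
    """Map a course letter grade to the binary target class."""
    grade = letter_grade.strip().upper()
    if grade in LOW_GRADES:
        return "low"
    if grade in HIGH_GRADES:
        return "high"
    # Anything else (e.g., an incomplete grade) is outside the two classes.
    raise ValueError("unexpected letter grade: " + repr(letter_grade))

labels = [performance_class(g) for g in ["A+", "C", "D+", "F"]]
```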

B. Data Pre-Processing
Data pre-processing is an essential phase before applying classification algorithms. In this research, the pre-processing phase includes two steps: feature selection and sampling. The results of the feature selection and sampling algorithms are then compared to find the better approach for dealing with the imbalance in the dataset and enhancing the accuracy of the classification algorithms.
Feature selection is applied to select a subset of features that have a greater impact on students' academic performance. Moreover, the subset produced by feature selection allows classifiers to reach optimal performance and can be a helpful solution for imbalanced class distribution in the dataset [32,33]. Therefore, six different filter and wrapper methods are applied to the student dataset. The three filter methods are Correlation Attribute Evaluation, Information Gain Attribute Evaluation, and CFS Subset Evaluation [19,20]. In addition, three popular machine learning algorithms, Decision Tree (J48), Naive Bayes (NB), and K-Nearest Neighbor (IBK in WEKA), are used to implement the wrapper method [21]. The results of these six feature selection algorithms show that assessment grades are the most important features affecting students' academic performance.
The Correlation and Information Gain algorithms give the same high ranking to six features: assignments mark, final exam, second midterm exam, lab mark, quizzes mark, and first midterm exam. CFS Subset Evaluation selects four features as highly influential: assignments mark, quizzes mark, second midterm exam, and final exam. Among the wrapper methods, the Wrapper-J48 algorithm selects two features, assignments mark and final exam, while the Wrapper-NB subset includes four features: assignments mark, first midterm exam, final exam, and assessment access. The Wrapper-IBK algorithm selects only one feature, the assignments mark, as the most important feature.
For sampling, three algorithms are applied to the students' dataset: random over-sampling of the minority class (Resample), random under-sampling of the majority class (SpreadSubsample), and the synthetic minority over-sampling technique (SMOTE), all of which have been used in [30,34].

1) Comparison and evaluation results:
To compare the feature selection and sampling algorithms, they are applied along with five classification algorithms: J48, RF, SMO, MLP, and Logistic. Their performance is evaluated and compared using 10-fold cross-validation and the accuracy metric. The evaluation and comparison results are presented in Table II and Fig. 2. No single feature selection algorithm obtains the best accuracy for all classifiers. However, the subset produced by Wrapper-J48 performs better than the other subsets, achieving accuracy above 97.00 when classified using J48, RF, and MLP. Across both feature selection and sampling algorithms, the Resample algorithm achieves the highest accuracy with all classifiers except SMO, which obtains its best accuracy with SMOTE. SMO does not perform poorly with Resample, but it performs better with SMOTE. Therefore, the Resample algorithm is used to balance the dataset and create more accurate prediction models of students' performance. Consequently, SMO is excluded, and the remaining four classification algorithms (J48, RF, MLP, and Logistic) are used to create the prediction models of students' academic performance.

C. Generate Sub-Datasets
To investigate the impact of student assessment grades and activity data jointly and separately, the students' dataset is partitioned into six sub-datasets based on the three major groups of features (Table I). The generated sub-datasets are described in Table III.
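One way to derive six sub-datasets from three feature groups is to take every single group and every pair of groups, as in the sketch below. The group names follow the text, while the column names inside each group and the helper names are hypothetical; the actual composition is the one given in Table III.

```python
from itertools import combinations

# Hypothetical column names per feature group (the real ones are in Table I).
FEATURE_GROUPS = {
    "assessment": ["assignments_mark", "final_exam"],
    "course access": ["course_logins"],
    "mobile course access": ["mobile_logins"],
}

def sub_dataset_definitions(groups):
    """Map each sub-dataset label to its columns: all single groups and all pairs."""
    definitions = {}
    names = list(groups)
    for size in (1, 2):
        for combo in combinations(names, size):
            label = " + ".join(combo) + (" only" if size == 1 else "")
            definitions[label] = [col for name in combo for col in groups[name]]
    return definitions

def project(rows, columns, class_column="performance"):
    """Keep only the chosen feature columns plus the class column."""
    keep = columns + [class_column]
    return [{col: row[col] for col in keep} for row in rows]

definitions = sub_dataset_definitions(FEATURE_GROUPS)
rows = [{"assignments_mark": 9, "final_exam": 35, "course_logins": 40,
         "mobile_logins": 3, "performance": "high"}]
assessment_only = project(rows, definitions["assessment only"])
```

Using all three groups together corresponds to the "All features" base model rather than to a sub-dataset.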

D. Predicting Students' Academic Performance
After resampling and generating the sub-datasets, sub-models are constructed for each sub-dataset listed in Table III. Additionally, a base model is constructed using "All features" so that the performance of the sub-models can be evaluated against it. These prediction models are created using four classification algorithms: J48, RF, MLP, and Logistic. The performance of the base models and sub-models is evaluated and compared using different evaluation measures, which have been used in [6]. In this research, models are trained and tested using the 10-fold cross-validation method [3]. In this method, the dataset is divided into ten equal subsets. The procedure runs ten times; in each iteration, 90% of the instances (nine subsets) are used for training and the remaining 10% (one subset) are used for testing, with a different subset held out each time. The average of the ten results is then computed as the final result.
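The cross-validation procedure described above can be sketched as follows. This is a plain, unstratified split written for illustration; WEKA's own procedure may differ (for example, by stratifying folds by class). The `evaluate` callback and function names are hypothetical, and 241 is used only because it is the number of records in the original dataset.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each instance is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n, k, seed)]
    return sum(scores) / len(scores)

splits = list(k_fold_indices(241))
# Dummy evaluation that returns the test-set share of each split.
mean_test_share = cross_validate(241, lambda train, test: len(test) / 241)
```

Because 241 is not divisible by ten, the folds here hold 24 or 25 instances each, so the 90%/10% split is approximate.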

E. Results
To evaluate and compare the performance of the prediction models, precision, recall, F-measure, Kappa value, and area under the ROC curve (AUC) are measured first. Accuracy and root mean squared error (RMSE) are then measured as complementary measures [6]. All evaluation measures are computed using 10-fold cross-validation for all classifiers, and the best results are shown in boldface.

1) Evaluate and compare the performance of sub-models to the base model based on Precision, Recall, F-measure, Kappa value, and Area under ROC curve (AUC):
Table IV shows the results of precision, recall, F-measure, Kappa value, and area under the ROC curve (AUC) achieved by the J48, RF, MLP, and Logistic classifiers for the base model and sub-models. The results show that the sub-models created using the "assessment only", "assessment + course access", and "assessment + mobile course access" features and the base model have the same high performance with the random forest and J48 classifiers. For these sub-models and the base model, random forest achieves the highest results of 0.99, 0.98, and 1 for F-measure, kappa, and AUC, respectively, while J48 achieves the second-highest results of 0.99, 0.98, and 0.99. Moreover, the sub-model based on the "assessment + mobile course access" features outperforms its base model and the other sub-models when the MLP and Logistic algorithms are used. With MLP, this sub-model achieves higher precision and kappa values than the other models, 0.99 and 0.97, respectively. With Logistic, it obtains higher precision, recall, F-measure, and kappa values than the other models, 0.98, 0.98, 0.98, and 0.95, respectively.
However, the performance of the sub-models created using activity data only ("course access only", "mobile course access only", and "course access + mobile course access") deteriorates compared to the performance of the base model. The base model performs better than these sub-models, achieving values above 0.97, 0.94, and 0.97 for F-measure, kappa, and AUC, respectively, for all classifiers.
Among the sub-models based on activity data only, the "course access only" sub-model with the random forest classifier achieves the best results, obtaining values of 0.97, 0.94, and 0.99 for F-measure, kappa, and AUC, respectively. It is followed by the sub-model created using all activity data ("course access + mobile course access") with random forest, which obtains better results than the activity sub-models created using J48, MLP, and Logistic, with values of 0.96, 0.93, and 1.00 for F-measure, kappa, and AUC, respectively.

2) Evaluate and compare the performance of sub-models to the base model based on accuracy and root mean squared error (RMSE):
Table V shows the accuracy and root mean squared error (RMSE) achieved by the J48, RF, MLP, and Logistic classifiers for the base model and sub-models. The results show that the base model representing "all features" with random forest is superior to all other classifiers and models, achieving the highest accuracy of 99.17. In addition, with random forest the sub-models produced using the "assessment only", "assessment + course access", and "assessment + mobile course access" features obtain high accuracy close to 99.00 and a low RMSE of 0.06.
However, the two sub-models based on the "assessment only" and "assessment + mobile course access" features with the J48 classifier outperform their base model and the other sub-models, achieving the lowest RMSE of 0.04 and an accuracy of 98.92. Moreover, the sub-model generated using the "assessment + mobile course access" features with the MLP classifier outperforms its base model and the other sub-models, achieving a higher accuracy of 98.38 and a lower RMSE of 0.07. Also, the sub-model based on the "assessment + mobile course access" features with the Logistic classifier outperforms its base model, achieving a higher accuracy of 97.54.

V. DISCUSSION
This research aimed to investigate the impact of assessment and activity data on students' academic performance. Therefore, the students' dataset was analyzed using different feature selection algorithms to identify the important features that affect academic performance. Moreover, the base model and sub-models based on assessment and activity features, jointly and separately, were constructed. Also, the performance of the classification algorithms used was compared to find the best algorithm for classifying student performance.
Feature selection results revealed that the important features affecting student performance are assessment grades, especially the assignments mark and final exam. Hence, this research corroborates the findings of [11,13], which concluded that student performance is significantly influenced by assessment data. The authors of [13] compared four feature sets: student characteristics, course characteristics, LMS features, and past performance, including assessment grades. Their results demonstrated that student characteristics and assessment grades had a larger impact on student performance than the other feature sets. The authors of [11] found a strong correlation between the grades of assessments and examinations and students' final grades.
For the base and sub-models, the experimental results showed that the base model generated from the "All features" dataset and classified using the random forest algorithm outperforms the other prediction models, obtaining the best results for all performance evaluation measures, notably the best accuracy of 99.17. Moreover, sub-models that include the assessment grades, separately or jointly with activity data, obtain better results than sub-models that rely on activity data alone. Additionally, the findings reveal that the sub-model based on the "assessment + mobile course access" features performs well in predicting students' academic performance.
Researchers in [6] reported that the performance of the created sub-models was superior to the performance of the base model, and that using students' sub-datasets is effective for predicting their academic performance. The results of this study support this to some extent, in terms of the usefulness of investigating sub-datasets to predict students' academic performance and assess the impact of different features on their success in courses. However, the results also revealed that both the base model based on all features and the sub-models that included assessment data, separately or jointly with activity features, achieved high performance. This indicates that both the base model and the sub-models perform well in predicting students' academic performance.
Regarding the impact of assessment and activity data, the results showed that prediction models that include assessment grades, separately or jointly with activity data, have superior prediction results compared to models based on activity data alone. This finding indicates that assessment grades affect students' performance significantly, while activity data alone have less impact. Hence, this research corroborates the finding of the researchers in [11], who revealed a strong relationship between students' online activities, in the form of assessments and exams, and their final grades in the course. Their finding indicates the importance of assessment data in predicting students' achievement in the course, as well as the usefulness of investigating students' online activity to assess its impact on academic achievement. Furthermore, researchers reached a similar conclusion in [13]; they found that past performance in the course (including assessment grades) and student characteristics have a greater impact on student performance, while LMS features had a lower impact. The experimental results support this: activity data alone have a lower impact on student performance than assessment grades. However, assessment and activity data together enhance the accuracy of the prediction model. Hence, this finding demonstrates the importance of including assessment grades together with activity data in prediction models of students' academic performance.
However, the researchers in [11] concentrated on online assessments alone as indicators of student activity. The authors of [12] investigated only one feature of online activity data, the time spent by a student on Moodle, although Moodle (or any LMS) provides more features that can be investigated. Moreover, the dataset used in [12] included only 22 instances, which is a very small number compared to the datasets used in other previous works. In contrast, this work studied more features of student online activity than those examined in [11,12], using a dataset of 241 instances. Also, this experiment studied the impact of students' online activity in other forms, such as course access measurements and mobile course access measurements, as well as the assessment grades.
For the classification algorithms, the experimental results revealed that the random forest algorithm performs better than the other classification algorithms. This finding is in accordance with the findings reported in [12,24,2], which also found that the random forest algorithm outperforms other classification algorithms for student performance prediction using different features such as personal, academic, and activity data. Moreover, in this experiment, the random forest algorithm yields the highest performance results for the base and sub-models, followed by the decision tree algorithm with the second-highest results. Because random forest does not provide easily interpretable results, the decision tree can be considered more useful.

VI. CONCLUSION
This research aimed to investigate the impact of assessment and activity data on students' academic performance. For this purpose, different feature selection algorithms were used to identify the important features that affect students' academic performance. Also, prediction models were constructed based on assessment and activity data, jointly and separately, using four classification algorithms: decision tree, random forest, multilayer perceptron, and logistic regression.
The feature selection results revealed that the most important features affecting students' academic performance are assessment data, especially the assignments mark and final exam. For the prediction models, the results demonstrated that both the base model and the sub-models perform well in predicting students' academic performance. Random forest outperformed the other classifiers, achieving the highest accuracy for both the base model and the sub-models, followed by the decision tree. Because random forest does not provide easily understandable output, the decision tree can be considered more useful.
Furthermore, prediction models that included assessment data, separately or jointly with activity data, performed better than models based on activity data alone. This indicates that assessment data affect student performance significantly, while activity data have a lower impact. However, assessment and activity data together work better to enhance the accuracy of the prediction model. It is therefore important to include assessment data together with activity data in prediction models of students' academic performance.
However, certain limitations are observed in this research. The experiment was conducted using data of students from a specific department in one faculty. The dataset had only 241 records and 19 features. The results might differ for another dataset with more records and different features. Also, other data mining algorithms might achieve more accurate results.

VII. FUTURE WORK
In future work, this study can be extended to predict students' academic performance using data from other faculties and departments in order to generalize the results. Further work may also visualize and interpret the decision tree to obtain understandable results that help support low-performing students. Moreover, the same features can be used with other data mining techniques, such as regression to predict a student's final grade in the course and association rules to detect the relationships between students' final grades and their assessment and activity data.