Categorizing Attributes in Identifying Learning Style using Rough Set Theory

Abstract: Learning style is a crucial factor to consider in any learning process. However, determining a student's learning style remains challenging, especially in online learning. Data-driven methods such as artificial intelligence and machine learning are the latest and most popular approaches for predicting learning style, but they involve complex data and many attributes, which makes them computationally heavy. The literature-based approach, on the other hand, is limited by inconsistency between its results and actual learning behavior. Combining both approaches gives better accuracy, but it still leaves issues such as ambiguity and a wide range of attribute values. These issues can be reduced by choosing the right approach and the right categorization of attributes. Rough set theory offers a simple way to cope with ambiguity, vagueness, and uncertainty. It generates rules that can be used to predict or classify decision attributes. Because the method is based on categorical data, however, the attribute categories must be chosen with care. Hence, this research investigated several ways of categorizing attributes for identifying learning style. The results showed that the approach gives a better prediction of the learning style. Different categorizations give different results in terms of accuracy level, number of eliminated data, number of eliminated attributes, and number of generated rules. Decision making can then proceed by balancing these criteria.


I. INTRODUCTION
The paradigm shift from teacher-centered learning to student-centered learning has changed the way people learn. Conventional approaches that emphasize one size fits all are no longer compatible with current conditions [1]. Each student learns in a unique way and has a personal learning style. Moreover, the revolution in internet technology has made a wide variety of learning materials and media available [2], and technology draws students toward different ways of learning. This situation has encouraged the development of learning models designed around students' personal needs, in the form of personalized learning. In e-learning, models that were initially technology-oriented and general in nature have become more oriented to the needs, characteristics, situations, and conditions of students, using learning styles, prior knowledge, learning goals, cognitive abilities, learning interests, and motivation as parameters [3]. Therefore, identifying student learning styles is significant in the learning process.
Studies on the identification of learning styles aim to improve effectiveness and performance in learning [4]. However, the usual approach is still inefficient because it relies on a series of questionnaires, and the questionnaire results are often inconsistent with student behavior during learning. In general, learning styles can be identified through data-driven methods or literature-based methods [5]. The data-driven method transforms the questionnaire and uses sample data sets to build a learner model. The literature-based method, on the other hand, uses the user's behavior while interacting with the learning environment as an indication of instructional learning preferences [5]. Artificial intelligence and data mining methods are often used to analyze models in the data-driven approach [6], [7], whereas the literature-based approach uses simple rules to compute the learning style model [8].
Both approaches have advantages and limitations. Data-driven methods, given sufficient data and an appropriate technique, model learning styles accurately enough. However, they are often very complex and involve large amounts of data, which burdens the computing process. The literature-based approach is computationally efficient, but it is generally suitable only for modeling stationary and deterministic data, whereas learning styles often involve aspects that are dynamic, non-deterministic, and non-stationary [9].
Statistical modeling and individual machine learning methods are promising for accurate prediction [10]. An integration of the two approaches, using a stochastic process together with the literature-based approach, has been carried out, but it still leaves some problems. The approach does not distinguish learning styles significantly: the styles have similar distributions, so ambiguity remains [11]. In addition, each attribute has a wide range of values, which need to be converted to a simpler range. Given its characteristics, the rough set approach is able to resolve these issues. However, how to categorize the existing attributes appropriately still needs further study, because different choices of categories can lead to different prediction or classification accuracy [12]. Therefore, this research investigates several ways of categorizing attributes and their effects on the accuracy of identifying learning styles. This paper is organized as follows: Introduction (Section I), Theoretical Background (Section II), Research Methodology (Section III), Results and Discussion (Section IV), and Conclusion (Section V).

II. THEORETICAL BACKGROUND
A. Learning Style
Learning style is a set of cognitive, affective, and psychomotor characteristics that serve as relatively stable indicators of how students perceive, interact with, and react to the learning environment. Learning styles are the learning habits that learners prefer [13]. In studies of e-learning personalization, learning style plays an important role in the personalization model [14] [15]. It illustrates how the diversity of users' learning conditions results in different approaches and learning preferences. Some students are more interested in learning material in the form of text, video, audio, or pictures. Others may be more interested in the way the material is presented, such as in the form of concepts, examples, or case studies. This shows that the learning process follows the needs of students in accordance with their learning styles. Learning style is therefore one of the important components in the learner representation of e-learning personalization models [16].

B. Rough Set
Rough set theory was first introduced by Zdzislaw Pawlak in the early 1980s to analyze data classification in the form of information systems [17]. The theory uses a non-statistical approach to data analysis. The purpose of rough set analysis is to derive a concise set of rules from a table or data set. The results can be used in data mining and knowledge discovery. Rough sets have been used widely in many fields such as medicine, pharmacology, economics, engineering, image processing, and decision analysis. The rough set is often used to model data that are ambiguous, vague, and uncertain [18].
Rough set theory contains several important components: information systems and decision systems, indiscernibility relations, upper and lower approximations, the discernibility matrix, data reduction, generated rules, and data prediction [19]. An information system can be written as $I = (U, \Omega, V_q, f_q)$, where [17][20]: $U$ is the universe set; $\Omega$ is the set of attributes, with $\Omega = C \cup D$, where $C$ is a finite set of condition attributes and $D$ is a finite set of decision attributes; for each $q \in \Omega$, $V_q$ is called the domain of $q$; and $f_q$ is an information function $f_q: U \to V_q$.
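To make the components listed above concrete, the following is a minimal sketch of an information system together with the indiscernibility classes and the lower and upper approximations of a target set. The toy table, the student ids, and the function names are hypothetical and serve only as an illustration; they are not taken from the paper's data or from its application.

```python
from collections import defaultdict

def indiscernibility_classes(table, attrs):
    """Group object ids that have identical values on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def approximations(table, attrs, target):
    """Return the (lower, upper) approximation of target with respect to attrs."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attrs):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper

# Hypothetical toy information system: objects are students,
# condition attributes are fMV and tMV, the decision attribute is style.
table = {
    "s1": {"fMV": "High", "tMV": "High", "style": "V"},
    "s2": {"fMV": "High", "tMV": "High", "style": "A"},
    "s3": {"fMV": "Low", "tMV": "High", "style": "V"},
    "s4": {"fMV": "Low", "tMV": "Low", "style": "A"},
}
visual = {obj for obj, row in table.items() if row["style"] == "V"}
lower, upper = approximations(table, ["fMV", "tMV"], visual)
```

In this toy table, s1 and s2 are indiscernible on the condition attributes but differ in their decision, so s1 belongs only to the upper approximation of the Visual set, while s3 belongs to the lower approximation.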

C. Rough Set and its Application
Rough set theory has been implemented in several studies. Bello and Verdegay discussed the place of rough sets in soft computing, combining rough sets with soft computing methods such as fuzzy logic, artificial neural networks, and metaheuristics [21]. This hybridization succeeded in improving system performance. An application of rough sets to identify the behavior patterns of bank customers achieved a 90% accuracy level, based on decision rules generated to predict the deposit nature of customers [18]. Rough sets have also proved an effective tool for classifying 26 large-scale construction projects in Iran and five other countries; the classification was used to address the requirements and specifications of the construction projects [22]. Korvin et al. proposed rough set theory to improve website performance by developing a specific preloading strategy tuned to the needs of a web server, an approach motivated by the uncertainty of internet users' behavior [23]. In another application, rough sets were used to discover useful hidden patterns in fabric data in order to reduce the number of defective goods and increase overall quality, which is expected to improve manufacturing quality control and reduce productivity loss [24].

III. RESEARCH METHODOLOGY
This research was carried out in several steps. The first step is data collection, followed by categorizing the conditional attributes, running the rough set algorithm, identifying eliminated data, identifying eliminated conditional attributes, generating rules, and evaluating the model.

A. Data Collection
In this research, learning style data were obtained from an e-learning server log covering 60 students who were taking the IT Project Management subject. The students came from two classes. Although the data set has only 60 records, the data were collected repeatedly, every two weeks, following the topics: the students were observed for ten weeks across five topics. Every topic provided four specific learning materials associated with the learning styles. When a student visits a specific learning material, a counter records the duration of the visit (tMV, tMA, tMR, tMK). The visit is also counted in the frequency of visits to that learning material (fMV, fMA, fMR, fMK). If the student accesses the learning material again at a different time, the duration and the frequency of visits are accumulated. The data are recorded in the e-learning server log.
The conditional attributes describe student learning behavior while interacting with the e-learning system: the frequency of visits and the duration of visits to specific learning materials associated with a learning style. These conditional attributes are shown in Table I. The decision attribute is the learning style based on VARK (Visual, Auditory, Read, Kinesthetic), introduced by Fleming [25]. This research used learning materials related to specific learning styles, as shown in Table II.
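The accumulation of visit frequency and duration described above can be sketched as follows. The event format (student id, material code, minutes) and the function name are assumptions made for illustration; the paper does not describe the actual layout of the server log.

```python
from collections import defaultdict

MATERIALS = ("MV", "MA", "MR", "MK")  # material codes taken from the attribute names

def accumulate_visits(events):
    """Sum visit counts (f*) and visit durations in minutes (t*) per student.

    events: iterable of (student_id, material, minutes) tuples (assumed format).
    """
    totals = defaultdict(
        lambda: {f"{prefix}{m}": 0 for prefix in ("f", "t") for m in MATERIALS}
    )
    for student, material, minutes in events:
        totals[student][f"f{material}"] += 1        # one more visit to this material
        totals[student][f"t{material}"] += minutes  # add the visit duration
    return dict(totals)

# Hypothetical log entries for one student
events = [("s1", "MV", 10), ("s1", "MV", 5), ("s1", "MR", 3)]
record = accumulate_visits(events)["s1"]
```

Each student ends up with one record of eight conditional attributes, which matches the eight attribute names given in the text.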

B. Categorizing Conditional Attributes
The conditional attributes are quantitative. Visit frequency is measured as a count of visits, while duration is measured in minutes. As required by rough set theory, the conditional attribute data must be converted to categorical data. In this research, the data are converted using the categorizing criteria listed in Table III.

C. Rough Set Algorithm
This research implemented the rough set approach in two main phases: elimination of unclassified data and reduction of conditional attributes. The data elimination follows several steps [17].

IV. RESULTS AND DISCUSSION
Following the steps above, several results were obtained in identifying learning styles using rough set theory. Categorizing the conditional attributes differently gives different results in terms of the number of eliminated data, the rules generated for prediction, and the accuracy level, which makes the comparison worth discussing. The data were processed with a self-developed application. The application can convert data from quantitative to categorical, eliminate data based on the rough set, generate the rules, and evaluate the model through its accuracy level. The following sections present complete results for two categories. Results for the other categorizations are given in the result summary because the processes are similar.

A. Conversion Data to Categorical Data
The original learning style data consist of 60 records and eight attributes. A sample of the data is shown in Table IV. From these data, several statistics were calculated to determine the thresholds for categorizing the conditional attributes: mean, standard deviation (SD), minimum value, maximum value, and quartiles (Q1-Q3). The values of these statistics for the conditional attributes are shown in Table V and Table VI. Table IV is converted into two categories according to Table III and the statistics in Table V and Table VI. The conversion process is shown in Fig. 1, which gives the formula for categorizing the conditional attributes into two categories: High and Low.
Categorizing the conditional attributes into two categories involves the mean, minimum, and maximum value of each conditional attribute. The conversion result using two categories is shown in Fig. 2. The original data, with their varying means and standard deviations, were converted into two simple categorical values, High and Low. It can be inferred that the rules generated as the basis of model development will be simple. However, this can be problematic because each of the two categories covers a wide range of values.
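A two-category conversion of this kind might look like the sketch below. The exact formula of Fig. 1 and the criteria of Table III are not reproduced in the text, so the rule used here (values from the minimum up to the mean are Low, values from the mean up to the maximum are High) is an assumption, as are the function name and the sample values.

```python
from statistics import mean

def categorize_two(values):
    """Map each numeric value to "High" or "Low" using the attribute mean.

    Assumed rule: value >= mean is High, otherwise Low.
    """
    threshold = mean(values)
    return ["High" if v >= threshold else "Low" for v in values]

# Hypothetical durations in minutes for one attribute (for example tMV)
tMV = [12, 3, 25, 8]
labels = categorize_two(tMV)
```

With more than two categories, the same idea would need additional thresholds, for instance the quartiles mentioned above, but that choice is likewise not specified in the text.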

B. Eliminated Data
As mentioned in the previous section, data elimination is an important process in rough set theory [26]. After elimination, the data set is smaller than the original but still gives the same results. Elimination consists of two steps: data reduction and attribute reduction. These processes remove redundant data and eliminate unclassified data. This matters because many data sets are quite large, but not all of their records can be used in forming decision-making models. In particular, two or more records may have the same values for their conditional attributes but inconsistent decision attributes. Consequently, research on attribute reduction is an interesting and promising field. In this research, the reduction processes followed the algorithms described in the previous section.
Based on the algorithm, code screenshots for the elimination of data and attributes are shown in Fig. 3 and Fig. 4. The data elimination algorithm removes data identified as unclassified, as explained in the previous section. The conditional attribute elimination, meanwhile, removes attributes that do not affect the decision attributes. Both algorithms were applied with each of the categorizations proposed in this research.
With two categories, the elimination result is shown in Fig. 5. In this elimination process, 42 records were eliminated: of the 60 original records, only 18 were retained for building the prediction model. This shows that with two categories, many data are eliminated.
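One possible reading of the data elimination step is sketched below: records that agree on all conditional attributes but disagree on the decision are treated as unclassified and dropped, and exact duplicates are collapsed to one representative. This is an interpretation under stated assumptions, not the code of Fig. 3; the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def eliminate_data(rows, condition_attrs, decision_attr):
    """Drop inconsistent (unclassified) records and collapse redundant ones."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in condition_attrs)].append(row)
    kept = []
    for members in groups.values():
        decisions = {r[decision_attr] for r in members}
        if len(decisions) == 1:      # consistent group: keep one representative
            kept.append(members[0])
        # groups with conflicting decisions are eliminated entirely
    return kept

# Hypothetical categorized records
rows = [
    {"fMV": "High", "tMV": "High", "style": "V"},
    {"fMV": "High", "tMV": "High", "style": "V"},  # redundant duplicate
    {"fMV": "Low", "tMV": "High", "style": "A"},
    {"fMV": "Low", "tMV": "High", "style": "R"},   # conflicts with the row above
    {"fMV": "Low", "tMV": "Low", "style": "K"},
]
kept = eliminate_data(rows, ["fMV", "tMV"], "style")
```

Coarser categorizations put more records into the same group, so under this reading they would be expected to eliminate more data.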
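Attribute elimination can be illustrated with a simple greedy check: an attribute is dropped if the remaining attributes still determine the decision without conflict. This is a common textbook heuristic given here as a sketch; it is not necessarily the algorithm shown in Fig. 4, it assumes the input records are already consistent, and all names and sample rows are hypothetical.

```python
def is_consistent(rows, attrs, decision):
    """True if equal values on attrs always imply an equal decision."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True

def reduce_attributes(rows, attrs, decision):
    """Greedily remove attributes whose removal keeps the table consistent."""
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if trial and is_consistent(rows, trial, decision):
            kept = trial
    return kept

# Hypothetical records in which the decision depends only on fMV
rows = [
    {"fMV": "High", "tMV": "High", "style": "V"},
    {"fMV": "High", "tMV": "Low", "style": "V"},
    {"fMV": "Low", "tMV": "High", "style": "A"},
    {"fMV": "Low", "tMV": "Low", "style": "A"},
]
reduced = reduce_attributes(rows, ["fMV", "tMV"], "style")
```

The result of a greedy pass depends on the order in which attributes are tried, so it yields one possible reduct rather than a guaranteed minimal one.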

C. Generated Rules
The previous section showed that elimination with two categories left 18 of the 60 original records. From these data, 13 rules were generated as the basis for prediction or classification. From a computational point of view, these results indicate a simple model that is light to compute. However, the level of accuracy must also be considered as a basis for further evaluation. The rules generated using two categories are shown in Fig. 6.
The generated rules also show which attributes are used in prediction after the previous processes. They also indicate how complex the model is, which affects the computational process. A small number of simple rules with high accuracy can serve as an important reference in selecting an optimal identification model.
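A rule base of this kind can be sketched as a mapping from condition values to a learning style. The sketch assumes that each distinct combination of retained attribute values yields one IF-THEN rule; the paper's application may derive its rules differently, and the function names and sample records are hypothetical.

```python
def generate_rules(rows, attrs, decision):
    """Build one rule per distinct combination of condition values."""
    rules = {}
    for row in rows:
        rules.setdefault(tuple(row[a] for a in attrs), row[decision])
    return rules

def predict(rules, attrs, record, default=None):
    """Return the decision of the matching rule, or default if none matches."""
    return rules.get(tuple(record[a] for a in attrs), default)

# Hypothetical retained records after elimination
retained = [
    {"fMV": "High", "tMV": "High", "style": "V"},
    {"fMV": "Low", "tMV": "High", "style": "R"},
    {"fMV": "Low", "tMV": "Low", "style": "K"},
]
attrs = ["fMV", "tMV"]
rules = generate_rules(retained, attrs, "style")
guess = predict(rules, attrs, {"fMV": "Low", "tMV": "High"})
```

A record whose condition values match no rule returns the default, which is one reason a very small rule base can cost accuracy.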

D. Evaluation
The final stage of the learning style identification process is to evaluate the resulting accuracy. The criteria stated previously were the amount of data eliminated and the number of rules generated in building the model. In this study, the resulting accuracy level was 96.67%, as shown in Fig. 7. This accuracy is quite high, especially considering that the amount of retained data and the number of rules are quite small.
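The accuracy level can be computed as the share of records whose predicted style matches the recorded style. The sketch below is illustrative only: the text does not state which records were used for evaluation, so the example assumes all 60 records, in which case 58 correct predictions would give the reported figure. The labels are hypothetical placeholders.

```python
def accuracy(predicted, actual):
    """Percentage of positions where the predicted label equals the actual label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical evaluation over 60 records with 58 correct predictions (assumption)
actual = ["V"] * 60
predicted = ["V"] * 58 + ["A"] * 2
score = accuracy(predicted, actual)
print(f"{score:.2f}%")  # 96.67%
```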

E. Result Summary
The results of categorizing the attributes into two categories were presented in the previous sections. The results of learning style identification with three, four, and five categories are presented in Table VII. With three categories, 41 data are reduced. This is almost the same as with two categories, but with a higher number of generated rules, namely 16. After evaluation, the three-category model gives an accuracy of 93.33%. With four categories, no data are reduced, 41 rules are generated, and the accuracy is 100%. The large number of rules thus yields maximum accuracy, but processing that many rules raises computational issues, and the reduction process eliminates no data with this categorization. With five categories, the accuracy is again the maximum of 100%, with 36 generated rules and eight reduced data. These results are simpler than those of four categories in terms of both rules and reduced data. The results for these categorizations require further analysis as input for decision making.
As a comparison, the proposed method was also applied to a flow experience data set covering 92 students. The results, which are similar, can be seen in Table VIII. They show that the highest number of generated rules tends to give the highest accuracy. Yet a smaller number of generated rules, as with five categories, still gives high accuracy. The balance between the criteria can serve as a basis for choosing the most suitable categorization for the model.
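One way to make the balance between criteria explicit is a Pareto (non-dominance) comparison of the four categorizations, using the figures quoted above for the learning style data. This technique is not used in the paper and is offered only as an illustration. It assumes that more eliminated data, fewer rules, and higher accuracy are each preferable; the function names are hypothetical.

```python
# Figures for the learning style data as quoted in the text (Table VII and Section IV)
results = {
    2: {"eliminated": 42, "rules": 13, "accuracy": 96.67},
    3: {"eliminated": 41, "rules": 16, "accuracy": 93.33},
    4: {"eliminated": 0, "rules": 41, "accuracy": 100.0},
    5: {"eliminated": 8, "rules": 36, "accuracy": 100.0},
}

def dominates(a, b):
    """True if a is at least as good as b on every criterion and better on one."""
    at_least = (a["eliminated"] >= b["eliminated"]
                and a["rules"] <= b["rules"]
                and a["accuracy"] >= b["accuracy"])
    strictly = (a["eliminated"] > b["eliminated"]
                or a["rules"] < b["rules"]
                or a["accuracy"] > b["accuracy"])
    return at_least and strictly

def non_dominated(results):
    """Return the categorizations that no other categorization dominates."""
    return sorted(
        k for k, v in results.items()
        if not any(dominates(o, v) for j, o in results.items() if j != k)
    )

candidates = non_dominated(results)
```

Under these assumptions, the two-category and five-category results remain as candidates: the former favors simplicity and the latter favors accuracy, so the final choice still depends on how the criteria are weighted.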

V. CONCLUSION
This research has presented the categorization of attributes for identifying learning styles. Applying rough set theory with various categorizations of learning style data gives high accuracy. The results provide alternatives for the decision-making process, each with a different number of rules and eliminated data. Some categorizations give the highest number of eliminated data with the lowest number of generated rules, but their accuracy is lower than the others. Other categorizations yield the highest accuracy, but they eliminate the fewest data and generate the most rules. A decision can be made by balancing three criteria: number of generated rules, number of eliminated data, and accuracy level. Each criterion has consequences, especially for the computational process. In this research, the attributes were categorized using basic descriptive statistics under a normal distribution assumption, which may be a limitation. Future work can investigate the distribution of the data before the categorizing process.