Research on Students' Course Selection Preference based on Collaborative Filtering Algorithm

Abstract: Because of the COVID-19 pandemic, education is no longer limited to offline settings, and online classroom education is widely used. The rapid development of online education provides users with richer course resources and more flexible learning methods, and online education platforms are constantly improving their service models to give users a better learning experience. At present, however, few personalized recommendation services exist for student course selection: all students receive the same course selection information, which is not tailored to their specific preferences. This paper integrates collaborative filtering into a college course selection system. A rating matrix is constructed from students' ratings of the courses they have taken, using the correlation between courses and the correlation between students. Based on the collaborative filtering algorithm, a predicted rating matrix is generated to produce a recommendation list of suitable courses for each student. The experimental results show that, compared with the traditional collaborative filtering technique, the improved collaborative filtering algorithm based on both item and user weighting achieves higher recommendation accuracy. Applied to the course selection recommendation system of colleges and universities, the improved technique recommends courses with good rationality and accuracy, which makes course selection more intelligent and is of practical significance.


I. INTRODUCTION
With the rapid development of the Internet industry, information technology is increasingly applied to the management of academic affairs, and the online course selection system has become a significant part of that management. How to quickly find courses of interest among a large amount of elective course information has become a research hotspot for course selection systems. Most course selection systems rely on a built-in search engine to query course information and enroll in courses [1]. When facing a large number of available courses, students do not know enough about the courses to make an informed choice, which leads to a waste of course resources. Currently, most course selection systems have no recommendation function, or only low-quality personalized recommendation, so they do not recommend courses that students may be interested in. Building a recommendation system on feedback such as students' major information and students' ratings of courses can solve these problems [2].

A. The Statement of the Research Problem
Elective courses are courses that college students can choose to study independently according to their preferences, and through them students can expand their knowledge. At present, the course selection system of most colleges and universities simply lists all the elective courses offered in the semester, like commodities, for students to choose from [3]. The relevance and orientation between courses are poorly presented. When college students choose courses, three problems arise:
1) Blindness: Students do not understand the relevance of courses or the direction of their majors, so course selection is arbitrary.
2) Lack of purpose and perfunctory selection: Students choose courses that are easy to pass only to complete their credits, regardless of whether the courses help their curriculum system. As a result, enthusiasm for learning is low and the learning effect is poor, so the expected effect of elective courses is not achieved (R. N. Behera and S. Dash, 2016).
3) Instability and potential risks of the course selection system: The traditional course selection mode imposes strong time constraints and does not take algorithmic fairness into account, which often causes access peaks and poses hidden dangers to the secure operation of the back-end system.
Therefore, personalized recommendation technology is applied to the course selection system to provide students with personalized elective course recommendations according to their needs and interests, prevent blind course selection, and improve both the utilization of university elective course resources and the operating efficiency of the course selection system.

B. Research Objectives
1) To analyze and compare several recommendation algorithms, identify their advantages and shortcomings, and determine the research idea of applying collaborative recommendation technology to the course selection system.
2) To understand the courses that students have taken and their evaluations, analyze students' preferences, and push courses that target students may be willing to take.
3) To apply collaborative filtering technology to the field of students' course selection, and combine it with the basic course selection system to realize an efficient and convenient personalized course selection recommendation system.

C. Research Question
The recommendation algorithm based on collaborative filtering is based on the similarity measure of individual behavioral characteristics. By calculating the similarity between the specified new sample (students) and the original sample in the database, the new sample is clustered, and the individuals with similar behavioral characteristics to the new sample are identified as the nearest neighbor samples (nearest students). After that, the selection set of the nearest neighbor sample is generated, and the selection set is sorted according to preference scores.
Ultimately, course selection recommendations are made to students based on the similarity of the new sample to the nearest neighbor students and the course selection preferences of the nearest neighbor students. There are two important issues that need to be addressed.

1) First, how to calculate the similarity between students from behavioral data, such that the similarity reflects students' interests, learning characteristics, and strengths.
2) Second, after identifying the near-neighbor students of a new sample, how to determine the set of recommended courses based on the near-neighbor students' course selection records.

D. Rationale of the Study
In teaching management practice, in order to follow the reform of higher education and meet society's demand for comprehensive and practical talent, the course selection system of China's colleges and universities has moved from the academic year system to the credit system. Schools now offer a large number of elective courses, and students have more freedom to choose courses according to their interests. As society diversifies, people are concerned with an ever wider range of fields and students' interests are diverging, so colleges and universities have opened corresponding courses in response. In recent years, however, the number of elective courses has grown so large that students face information overload when choosing courses, and structural deficiencies in course classification and specialization make it difficult for them to choose courses that suit their personal development. This shows that students' course selection process has research value and that course selection behavior follows patterns. Moreover, the traditional course selection mode imposes strong time constraints and does not take algorithmic fairness into account, which often causes access peaks and poses hidden risks to the secure operation of the back-end system. Therefore, personalized recommendation technology is applied to the course selection system to provide students with personalized elective course recommendations according to their needs and interests, prevent blind course selection, and improve both the utilization of university elective course resources and the operating efficiency of the course selection system [4].
The "department store" approach, which tries to improve the quality of teaching resources simply by listing them and letting students "pick and choose", is no longer in line with the current requirements for "personalized learning".
Personalized learning requires adding "shoppers" to a "department store" stocked with a wide range of elective courses, so that learners get the right course in a timely and accurate manner that they themselves recognize. This "shopper" is the role that learning paths play in the learning process: it aims to guide learners more precisely, reduce blind choices, and improve the efficiency of course selection [5].

E. Research Gap
At the same time, with the digital reform of higher education, some scholars have started to mine students' one-card data to analyze student behavior. The students' one-card accumulates a large amount of spending data and daily behavior data.
Web mining and bibliography mining are the theoretical basis of data mining technology in library user behavior analysis. Based on research into the composition and acquisition of library patron behavior, library user behavior models are constructed using machine learning and other algorithms, and through these models the interest preferences of reader groups can be understood.
However, there are still many shortcomings in the above-mentioned research on course selection recommendation systems. Recommendation systems based on students' course selection data use very limited course selection data of varying quality, which makes it difficult to mine students' preferences accurately. They also focus too much on the algorithm level: they try to copy the success of recommendation systems in e-commerce and entertainment to the education field and use various methods to improve algorithmic accuracy, while ignoring the characteristics of the education field and the limitations of the scoring matrix itself. The result is no qualitative improvement in accuracy and hardly satisfactory recommendations.
This paper proposes a collaborative filtering algorithm-based course selection recommendation system, which no longer pursues excessive algorithmic complexity, avoids the limitations of the scoring matrix itself, and realizes personalized course recommendation.

II. LITERATURE REVIEW
Collaborative filtering is a push technology that is often used to implement the basic recommendation function of a system. It divides users into different sets according to their tendencies and pushes items to target users, using the items closest to the users' preferences, the items' comment information, and similar signals as the basis for judgment.

A. Theoretical Background
Collaborative filtering mines, from a large amount of data, a small number of students whose course preferences are similar to those of a specified student, and designates these similar students as near-neighbor students. The course preferences of the near-neighbor students are then organized into a catalog sorted by preference, and courses are finally recommended to the specified student based on the course preferences of these similar students.

1) Similarity measure:
The similarity measure between samples is the basis of cluster analysis. In classification, a sample is usually regarded as a vector in an n-dimensional Euclidean space, so the similarity between two vectors can be measured from two perspectives: the distance between the vectors and the angle between the vectors. In particular, since the Euclidean distance in high-dimensional space still satisfies the triangle inequality, it is the most common method to measure vector distance in high-dimensional space [6].

Throughout this section, suppose the samples x and y are two points in an n-dimensional (n ≥ 1) space:

$$x = (x_1, x_2, \ldots, x_n) \tag{1}$$

$$y = (y_1, y_2, \ldots, y_n) \tag{2}$$

a) Euclidean distance metric: In high-dimensional space, the Euclidean distance is the measure of distance between points that is closest to the intuitive meaning of distance in three-dimensional space. By introducing the Euclidean distance, the vector space acquires the concepts of length and angle. With x and y given by equations (1) and (2), the Euclidean distance is calculated as follows.

$$d_{E}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{3}$$

b) Manhattan distance metric: The Manhattan distance is also called the city block distance. With x and y given by equations (1) and (2), the Manhattan distance is calculated as follows.

$$d_{M}(x, y) = \sum_{i=1}^{n} |x_i - y_i| \tag{4}$$

c) Chebyshev distance metric: The Chebyshev distance is another important measure of the distance between points in vector space. It takes the maximum of the distances along the component dimensions. With x and y given by equations (1) and (2), the Chebyshev distance is calculated as follows.

$$d_{C}(x, y) = \max_{1 \le i \le n} |x_i - y_i| \tag{5}$$

d) Minkowski distance metric: The Minkowski distance, named after the mathematician H. Minkowski, generalizes the Manhattan and Euclidean distances. With x and y given by equations (1) and (2), the Minkowski distance of order p (p ≥ 1) is calculated as follows.

$$d_{p}(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{p} \right)^{1/p} \tag{6}$$

Setting p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance. As the formula shows, the Minkowski distance treats all components as obeying the same distribution and disregards differences in the magnitude (scale) of the components.

e) Standardized Euclidean distance metric: Like the Minkowski distance, the simple Euclidean distance treats all components as obeying the same distribution and ignores differences in their means and variances. Standardizing the components first and then calculating the Euclidean distance yields an improved, standardized Euclidean distance. With x and y given by equations (1) and (2), and with s_i denoting the standard deviation of the i-th component, the standardized Euclidean distance is calculated as follows.

$$d_{SE}(x, y) = \sqrt{\sum_{i=1}^{n} \left( \frac{x_i - y_i}{s_i} \right)^2} \tag{7}$$

f) Angle cosine metric: The angle cosine measures the similarity of sample points from the point of view of direction and is widely used in many fields. With x and y given by equations (1) and (2), the cosine of the angle between the vectors is calculated as follows.

$$\cos\theta = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \tag{8}$$

From the above equation, the absolute value of the cosine of the vector angle is less than or equal to 1. Its magnitude reflects the similarity of the two vectors: the larger the value, the higher the similarity [7]. Moreover, a positive value indicates that the two vectors point in similar directions and are positively related, whereas a negative value indicates that they are negatively related [8]. Considering these similarity measures and the research needs of this paper, the similarity of students is calculated using the measure based on the cosine of the vector angle.
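To make the measures in this subsection concrete, the following is a minimal Python sketch of the six metrics. The function names, the sample vectors, and the per-component standard deviations `s` are illustrative assumptions and do not come from the paper's implementation.

```python
import math

def euclidean(x, y):
    # Square root of the sum of squared component differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute component differences (city block distance).
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    # Largest absolute component difference.
    return max(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, p):
    # Order-p generalization: p=1 is Manhattan, p=2 is Euclidean.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def standardized_euclidean(x, y, s):
    # s[i] is the (assumed known, nonzero) standard deviation of component i.
    return math.sqrt(sum(((a - b) / sd) ** 2 for a, b, sd in zip(x, y, s)))

def cosine(x, y):
    # Cosine of the angle between x and y; 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

# Illustrative vectors (not data from the study).
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]
print(euclidean(x, y), manhattan(x, y), chebyshev(x, y), cosine(x, y))
```

Only the cosine is a similarity (larger means more alike); the other five are distances (smaller means more alike), which matters when ranking neighbors.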
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 13, No. 5, 2022, www.ijacsa.thesai.org
2) Student-based collaborative filtering: In the recommendation system, a sample of 343 senior students and their course selection records from the previous year at Liaoning National Normal College in China was selected as the basis for this study, and the similarity between classmates was calculated using the angle cosine similarity measure. For the purpose of analysis, the collaborative filtering process is illustrated here with an example of seven students [9].
Suppose there are seven students, A, B, C, D, E, F, and G, who choose the courses they want to take among five courses, a, b, c, d, and e. Their course selections are shown in Fig. 1.
Based on the above course selections and the angle cosine similarity measure, the similarity between students was calculated [10]. For example, the similarity between A and B is obtained by applying the angle cosine formula to their course selection records, and the similarities between A and each of C, D, E, F, and G are obtained in the same way.
The above algorithm needs to calculate the similarity between the specified sample (student) and every other sample. To improve its computational efficiency, the following improvement scheme is proposed [11].
Step 1: Create a course-to-student reverse lookup (inverted) table.
Step 2: Build the student co-occurrence matrix from the inverted table.
Based on which students selected each course in the sample, an inverted table is created as shown in Fig. 2. From the inverted table in Fig. 2, the co-occurrence matrix of the students was generated, and the results are shown in Table I.
The co-occurrence table was then normalized, and the results are shown in Table II.
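The two steps above can be sketched in Python as follows. The course selections are hypothetical stand-ins, since the data of Fig. 1 is not reproduced here, and the normalization divides each co-occurrence count by the geometric mean of the two students' selection counts, which is an assumption that makes the result equal to the angle cosine of binary selection vectors.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical selections for students A..G over courses a..e.
selections = {
    "A": {"a", "b", "d"},
    "B": {"a", "c"},
    "C": {"b", "e"},
    "D": {"c", "d", "e"},
    "E": {"a", "d"},
    "F": {"b", "c", "e"},
    "G": {"d"},
}

def direct_similarity(s, t):
    # Angle cosine of two binary selection vectors, written with sets.
    if not s or not t:
        return 0.0
    return len(s & t) / math.sqrt(len(s) * len(t))

def build_inverted_table(selections):
    # Step 1: map each course to the set of students who selected it.
    inverted = defaultdict(set)
    for student, courses in selections.items():
        for course in courses:
            inverted[course].add(student)
    return inverted

def build_cooccurrence(inverted):
    # Step 2: count shared courses, touching only pairs that share a course.
    co = defaultdict(lambda: defaultdict(int))
    for students in inverted.values():
        for u, v in combinations(sorted(students), 2):
            co[u][v] += 1
            co[v][u] += 1
    return co

def normalize(co, selections):
    # Assumed normalization: count / sqrt(|courses of u| * |courses of v|).
    return {
        u: {v: c / math.sqrt(len(selections[u]) * len(selections[v]))
            for v, c in row.items()}
        for u, row in co.items()
    }

inverted = build_inverted_table(selections)
co = build_cooccurrence(inverted)
sim = normalize(co, selections)
print(sim["A"])
```

The inverted table avoids comparing student pairs with no course in common, which is where the efficiency gain comes from when the selection data is sparse.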
3) Course-based collaborative filtering: When designing a recommendation system based on collaborative filtering, it is necessary not only to find the near-neighbor students of a given student from behavioral data, but also to evaluate the similarity of courses. The algorithm uses students' ratings of courses to evaluate the similarity between courses [12]. To demonstrate the algorithm flow, the analysis is again based on the course selection records of seven students, A, B, C, D, E, F, and G, over five courses, a, b, c, d, and e. Their course selections are shown in Fig. 3. The co-occurrence matrix of the courses is then normalized, and the results are shown in Table III. After obtaining the students' ratings of the courses and the similarity between the courses, the preference of student u for the alternative course o is estimated [13].
The formula is as follows.

$$p_{uo} = \sum_{i \in C(o, N)} w_{oi}\, r_{ui} \tag{14}$$

In Equation (14), $p_{uo}$ is the degree of preference of student u for course o. Calculating the degree of preference of a given student for all alternative courses is the basis for constructing the student course selection recommendation system [14].
In addition, $C(o, N)$ denotes the set of courses similar to the alternative course o, $w_{oi}$ represents the similarity between the alternative course o and course i, and $r_{ui}$ is the degree of preference of student u for course i calculated based on the nearest-neighbor students [15].
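The preference estimate described above, a similarity-weighted sum over the courses most similar to the alternative course o, can be sketched as below. The function name, the dictionary layout, and the sample numbers are illustrative assumptions, and N is taken here to be the number of similar courses kept.

```python
def predict_preference(user_prefs, course_sim, target, n):
    """Estimate a student's preference for `target`.

    user_prefs: {course: preference of the student for that course}
    course_sim: {course: {other_course: similarity}}
    n: how many of the most similar courses to keep (assumed meaning of N)
    """
    sims = course_sim.get(target, {})
    # The n courses most similar to the target course.
    neighbors = sorted(sims, key=sims.get, reverse=True)[:n]
    # Sum similarity * preference over neighbors the student has a value for.
    return sum(sims[i] * user_prefs[i] for i in neighbors if i in user_prefs)

# Hypothetical similarities and preferences.
course_sim = {"o": {"a": 0.8, "b": 0.5, "c": 0.1}}
user_prefs = {"a": 4.0, "c": 5.0}
print(predict_preference(user_prefs, course_sim, "o", n=2))
```

Repeating this for every course the student has not taken and sorting by the estimate yields the candidate recommendation list.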

B. Overview of Recommendation Systems
Currently, general education platforms use a course recommendation method based on data statistics: they rank the platform's courses by the number of course selections and recommend the courses that most users are interested in, i.e., a popular course list. Among current recommendation systems, collaborative filtering-based recommendation algorithms are the most widely used [16]. The recommendation methods each have their own advantages and disadvantages. To better understand their interrelationship and respective strengths, the two types of algorithms are compared in Table IV.
Two types of collaborative filtering methods are commonly used.

1) The user-based collaborative filtering (User-based CF) algorithm judges how much a user likes an item from the user's historical behavior, calculates the relationship between users according to their preferences for the same items, and recommends items among users with the same preferences [17].
2) The item-based collaborative filtering (Item-based CF) algorithm focuses on items, whose similarity is mainly based on their inherent feature values. Items can therefore be classified according to their feature values, the proximity between them calculated, and suggestions produced. Since the classification of items is more stable, it can be computed offline [18].
The advantages, disadvantages, and scope of application of the user-based and item-based methods are summarized in Table V.

TABLE IV. Columns: Recommended method, Advantages, Disadvantages. First row: Collaborative filtering recommendations.

C. Limitations of Previous Studies
The current course selection recommendation system in universities is separate from the online course system. The vast majority of scholars have done extensive and complex research on personalized social network recommendation, e-commerce product recommendation, user emergence mining models, personalized video recommendation, learning resources, student user preferences, user profiles, deep recommendation algorithm models, and related topics.
However, very little work addresses students' real course selection recommendations, because each school has a different course selection platform. Most schools are concerned with completing the task of course selection easily and quickly. They guarantee the stable operation of the platform by drawing lots at random, or by opening the platform centrally so that students compete for courses within a fixed period. These blunt approaches do not consider how to satisfy students' diverse needs.
Algorithms are at the core of personalized recommendation and collaborative filtering techniques. However, how to develop effective and accurate evaluation criteria for the recommendation results of algorithms is an issue that deserves constant attention both for academia and industry. Different evaluation criteria have different focuses, and a single evaluation criterion generally only evaluates a certain aspect of the algorithm, which is more or less deficient. Therefore, how to choose appropriate evaluation metrics to evaluate the recommendation results has a crucial impact on the development of the whole personalized service field [19].

III. METHODOLOGY
The course selection function is combined with collaborative filtering recommendation technology to realize intelligent course selection. Students' interest trends are analyzed from their information in the system, such as courses taken, grades, ratings, and comments, and each student is given a list of recommended courses. Focusing on the pushing process, the user-based and item-based methods are applied to course selection, with students and courses as the main objects of study. Since the traditional algorithms have shortcomings, this paper applies item-based weighted and user-based weighted collaborative filtering algorithms to the course selection push system to improve accuracy. The design of the push system considers the major attributes of students, the relevant attributes of courses, and the ratio of students to courses, to ensure that the system pushes the set of courses that students are really interested in [20].
A. Data Collection
1) Selection of data set: In this paper, we use the data set of the academic system of Liaoning National Normal University to implement the traditional pushing process. Part of the information is extracted as the initial data, yielding evaluation information from 238 students. The records were tabulated into students, courses, and ratings tables. Each data file contains the following details.
• Ratings: Stu-number, Course ID, Rating, and Times.
The data set is the basis for implementing the course selection push function. Based on students' course selections and course ratings over a period of time, the students' level of interest in the various elective courses is analyzed and expressed as a rank. In this paper, a two-dimensional matrix represents students' interest in courses, i.e., a v × w student-course preference table, where v is the number of students and w is the number of selected courses. Each entry of the table represents one student's interest in one course. This matrix is illustrated in Table VI. A course record has the form Course (course-name, course-id, course-time, score, grade, course-type), for example course (Java programming core technology, F1025, 36 hours, 90, 19computer, elective).
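The student-course matrix described above can be assembled from the Ratings records roughly as follows. This is a minimal sketch: the record layout (Stu-number, Course ID, Rating), the student numbers, the course ID F1026, and the use of 0.0 for "not rated" are assumptions for illustration, and the Times field is omitted.

```python
def build_rating_matrix(records):
    """Build a v x w matrix from (student_number, course_id, rating) records.

    Rows follow the sorted student numbers and columns the sorted course IDs;
    0.0 marks a course the student has not rated (an assumed convention).
    """
    students = sorted({s for s, _, _ in records})
    courses = sorted({c for _, c, _ in records})
    row = {s: i for i, s in enumerate(students)}
    col = {c: j for j, c in enumerate(courses)}
    matrix = [[0.0] * len(courses) for _ in students]
    for s, c, r in records:
        matrix[row[s]][col[c]] = float(r)
    return students, courses, matrix

# Hypothetical rating records.
records = [
    ("S001", "F1025", 4),
    ("S001", "F1026", 5),
    ("S002", "F1025", 3),
]
students, courses, matrix = build_rating_matrix(records)
print(students, courses, matrix)
```

A dense list of lists is adequate for a few hundred students and a few dozen courses; a sparse representation would be preferable for much larger data.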

C. Research Procedures
The basic framework of intelligent course selection push system is shown in Fig. 4.

1) Basic idea: Consider both course weighting and student weighting. Item-based weighting is applied in the cosine similarity between users, and user-based weighting is applied in the collaborative filtering prediction, in order to predict ratings and improve the accuracy of recommendation [21].
2) Description of the algorithm process:
Input: target student T, course evaluation table R, course characteristics table A, number of neighbors k.
Output: Top-N recommendation set of target student T.
Step 1: From the course rating table R, find the set of all students in the system, the set of all courses, and the set of courses rated by the target student T, denoted U_m, I_n, and I_T, respectively [22].
Step 2: Compute the weighted cosine similarity between T and the other students, taking the influence of the different course weights into account, and select the top k students with the highest similarity values as the nearest-neighbor set of T, N_T = {j_1, j_2, ..., j_k}.
Step 3: For any course i not yet rated by the target student T, use a weighted approach to combine the rating predicted from student influence with the rating predicted from the student's historical evaluations.
Step 4: Select the N courses with the highest predicted scores as the push result set for target student T.
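The four steps can be sketched in Python as below. This is only one plausible reading of the procedure, not the paper's implementation: the form of the weighted cosine, the course weights `course_w`, the mixing factor `alpha`, and the use of the target student's mean rating as the "historical evaluation" term are all assumptions introduced for illustration.

```python
import math

def weighted_cosine(ru, rv, course_w):
    # Cosine over co-rated courses, each course scaled by its (assumed) weight.
    common = set(ru) & set(rv)
    if not common:
        return 0.0
    dot = sum(course_w.get(i, 1.0) * ru[i] * rv[i] for i in common)
    nu = math.sqrt(sum(course_w.get(i, 1.0) * ru[i] ** 2 for i in common))
    nv = math.sqrt(sum(course_w.get(i, 1.0) * rv[i] ** 2 for i in common))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, ratings, course_w, k, n, alpha=0.7):
    rt = ratings[target]                      # Step 1: courses rated by T
    # Step 2: k nearest neighbors by weighted cosine similarity.
    sims = {j: weighted_cosine(rt, rj, course_w)
            for j, rj in ratings.items() if j != target}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    # Step 3: blend the neighbor-based prediction with T's own mean rating.
    history = sum(rt.values()) / len(rt) if rt else 0.0
    candidates = {i for j in neighbors for i in ratings[j]} - set(rt)
    scores = {}
    for i in candidates:
        num = sum(sims[j] * ratings[j][i] for j in neighbors if i in ratings[j])
        den = sum(sims[j] for j in neighbors if i in ratings[j])
        neighbor_pred = num / den if den else history
        scores[i] = alpha * neighbor_pred + (1 - alpha) * history
    # Step 4: Top-N courses by predicted score (ties broken by course ID).
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

# Hypothetical ratings and course weights.
ratings = {
    "T": {"a": 5, "b": 3},
    "J1": {"a": 5, "b": 3, "c": 4, "d": 2},
    "J2": {"a": 1, "b": 5, "c": 1, "e": 5},
}
course_w = {"a": 2.0, "b": 1.0}
print(recommend("T", ratings, course_w, k=2, n=2))
```

Raising `alpha` leans the prediction toward the neighbors' opinions, while lowering it leans toward the student's own rating history.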

IV. ANALYSIS AND RESULTS
This section implements the system functions based on the description and design of the course selection system, presents the effect of the system's recommendation function, and evaluates the recommendation function of the main program against the evaluation index.

A. Experimental Information
The test environment of the computer used in the experiments is shown in Table VII.

B. Data Conversion
The item-based weighting and user-based weighting recommendation methods are adopted. The item-based weighting affects the proximity value and nearest-neighbor selection, and the user-based weighting affects the rating prediction. The methods are applied to the university educational administration course selection push system in order to push elective courses to students intelligently. First, the original data is obtained from the database and sorted: useful data is retained and irrelevant records are deleted, which improves the efficiency and accuracy of the algorithm. The data involved in the algorithm studied in this paper comes from the educational administration system of Liaoning National Normal University. The data is processed separately to meet the requirements of the algorithm [23].
From the database, 7 data tables related to students' previous course selections and course evaluation data are selected. There are 8 tables, from which students' attributes, course selection attributes, and students' teaching evaluation information can be obtained. From the interview table collected from students, students' interest in courses in certain directions can be obtained intuitively [24]. The student number, name, course name, major, score, evaluation, and other records useful for the algorithm are extracted. After the records are re-integrated, the correlations between tables are established and the data is written back into the database, as shown in Fig. 6.

C. Accuracy Comparison Results
Experiments show that the improved algorithm improves push accuracy. The experimental data comes from the educational administration system of Liaoning National Normal University, together with information collected by questionnaire survey. The test set takes three tenths of the data and the training set takes seven tenths; the training set is randomly divided into five parts, denoted data set 1, data set 2, data set 3, data set 4, and data set 5. The accuracy evaluation flow of the push algorithm is shown in Fig. 7.
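The data partition described above (three tenths for testing, seven tenths for training, with the training portion divided into five parts) could be sketched as follows. The function name, the fixed seed, and the round-robin assignment of the shuffled training records to the five parts are assumptions for illustration.

```python
import random

def split_dataset(records, test_ratio=0.3, parts=5, seed=0):
    """Shuffle, hold out `test_ratio` of the records, split the rest into `parts`."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Deal the shuffled training records round-robin into equal-sized parts.
    subsets = [train[i::parts] for i in range(parts)]
    return train, test, subsets

records = list(range(100))             # stand-in for rating records
train, test, subsets = split_dataset(records)
print(len(train), len(test), [len(s) for s in subsets])
```

Because the records are shuffled before splitting, each of the five data sets is a random sample of the training data rather than a contiguous block.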

V. DISCUSSION
In order to test the prediction accuracy of the recommendation algorithm, the course data set is used for offline calculation: the student behavior model is established on the training set to predict student behavior on the test set, and the rating prediction accuracy is measured by the root mean square error (RMSE). The detailed experiments are as follows:
• Experimental design: different amounts of data are used to test the influence on algorithm convergence. The course data set, which includes 343 students and 40 course scores, is divided into two parts: 80% training data and 20% test data.
• Experimental data: balancing prediction accuracy and efficiency, the experiment carries out three iterative tests on 10K, 100K, and 1M data, with K values including 5 and 10. The RMSE data is shown in Table VIII.
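For reference, the RMSE used as the accuracy measure here is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch, with illustrative inputs that are not experimental data from the paper:

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Illustrative predictions against held-out ratings.
print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 3.0]))
```

A lower RMSE means the predicted ratings lie closer to the ratings in the test set.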

A. Experimental Analysis
The experimental analysis yields the following results. As the number of iterations increases, the RMSE decreases, by about 0.04 with each doubling. This indicates that increasing the amount of data does not significantly degrade the performance of the recommendation algorithm and in fact improves its convergence [25]. The analysis also shows that, for the same amount of data, the choice of K affects the prediction accuracy: the smaller the K value, the higher the accuracy, and the larger the K value, the lower the accuracy. Therefore, K = 5 is selected. The collaborative recommendation algorithm combines the characteristics of the User-CF and Item-CF algorithms and filters students' information by tags at the initial stage of recommendation, which reduces the search scope to a certain extent. When recommending, the initial results of the User-CF recommendation are filtered, and the final recommendation is then made according to the score, which improves the recommendation effect, as shown in Figs. 8 and 9.
After many times of verification and improvement, on the whole, the system meets the design requirements. The accuracy of the core algorithm in the recommendation function is evaluated. The experimental results show that the algorithm can produce.

VI. CONCLUSION
In this paper, the collaborative filtering algorithm is used to study students' course selection preferences based on their course selection history, and a university course selection recommendation system is built on this algorithm. The experimental results show that the collaborative filtering algorithm is able to mine and extract the attributes, behavioral characteristics, and preferences of students. This helps solve the problem of randomness and blindness in students' course selection and improves the management efficiency of university education and teaching.
In particular, this paper shows that the collaborative filtering algorithm-based student course selection recommendation system can better balance the relationship between student characteristics and course characteristics, and can help achieve the optimal matching between student learning ability and course requirements, student course selection preference and course characteristics, and student ability development and career requirements.

A. Innovation Points
Starting from the shortcomings of the existing course selection system and the urgent need for an intelligent one, we selected and analyzed the existing course selection system of Liaoning National Normal University. We then collected students' course selection data from the university's academic affairs system, pre-processed the data, and experimentally compared the user-weighted collaborative filtering algorithm with the traditional collaborative filtering algorithm. The improved approach, which considers the weights of both students and courses, was verified to improve recommendation accuracy within a certain range and to achieve better push results.

B. Future Work
When looking for nearest neighbors, future work can consider fusing local nearest neighbors with global nearest neighbors: global nearest neighbors may not share a student's local interests, and local nearest neighbors may not share the student's global interests. Optimizing the similarity algorithm in this way can improve recommendation accuracy, make fuller use of the data, and reduce the errors caused by sparse data. The following problem needs further study to supplement and improve this paper: how to make the intelligent course selection recommendation system more accurate and closer to real time, while taking the time complexity and space complexity of the algorithm into account to improve efficiency.