Book Recommendation for Library Automation Use in School Libraries by Multi Features of Support Vector Machine

This paper proposes book recommendation algorithms for open source library automation software using the support vector machine (SVM) machine learning method. The algorithms use multiple features: (1) similarity measures for book titles, (2) the Dewey Decimal Classification (DDC) for systematic arrangement combined with association rule mining, and (3) similarity measures for the bibliographic information of books. For evaluation, we used both qualitative and quantitative data. For the qualitative data, sixty-four students of Banpasao Chiang Mai school completed a satisfaction questionnaire and an interview. For the quantitative data, we used web monitoring and precision measures of system usage. The results show that students rated the books recommended by our algorithms as “Very interested” and “Interested” in 14.5% and 22.5% of cases, respectively, and that usage of the OPAC system improved, with a highest average of 52 uses per day. The system is therefore suitable for Thai-language library automation and for small libraries with limited book resources.
Keywords: Library automation; book recommendation system; library integrated system; title similarity; support vector machine; open source


I. INTRODUCTION
In Thailand, libraries are the main source of knowledge in educational institutions. They provide resources such as books, journals, research papers, and interactive media to support learning for students and the public. Thailand has five types of library: school libraries, college and university libraries, public libraries, special libraries, and the National Library. All of them require many librarians to manage resources, for example adding new members, entering book data, cataloguing books, supporting clients, and organizing all relevant information about books. To support librarians, library automation software, also called an integrated library system [2] [3], applies information and communications technologies (ICT) to support and manage the manual processing tasks in a library, reducing librarians' workload and making library work as effective as possible.
In recent years, artificial intelligence has become a part of everyday life, changing how people acquire knowledge about technology and the world. In the e-commerce sector, recommendation techniques are widely used in information agents that attempt to predict which items in a data collection a user may be interested in and to suggest those items. The idea is to detect information obtained from the user's interactions or needs according to collaborative behavior [1]. Recommendation algorithms filter desirable items based on the user's past behavior, which is usually recorded and computed from previously purchased or selected items and the ratings given to those items.
One of the most important tasks of a librarian is to recommend interesting books to users [5], because users are often unfamiliar with the library and lack knowledge of how to access its information. In library automation, the data from users' activities are limited, because the information has different characteristics from that of book-selling e-commerce sites such as Amazon, Lazada or Shopee. There are two main reasons why the techniques for library systems differ from those of other recommendation systems. First, users of a library system perform different actions: they only search the OPAC (online public access catalog) [4] and leave no reviews or scores for books. Second, the only record of user activity is the book loan, which alone does not provide enough data to build a recommender system model. Therefore, given the limited data available for book recommendation, in this research we propose a combined method for library recommendation systems.
To implement the recommendation system in library automation, it was necessary to use the public user interface of the library's main portal for search and retrieval of information in the OPAC module. Our scope is open source library automation for a small library [6], which holds no more than 2000 resources. In Thailand, small libraries are mainly school libraries, of which there are more than 46000 [7] that could benefit from our work. In addition, open source library automation software such as Koha, Evergreen or OpenBiblio OBEC and WALAI AutoLib is widely used in Thailand because it can be used without cost. In the field of recommendation systems for library automation, there has been little research. For example, [8] used collaborative filtering based on library loan records to make recommendations, and [10] likewise used a single source of information to model user behavior from large-scale book-loan logs of a university library. In another approach, [9] combined multi-source data from book titles and loan logs and obtained more accurate results for personalized recommendation.
190 | P a g e www.ijacsa.thesai.org
In our study, we combine multiple sources of information: titles, the Dewey Decimal System [11] for book classification, and book-loan logs. The combined data were used in a support-vector machine [12] model to calculate the similarity between entries in the book database. In our results, three features were compared: loan logs, bibliographic similarity, and the combination with title. The data in this study came from a small library in a private school, which holds 9654 bibliographic records of book material. We therefore propose a book recommendation system based on combining multiple sources of data: title similarity and bibliographic data (title, publisher and author). The rest of this paper is structured as follows. Sections 1 and 2 present the introduction and related work, with some background on the principles of recommendation systems in library automation. Section 3 describes the proposed approach, covering the algorithms and an overview of our open source library automation. Section 4 summarizes the experimental results of our algorithms. Finally, Section 5 presents the conclusions of this work and future work.

II. RELATED WORK
Research on book recommendation systems has grown rapidly as web technology has been adopted to modernize digital libraries. However, library automation offers limited user data for training modern machine learning algorithms, and its usage behavior differs from that of other recommendation systems. Thus, only a few studies have focused on implementing algorithms based on library loan records, similarity of book titles, or association rule methods linked with book categories. Many book recommendation systems have been developed. LIBRA [26] used book information from Amazon book data and applied text categorization to semi-structured data. K3Rec [27] was developed for K-3 readers; its idea was to use similar book content, book catalog suitability and other features to recommend books for children. Nova [28] combined collaborative filtering and content-based filtering, a so-called hybrid method, to find personalized books. Other researchers focused on book loans: [25] [28] used the FUCL mining technique, based on association rule mining linked with book loan information. The authors in [8][9] [30] used book loan records from 39,442 users to combine book loan information, association rules on book categories, and matches/mismatches on the Nippon Decimal Classification. [31] applied the information learned by an author-identification model, a convolutional neural network, to book similarity.

III. PROPOSED METHODOLOGY FOR RECOMMENDATION SYSTEM
The purpose of this study is to develop an open source library automation implementation with a recommendation system that suggests personalized book favorites, in order to increase user satisfaction and reduce librarians' workload. In this study, however, we focus only on the recommendation algorithms for library automation. Fig. 1 gives a brief overview of our open source library automation, named Angkaew autolib [13]. The name Angkaew refers to a famous lake situated at the foot of Suthep mountain in Chiang Mai University.
Angkaew autolib was developed on the core architecture of OpenBiblio, an open source automated library system framework (http://obiblio.sourceforge.net), which consists of the basic modules for library automation such as OPAC, circulation, cataloging, and staff administration functionality. Angkaew autolib adds support for the Thai language, user interface elements for librarians in Thailand, and compatibility with modern programming through support for PHP 7.0.

A. The Character of School Library Users
The system is aimed at school libraries and small libraries. As mentioned earlier, recommendation in library automation has different characteristics because users mainly have academic purposes. For academic purposes, users tend to search for material related to the courses they are actually taking in class. For example, users borrow a practice English book because practice English is a compulsory course in the classroom. The recommender should not present all of the English readings in the library; instead, it should recommend English books related to the compulsory courses, weighted by popularity, to reflect what users have in common. Another difference is that, unlike commercial book selling, library automation has no review data and no top-selling ranking in the book-loan records. Even so, the book-loan records indicate the fields a user is interested in, and this information can be used to map to book categories.

B. The Feature Selection of the Algorithms
Unlike a traditional collaborative filtering [14][29] recommendation system, library automation has no scoring, item popularity, or user rating data from which to calculate preference for related books. Thus, we propose a method based on book loans and multiple sources of information, used as features for training an SVM classifier, as follows.

1) Similarity measures for book title:
The similarity of book titles is used to determine how close two titles are, which is an important feature for classifying related books. A large number of methods have been proposed to measure title similarity [15]. However, the structure of a title varies: a title can be a phrase, a word or a sentence of varying length. Most importantly, most of these methods may not be suitable for the Thai language, because Thai has special characters with upper and lower vowels [16]. Thai is written from left to right without spaces between words. Each character has only one form, with no uppercase or lowercase, and vowels are combined with a main consonant, as shown in Fig. 2 and 3.
Based on how characters are represented in Thai, we observed the information searching problems of library automation at the school of Banpasao Chiang Mai and asked the librarian about them. We found two main book searching problems related to upper and lower vowels. First, users often misspell the upper or lower vowels in the search keyword, for example อยากให้เรื่องนี้ไม่มีโชคร้าย (correct spelling) vs อยากให้เรืองนีไมมีโชคร้าย (three upper vowels missing). In this case, most title similarity measures treat an upper vowel as a full character when computing distance [17]. This may not be accurate, because users often do not regard the upper vowel as important, while in the computation it makes the word a different one. Second, librarians sometimes catalogued titles with spelling errors, such as wrong upper and lower vowels or transposition errors. We found 43 misspelled titles among the 2251 book titles in the database records of the school of Banpasao Chiang Mai. Thus, a title may contain spelling errors from cataloguing, and a query may contain user input errors.
Several title similarity methods were tested on our database; the comparison of the measures is shown in Table I. To find books whose titles are similar to those in the database, we modified the Damerau-Levenshtein algorithm, which is itself based on Levenshtein's algorithm. We added an exception to the deletion operation: a character that is an upper or lower vowel in the Thai Unicode block is exempt from the deletion cost, which makes the measure more robust to common misspellings and better suited to user behavior in a library automation system. In the distance definition, $1_{(a_i \neq b_j)}$ is the indicator function, equal to 0 when $a_i = b_j$ and equal to 1 otherwise.
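The modification described above can be illustrated with a minimal sketch. It is not the code of Angkaew autolib: it assumes the restricted (optimal string alignment) variant of Damerau-Levenshtein, assumes that inserting as well as deleting a mark is free, and assumes that the exempt characters are the Thai marks written above or below a consonant (U+0E31, U+0E34 to U+0E3A, U+0E47 to U+0E4E), which includes tone marks. The names `THAI_ABOVE_BELOW`, `indel_cost` and `thai_title_distance` are hypothetical.

```python
# Assumption: the exempt characters are the Thai marks written above or
# below a consonant (vowel signs and tone marks) in the Thai Unicode block.
THAI_ABOVE_BELOW = (
    {"\u0e31"}
    | {chr(c) for c in range(0x0E34, 0x0E3B)}
    | {chr(c) for c in range(0x0E47, 0x0E4F)}
)


def indel_cost(ch):
    """Cost of inserting or deleting one character (0 for exempt marks)."""
    return 0 if ch in THAI_ABOVE_BELOW else 1


def thai_title_distance(a, b):
    """Restricted Damerau-Levenshtein distance with free Thai mark edits."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + indel_cost(a[i - 1])
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + indel_cost(b[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            # Indicator term: 0 when the characters match, 1 otherwise.
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + indel_cost(a[i - 1]),   # deletion
                d[i][j - 1] + indel_cost(b[j - 1]),   # insertion
                d[i - 1][j - 1] + substitution,       # match or substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

Under these assumptions, a keyword typed without its upper marks is at distance 0 from the catalogued title, while ordinary consonant errors and transpositions still cost 1 each.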
2) The DDC for systematic arrangement combined with association rule mining:
As mentioned earlier, users search for books related to what they are actually learning in class or to their courses. Based on this knowledge, the recommendation system should suggest books from the same catalogue category. For that reason, we consider the catalogue category of a book to be an important feature for training the SVM. The DDC (Dewey Decimal Classification) number is the standard book classification system used in Thai libraries, in both public and private schools. It is used to organize and provide access to books and other material collections in the library. A DDC number has a hierarchy of three digit levels representing subject fields, such as 400 for "Language" and 410 for "Linguistics". The first digit is the main field of the subject and the second digit is the sub-field. In our case, we use only the first digit for training the SVM, because the main subject field alone gives flexibility in recommending books. For example, when a user reads "Improving your speaking English", the system may suggest books from the related language field, such as linguistics books or books on other languages. The summaries of the DDC classification are shown in Table II. We count the number of loaned books in each DDC class in the database and combine this with association rule mining.
Our approach differs from traditional association rule mining [8]. In association rule mining, when a user borrows n books x_i (i = 1, ..., n) at one time, the set {x_1, ..., x_n} is called a "transaction". For instance, when a user borrows three books, A, B, and C, at one time, and A, B, and C share the same first DDC digit, the transaction is represented as {A, B, C}. From such transactions we can derive rules such as "a user who borrows book A also borrows book B" or "a user who borrows books B and C at one time also borrows book A". Unlike the traditional method, we add the constraint that the books in a transaction must share the same first DDC digit, because users of a library recommendation system are more likely to borrow within the same subject category.
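As a small illustration of using only the first DDC digit, the sketch below maps a call number to its main class. The class names follow the standard DDC main-class summaries and may be worded differently in Table II; the function name `ddc_main_class` and the input format (a string starting with three digits) are assumptions made for this example.

```python
# Standard DDC main classes, keyed by the first digit of the class number.
DDC_MAIN_CLASSES = {
    "0": "Computer science, information and general works",
    "1": "Philosophy and psychology",
    "2": "Religion",
    "3": "Social sciences",
    "4": "Language",
    "5": "Science",
    "6": "Technology",
    "7": "Arts and recreation",
    "8": "Literature",
    "9": "History and geography",
}


def ddc_main_class(call_number):
    """Return the first DDC digit of a call number such as '428.24'."""
    text = call_number.strip()
    if len(text) < 3 or not text[:3].isdigit():
        raise ValueError("expected a DDC number starting with three digits")
    return text[0]
```

For example, a book classed at 428.24 and a book classed at 410 both map to main class "4" (Language), so they would be treated as belonging to the same subject field.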
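The constrained transactions can be sketched as follows. This is an illustrative simplification, not the system's implementation: it assumes that a transaction whose books span more than one first DDC digit is dropped entirely, it mines only pairwise rules, and it uses rule confidence with a hypothetical threshold `min_confidence`. The function names are also hypothetical.

```python
from collections import Counter
from itertools import combinations


def same_class_transactions(loans):
    """Keep only loans whose books all share the same first DDC digit.

    loans: list of transactions, each a list of (book_id, ddc_number) pairs.
    Assumption: mixed-class transactions are discarded, not split.
    """
    kept = []
    for transaction in loans:
        classes = {ddc[0] for _, ddc in transaction}
        if len(transaction) >= 2 and len(classes) == 1:
            kept.append(frozenset(book for book, _ in transaction))
    return kept


def pair_rules(transactions, min_confidence=0.5):
    """Return {(x, y): confidence} for rules 'borrows x -> also borrows y'."""
    item_count = Counter()
    pair_count = Counter()
    for transaction in transactions:
        item_count.update(transaction)
        pair_count.update(combinations(sorted(transaction), 2))
    rules = {}
    for (a, b), together in pair_count.items():
        for x, y in ((a, b), (b, a)):
            confidence = together / item_count[x]
            if confidence >= min_confidence:
                rules[(x, y)] = confidence
    return rules
```

With loans {A, B, C} and {A, B} in class 4 and a mixed loan {A, D}, the mixed loan is ignored and the rule "A -> B" receives confidence 1.0, so D is never recommended from A.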

3) Similarity measures for bibliographic information of book:
Bibliographic information is also useful when users want to find similar books, such as books by the same author, books with a nearby year of publication, or the books most viewed by other users. Each bibliographic record has characteristics from which such information can be extracted. We extract the following characteristics of a bibliographic record for book recommendation.
a) Date published of book: Users tend to look for books whose publication date is close to that of books they are interested in and have borrowed. The system calculates the absolute difference between the publication date of the book the user selected and the publication date of each book record. The smaller the value, the better the book is as a recommendation.
b) Book based category or Author: Based on our investigation of the loan data, users tend to borrow books of the same category or author as books they borrowed before. Using this knowledge, we encode the comparison as a logical value, where 0 corresponds to a match with the same category or author and 1 corresponds to no match.
c) Number of views for the book: This feature is calculated from the total number of users who searched for the book and clicked its details on the OPAC in the library automation system.
In order to combine features (a), (b) and (c), we formulated an equation.
Topic of book (book category). Author. .
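As a hedged illustration of features (a) to (c), the sketch below computes the raw values for one candidate book relative to the book a user selected. It does not reproduce the combining equation; it only returns the individual values. The `Book` record, its field names, and the choice to keep category and author as two separate 0/1 values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical bibliographic record used only for this sketch."""
    title: str
    author: str
    category: str   # e.g. first DDC digit
    year: int       # year of publication
    views: int      # number of detail views on the OPAC


def bibliographic_features(selected, candidate):
    """Return [year gap, category mismatch, author mismatch, view count]."""
    year_gap = abs(selected.year - candidate.year)            # (a) smaller is better
    category_mismatch = 0 if selected.category == candidate.category else 1  # (b)
    author_mismatch = 0 if selected.author == candidate.author else 1        # (b)
    return [year_gap, category_mismatch, author_mismatch, candidate.views]   # (c)
```

A vector of this kind, together with the title distance and the DDC-based rule information, could serve as classifier input, although the exact encoding used in the system is not specified here.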

C. Book Recommendation System of Library Automation by SVM
Based on the literature review, [18] [19] used SVM as an effective machine learning method for book recommender systems. However, the quality of the recommendations depends on the sources of information used as input to the SVM learning process. For each source of information described above, we trained the SVM with optimal parameters for classifying books for an effective recommender system. We used the following three features:
• Similarity Measures for Book Title.
• The DDC for systematic arrangement combination of Association Rule Mining.
• Similarity Measures for Bibliographic Information of book.
We implemented the recommender system module with LibSVM [20][21] version 3.24 in our open source automated library system. LibSVM supports support vector classification with "C-Support Vector Classification". The system displays the four books with the highest scores, combined from the three features mentioned above, and arranges the recommended books in the OPAC system. As mentioned before, we developed an open source library automation system named Angkaew with the recommendation feature implemented as in figure x; it can be downloaded from the Department of Library and Information Science, Faculty of Humanities, Chiang Mai University (http://lis.human.cmu.ac.th/). The book recommendation system is placed below the OPAC search feature. In the example, using the keyword "English languages in grade 5", the algorithm arranges the four highest-scoring books from highest to lowest, as shown in Fig. 4.
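The final ranking step can be sketched in a few lines. The sketch assumes that each candidate book already has a numeric score from the trained classifier; how the system derives that score from the LibSVM output is not specified here, and the function name `top_recommendations` is hypothetical.

```python
import heapq


def top_recommendations(scored_books, k=4):
    """Return the k highest-scoring (book_id, score) pairs, best first.

    scored_books: iterable of (book_id, score) pairs, where the score is
    assumed to come from the trained classifier.
    """
    return heapq.nlargest(k, scored_books, key=lambda pair: pair[1])
```

With `k=4` this mirrors the four recommendations shown below the OPAC search results, ordered from the highest score to the lowest.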

IV. EXPERIMENT RESULT
To evaluate the effectiveness of the book recommendation, two types of evaluation were employed in this research: qualitative and quantitative.

A. Participants
Our participants were thirty lower-secondary and thirty-two upper-secondary students at the school of Banpasao Chiang Mai, who have similar backgrounds and experience in using the OPAC service to search for books in the library. The participants were registered in the system and had loan records in the library automation system at least once per month.

B. Qualitative Data
To evaluate the book recommendation qualitatively, we asked the participants to describe their satisfaction in terms of their level of interest in each recommended book while using the OPAC with the recommendation system, on the following five-point scale of interest, which is similar to that used by [8] [22]:
2: Very interested
1: Interested
0: Not interested
x: Book is not related in my mind
A: Already borrowed or have read before
Each participant searched for books on the OPAC ten times; the results are shown in Table III.
Based on the satisfaction questionnaire, the books recommended by our system were rated "Very interested" and "Interested" in 14.5% and 22.5% of cases, respectively. To understand this better, we also interviewed the students, who reported that the language, science and technology categories gave the best book recommendations. However, the negative results, "Not interested" and "Book is not related in my mind", accounted for 31.6% and 19.1% respectively. For this result, the students reported that even when a book was in a category they preferred, its title was not related to their interests. Finally, "Already borrowed or have read before" accounted for 12.0%. When we investigated this result, the students reported that they had read the book before using the system, but there was no case of "already borrowed" in the library automation system itself.

C. Quantitative Data Analysis
Data analysis of OPAC usage: We also monitored user behavior based on OPAC usage activity [23] in the library. The circulation records and the library autolib log files were collected between July 2019 and December 2019 with the support of the administrator of the school of Banpasao Chiang Mai. At the end of each month, the data were recorded from the Angkaew autolib server. The aim of the log files was to analyze library collection usage for the recommendation system in the OPAC module.
Log files were collected into a database in order to verify the effectiveness of our recommendation algorithms. The activity database provides information on each user's latest activity in the OPAC module, such as searching the OPAC with a keyword or clicking on a book recommendation. However, our database cannot determine whether the user paid attention to or was impressed by a book recommendation. We compared two groups of usage. The group "search keyword on OPAC with click recommender" consists of users who searched the OPAC module with a keyword and clicked on the book recommendation system, counted once per keyword used. The group "search keyword on OPAC without click recommender" consists of users who searched the OPAC module with a keyword without any click on the recommendation system, again counted once per keyword used. According to our statistics, the frequency of OPAC use varies by month. Usage was highest in August (52 per day) and November (56 per day), owing to the midterm and final exams. The statistics are presented as activity per month in Fig. 5.
1) Experiment and data analysis with precision: To measure the accuracy of our algorithms, we employed the precision evaluation metric [24], which has been used for book recommendation [10]. Precision is defined in formula 1. The recommended books predicted in this experiment refer to the book categories based on the Dewey Decimal Classification. We evaluate the accuracy of the algorithms by the average precision over users, broken down by Dewey Decimal Classification category.

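$$\text{Precision} = \frac{\text{number of recommended books that are relevant}}{\text{total number of recommended books}} \quad (1)$$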
We collected the experimental data with the user monitoring system from a total of 431 users of Angkaew Autolib from July 2019 to December 2019. Our algorithms compute recommendations based on the DDC category used for the systematic arrangement of books. The algorithm computes and recommends only books in the same DDC category as the book the user viewed, because library users tend to borrow books of the same category, unlike customers of commercial book stores. We therefore measured the precision of each DDC category. The results are listed in Table IV, where p denotes the precision of our algorithms; the average precision is 0.164. The three DDC categories with the highest precision are Language (0.31), Science (0.27) and Technology (0.25).
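The per-category evaluation can be sketched as below. The sketch uses the standard definition of precision (relevant recommended items divided by all recommended items) and assumes that "relevant" means the books a user went on to view or borrow; the record layout and the function names are hypothetical, and the numbers in any usage are illustrative, not the values of Table IV.

```python
def precision(recommended, relevant):
    """Fraction of recommended books that are relevant (0.0 if none shown)."""
    if not recommended:
        return 0.0
    hits = sum(1 for book in recommended if book in relevant)
    return hits / len(recommended)


def mean_precision_by_category(records):
    """Average precision per DDC category.

    records: list of (ddc_category, recommended_list, relevant_set) tuples,
    one per user session (an assumed layout for this sketch).
    """
    by_category = {}
    for category, recommended, relevant in records:
        by_category.setdefault(category, []).append(
            precision(recommended, set(relevant))
        )
    return {cat: sum(vals) / len(vals) for cat, vals in by_category.items()}
```

Averaging the per-category values in the returned dictionary would give an overall figure comparable in form to the reported average precision.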

V. CONCLUSIONS AND DISCUSSION
We proposed a library automation module with a book recommendation system based on a support vector machine, implemented with three features: similarity measures for book titles, the DDC for systematic arrangement of materials, and similarity measures for the bibliographic information of books. Based on the experimental results, our algorithm improved users' usage of library automation according to both qualitative and quantitative data. For the qualitative data, based on the satisfaction questionnaire, users rated 14.5% of the recommendations as "Very interested" while using the OPAC module. In addition, the recorded user behavior showed that 1644 (21.27%) of 7729 OPAC uses included a click to view a book recommendation. We designed the algorithms around student usage behavior and adapted them to the OPAC module in library automation. Our algorithm is suitable for small libraries, such as school libraries, small organizations and specialized libraries, with fewer than 10000 books and limited loan record data.
The algorithm was specifically designed around user behavior in library automation, using a support vector machine with specific feature selection: a modified similarity measure for Thai-language book titles and multiple items of bibliographic information. However, the algorithm is limited to small book datasets because of the computation time needed to process the whole set of bibliographic records each time. Our future work will focus on improving the title similarity algorithm, specifically for the Thai language and across all book categories, by applying natural language processing to understand the nature of each book title in order to provide personalized book recommendations.