International Journal of Advanced Computer Science and Applications (IJACSA), Volume 12 Issue 8, 2021.
Abstract: A digital library is a type of information retrieval (IR) system. Existing IR methods often handle keyword searches poorly: some search engines cannot return results for partial matches or for keywords that contain typographical errors. A system is therefore needed that returns results relevant to the keywords the user provides. We propose a model that addresses this problem by combining spelling correction and query expansion. Searching begins with indexing: the titles of all incoming documents are preprocessed, and Term Frequency-Inverse Document Frequency (TF-IDF) weights are computed for all terms across the documents. During the search process, the Levenshtein Distance algorithm corrects keywords that appear to contain typos. Before the relevance between the keywords and the documents is calculated using Cosine Similarity, the keywords are expanded using Query Expansion to increase the number of documents retrieved. The Cosine Similarity results are then added to the Query Expansion weight calculation to produce the final ranking. The results show an improvement over an IR system without spell checking and query expansion. The web-based application built for the study was tested 50 times on a data set of 2,045 items. The system was able to correct typo-indicated keywords and retrieve documents with an average recall of 95.91%, an average precision of 63.82%, and an average Non-Interpolated Average Precision (NIAP) of 86.29%.
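The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the `max_distance` threshold, and the `synonyms` dictionary used for query expansion are all assumptions made for illustration. The sketch also ranks by cosine similarity alone, because the abstract does not specify how the Query Expansion weight is combined with that score.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def correct(term, vocabulary, max_distance=2):
    """Return the nearest indexed term, or the term unchanged if none is close.

    max_distance is a hypothetical threshold, not a value from the paper.
    """
    if term in vocabulary:
        return term
    best = min(sorted(vocabulary), key=lambda v: levenshtein(term, v))
    return best if levenshtein(term, best) <= max_distance else term


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized title."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    # Smoothed IDF variant (an assumption; the paper may use another form).
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, titles, synonyms=None):
    """Correct the query, expand it, and rank titles by cosine similarity."""
    docs = [title.lower().split() for title in titles]
    vectors, idf = tfidf_vectors(docs)
    vocabulary = set(idf)
    terms = [correct(t, vocabulary) for t in query.lower().split()]
    # Query expansion from a hypothetical synonym table.
    expanded = list(terms)
    for t in terms:
        expanded.extend(s for s in (synonyms or {}).get(t, []) if s in vocabulary)
    qvec = {t: c * idf.get(t, 0.0) for t, c in Counter(expanded).items()}
    scored = [(cosine(qvec, v), title) for v, title in zip(vectors, titles)]
    return sorted((s for s in scored if s[0] > 0), reverse=True)


titles = [
    "Digital Library Search",
    "Image Compression Methods",
    "Library Information Retrieval",
]
results = search("digital libary", titles)
```

Here the misspelled keyword "libary" is corrected to the indexed term "library" before scoring, so the title that contains both query terms ranks above the one that contains only "library", and the unrelated title is dropped.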
Dewi Soyusiawaty and Denny Hilmawan Rahmatullah Wolley, “Hybrid Spelling Correction and Query Expansion for Relevance Document Searching,” International Journal of Advanced Computer Science and Applications (IJACSA), 12(8), 2021. http://dx.doi.org/10.14569/IJACSA.2021.0120838
@article{Soyusiawaty2021,
title = {Hybrid Spelling Correction and Query Expansion for Relevance Document Searching},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2021.0120838},
url = {http://dx.doi.org/10.14569/IJACSA.2021.0120838},
year = {2021},
publisher = {The Science and Information Organization},
volume = {12},
number = {8},
author = {Dewi Soyusiawaty and Denny Hilmawan Rahmatullah Wolley}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.