R-Diffset vs. IR-Diffset: Comparison Analysis in Dense and Sparse Data

Julaily Aida Jusoh; Sharifah Zulaikha Tengku Hassan; Wan Aezwani Wan Abu Bakar; Syarilla Iryani Ahmad Saany; Mohd Khalid Awang; Norlina Udin @ Kamaruddin

doi:10.14569/IJACSA.2023.0140241



International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 2, 2023.


Abstract: Association Rule Mining (ARM) is a promising approach for mining concealed information from databases. Successfully extracting this information can support many areas, including problem solving, economic forecasting, commercialization policies, and medical inspections, among other problems. ARM is a leading method for mining interesting related patterns from any collection of data. The important patterns found are categorized as frequent or infrequent. Most earlier data mining methods concentrated on horizontal data formats; recent studies, however, have turned their attention to vertical data formats. One algorithm that uses the vertical data format is Rare Equivalence Class Transformation (R-Eclat). Because of its efficiency, R-Eclat has been commonly applied to the processing of large datasets. The R-Eclat algorithm has four variants, but this work focuses only on the R-Diffset variant and Incremental R-Diffset (IR-Diffset). The performance of the R-Diffset and IR-Diffset algorithms in mining sparse and dense data is compared. The processing time of R-Diffset is very long, especially for sequential processing, and IR-Diffset was developed to address this problem. While R-Diffset can mine infrequent itemsets only in sequential form, IR-Diffset can process sequential data that have been partitioned. These advantages make IR-Diffset a potential candidate for time-efficient data mining, especially for large datasets.

Keywords: R-Diffset; IR-Diffset; dense data; sparse data; comparison analysis
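
To illustrate the vertical data format and the diffset idea mentioned in the abstract, the following is a minimal sketch in Python. It assumes the standard diffset identity from the Eclat family, support(PX) = support(P) - |diffset(PX)|, and it only enumerates item pairs. The function names, the toy transactions, and the `max_support` threshold are hypothetical choices for illustration; this is not the authors' R-Diffset or IR-Diffset implementation.

```python
from itertools import combinations


def to_vertical(transactions):
    """Convert horizontal data (list of item sets) to vertical format:
    each item maps to the set of transaction ids (tidset) containing it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def diffset(tidset_prefix, tidset_item):
    """Transaction ids that contain the prefix but not the added item."""
    return tidset_prefix - tidset_item


def infrequent_pairs(transactions, max_support):
    """Return item pairs whose support is positive but at most max_support.
    The threshold semantics are an assumption made for this sketch."""
    vertical = to_vertical(transactions)
    result = {}
    for a, b in combinations(sorted(vertical), 2):
        d = diffset(vertical[a], vertical[b])
        # support(ab) = support(a) - |diffset(ab)|
        support = len(vertical[a]) - len(d)
        if 0 < support <= max_support:
            result[(a, b)] = support
    return result


# Hypothetical toy data: four transactions over items a, b, c, d.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
]

rare = infrequent_pairs(transactions, max_support=1)
print(rare)
```

Storing only the difference between tidsets tends to keep the intermediate sets small when most transactions share the same items, which is one reason diffset-based methods are usually discussed in the context of dense data.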

Julaily Aida Jusoh, Sharifah Zulaikha Tengku Hassan, Wan Aezwani Wan Abu Bakar, Syarilla Iryani Ahmad Saany, Mohd Khalid Awang and Norlina Udin @ Kamaruddin, “R-Diffset vs. IR-Diffset: Comparison Analysis in Dense and Sparse Data,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(2), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0140241

@article{Jusoh2023,
title = {R-Diffset vs. IR-Diffset: Comparison Analysis in Dense and Sparse Data},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0140241},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0140241},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {2},
author = {Julaily Aida Jusoh and Sharifah Zulaikha Tengku Hassan and Wan Aezwani Wan Abu Bakar and Syarilla Iryani Ahmad Saany and Mohd Khalid Awang and Norlina Udin @ Kamaruddin}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.
