The Science and Information (SAI) Organization
DOI: 10.14569/IJACSA.2013.040820

Investigate the Performance of Document Clustering Approach Based on Association Rules Mining

Author 1: Noha Negm
Author 2: Mohamed Amin
Author 3: Passent Elkafrawy
Author 4: Abdel Badeeh M. Salem

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 4 Issue 8, 2013.


Abstract: The limitations of standard clustering methods and the weaknesses of the Apriori algorithm in frequent termset clustering motivate our research. Based on Association Rules Mining, an efficient approach for Web Document Clustering (ARWDC) has been devised. An efficient Multi-Tire Hashing Frequent Termsets algorithm (MTHFT) is used to improve the efficiency of mining association rules by targeting the mining of frequent termsets. The documents are then initially partitioned based on the association rules. Since a document usually contains more than one frequent termset, the same document may appear in multiple initial partitions, i.e., the initial partitions overlap. After the partitions are made disjoint, the documents within each partition are grouped using descriptive keywords to obtain the resulting clusters. In this paper, we present an extensive analysis of the ARWDC approach on Reuters datasets of different sizes. Furthermore, the performance of our approach is evaluated using Precision, Recall and F-measure and compared with existing clustering algorithms such as Bisecting K-means and FIHC. The experimental results show that the efficiency, scalability and accuracy of the ARWDC approach are significantly improved on the Reuters datasets.
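The partitioning and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how overlapping partitions are made disjoint, so the rule used here (keep each document in the partition of its largest matching termset) is an assumption, and the function names, toy documents and termsets are hypothetical. The frequent termsets are taken as given rather than mined with MTHFT.

```python
from collections import defaultdict


def initial_partitions(docs, termsets):
    """Map each frequent termset to the ids of documents containing all its terms."""
    parts = defaultdict(set)
    for doc_id, terms in docs.items():
        for termset in termsets:
            if termset <= terms:  # the document covers the whole termset
                parts[termset].add(doc_id)
    return dict(parts)


def make_disjoint(parts):
    """Keep each document in one partition only.

    Assumed rule (not from the paper): prefer the largest termset,
    breaking ties alphabetically so the result is deterministic.
    """
    chosen = {}
    for termset in sorted(parts, key=lambda t: (-len(t), sorted(t))):
        for doc_id in parts[termset]:
            chosen.setdefault(doc_id, termset)
    disjoint = defaultdict(set)
    for doc_id, termset in chosen.items():
        disjoint[termset].add(doc_id)
    return dict(disjoint)


def precision_recall_f(cluster, reference_class):
    """Precision, recall and F-measure of one cluster against one labelled class."""
    overlap = len(cluster & reference_class)
    precision = overlap / len(cluster) if cluster else 0.0
    recall = overlap / len(reference_class) if reference_class else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data: document id -> set of terms.
docs = {
    "d1": frozenset({"oil", "price", "opec"}),
    "d2": frozenset({"oil", "price"}),
    "d3": frozenset({"wheat", "grain"}),
}
termsets = [
    frozenset({"oil", "price"}),
    frozenset({"oil", "price", "opec"}),
    frozenset({"wheat", "grain"}),
]

overlapping = initial_partitions(docs, termsets)  # d1 falls into two partitions
disjoint = make_disjoint(overlapping)             # d1 is kept in one partition only
```

In this toy run, document d1 matches two termsets and so appears in two initial partitions, which is the overlap the abstract describes; after `make_disjoint` it belongs to exactly one. `precision_recall_f` scores a single cluster against a single labelled class, which is the basic quantity behind the Precision, Recall and F-measure comparison mentioned above.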

Keywords: Web Document Clustering; Knowledge Discovery; Association Rules Mining; Frequent termsets; Apriori algorithm; Text Documents; Text Mining; Data Mining

Noha Negm, Mohamed Amin, Passent Elkafrawy and Abdel Badeeh M. Salem, “Investigate the Performance of Document Clustering Approach Based on Association Rules Mining,” International Journal of Advanced Computer Science and Applications (IJACSA), 4(8), 2013. http://dx.doi.org/10.14569/IJACSA.2013.040820

@article{Negm2013,
title = {Investigate the Performance of Document Clustering Approach Based on Association Rules Mining},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2013.040820},
url = {http://dx.doi.org/10.14569/IJACSA.2013.040820},
year = {2013},
publisher = {The Science and Information Organization},
volume = {4},
number = {8},
author = {Noha Negm and Mohamed Amin and Passent Elkafrawy and Abdel Badeeh M. Salem}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org