International Journal of Advanced Computer Science and Applications (IJACSA), Volume 11 Issue 1, 2020.
Abstract: Recent years have witnessed the development of numerous approaches to authorship attribution, including statistical and linguistic methods. Stylometric authorship attribution, however, remains among the most widely used owing to its accuracy and effectiveness. Nevertheless, many authorship problems in Arabic remain unresolved. This can be attributed to several factors, including linguistic peculiarities that standard authorship systems do not usually take into account. In Arabic, morphological features carry distinctive stylistic information that can be exploited in testing the authorship of disputed texts. The hypothesis is that many of these morphological features are lost when stemming is applied. Accordingly, this study investigates the effectiveness of stemming in stylometric authorship attribution in Arabic. Three Arabic stemmers are used: the GOLD stemmer, the Khoga stemmer, and the Light 10 stemmer. By way of illustration, a corpus of 2400 news articles written by 97 different authors is compiled. To evaluate the effectiveness of stemming, the selected articles (both stemmed and unstemmed) are clustered using cluster analysis methods, and the clustering structures based on the stemmed and unstemmed datasets are compared. The results indicate that stemming has a negative impact on clustering accuracy and thus on the reliability of stylometric authorship testing in Arabic. The distinctive stylistic features of Arabic affixation processes can therefore be used to improve the performance of authorship attribution applications in Arabic. It is concluded that stemming is not effective in stylometric authorship applications in Arabic.
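The hypothesis in the abstract, that stemming discards affix-level stylistic signal, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the affix lists and the `toy_light_stem` function are hypothetical and far simpler than the GOLD, Khoga, or Light 10 stemmers; the two sample "author" strings are invented; and a pairwise cosine distance stands in for the cluster analysis used in the study, since such distances are what a clustering method would typically consume.

```python
import math
from collections import Counter

# Hypothetical, heavily simplified affix lists for illustration only.
# These are NOT the rules of the GOLD, Khoga, or Light 10 stemmers.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ة", "ه", "ي")


def toy_light_stem(token: str) -> str:
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    return token


def profile(text: str, stem: bool) -> Counter:
    """Token-frequency profile of a text, optionally after toy stemming."""
    tokens = text.split()
    return Counter(toy_light_stem(t) if stem else t for t in tokens)


def cosine_distance(p: Counter, q: Counter) -> float:
    """1 minus the cosine similarity of two frequency profiles."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)


# Invented samples: same base words, different affixation habits.
author_a = "الكتاب الكتاب الكاتب"
author_b = "كتابها والكتاب وكاتب"

unstemmed = cosine_distance(profile(author_a, False), profile(author_b, False))
stemmed = cosine_distance(profile(author_a, True), profile(author_b, True))

print("distance on unstemmed text:", unstemmed)
print("distance on stemmed text:  ", stemmed)
```

In this toy case the two samples share no surface forms before stemming, so they are maximally distant, while after stemming their profiles coincide and the distance collapses to (numerically) zero. That is the kind of loss of discriminating information the study attributes to stemming, though the sketch says nothing about the magnitude of the effect on the actual corpus.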
Abdulfattah Omar and Wafya Ibrahim Hamouda, “The Effectiveness of Stemming in the Stylometric Authorship Attribution in Arabic,” International Journal of Advanced Computer Science and Applications (IJACSA), 11(1), 2020. http://dx.doi.org/10.14569/IJACSA.2020.0110114
@article{Omar2020,
title = {The Effectiveness of Stemming in the Stylometric Authorship Attribution in Arabic},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2020.0110114},
url = {http://dx.doi.org/10.14569/IJACSA.2020.0110114},
year = {2020},
publisher = {The Science and Information Organization},
volume = {11},
number = {1},
author = {Abdulfattah Omar and Wafya Ibrahim Hamouda}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.