An Evaluation of Automatic Text Summarization of News Articles: The Case of Three Online Arabic Text Summary Generators

Fahad M. Alliheibi; Abdulfattah Omar; Nasser Al-Horais

doi:10.14569/IJACSA.2021.0120513

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 12, Issue 5, 2021.

Abstract: Digital news platforms and online newspapers have multiplied at an unprecedented speed, making it difficult for users to read and follow all news articles on important, relevant topics. Numerous automatic text summarization systems have thus been developed to address the increasing needs of users around the world for summaries that reduce reading and processing time. Various automatic summarization systems have been developed and/or adapted for Arabic. The evaluation of automatic summarization performance is as important as the summarization process itself. Despite the importance of assessing summarization systems to identify potential limitations and improve their performance, very little has been done in this respect on systems for Arabic. Therefore, this study evaluated three text summarizers (AlSummarizer, LAKHASLY, and RESOOMER) using a corpus of 40 news articles. Only articles written in Modern Standard Arabic (MSA) were selected, as this is the formal and working language of Arab newspapers and news networks. Three expert examiners generated manual summaries and examined the linguistic consistency and relevance of the automatic summaries to the original news articles by comparing the automatic summaries to the manual (human) summaries. The scores for the three automatic summarizers were very similar and indicated that their performance was not satisfactory. In particular, the automatic summaries had serious problems with sentence relevance, which has negative implications for the reliability of such systems. The poor performance of Arabic summarizers can mainly be attributed to the unique morphological and syntactic characteristics of Arabic, which differ in many ways from those of English and other Western languages (the original languages of automatic summarizers) and are critical in building sentence relevance and coherence in Arabic. Thus, summarization systems should be trained to identify discourse markers within the texts and use these in the generation of automatic summaries. This will have a positive impact on the quality and reliability of text summarization systems. Arabic summarization systems need to incorporate semantic approaches to improve performance and construct more coherent and meaningful summaries. This study was limited to news articles in MSA. However, the findings of the study and their implications can be extended to other genres, including academic articles.

Keywords: AlSummarizer; Arabic; automatic summarization; discourse markers; extraction; LAKHASLY; news articles; RESOOMER; sentence relevance

Fahad M. Alliheibi, Abdulfattah Omar and Nasser Al-Horais, “An Evaluation of Automatic Text Summarization of News Articles: The Case of Three Online Arabic Text Summary Generators,” International Journal of Advanced Computer Science and Applications (IJACSA), 12(5), 2021. http://dx.doi.org/10.14569/IJACSA.2021.0120513

@article{Alliheibi2021,
title = {An Evaluation of Automatic Text Summarization of News Articles: The Case of Three Online Arabic Text Summary Generators},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2021.0120513},
url = {http://dx.doi.org/10.14569/IJACSA.2021.0120513},
year = {2021},
publisher = {The Science and Information Organization},
volume = {12},
number = {5},
author = {Fahad M. Alliheibi and Abdulfattah Omar and Nasser Al-Horais}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
