International Journal of Advanced Computer Science and Applications(IJACSA), Volume 16 Issue 8, 2025.
Abstract: Accurate sentiment analysis in Arabic natural language processing (NLP) remains a complex task due to the language’s rich morphology, syntactic variability, and diverse dialects. Traditional annotation approaches require human experts and face significant challenges related to inter-annotator agreement and dialectal understanding. Recent advances in transformer-based models and large language models (LLMs) offer new techniques to generate annotations. This paper presents a comparative evaluation of three sentiment annotation strategies applied to Saudi dialect tweets: human expert labeling, fine-tuned transformer models (specifically CAMeLBERT-DA), and zero-shot inference using GPT-4o. CAMeLBERT-DA, which is already trained specifically for Arabic sentiment tasks and dialects, demonstrates robust performance with fast, scalable predictions. GPT-4o, on the other hand, shows competitive zero-shot accuracy without fine-tuning, making it a practical solution for real-time applications. We investigate how each approach performs on two datasets, each with more than 4,000 Saudi tweets, covering a wide spectrum of dialects and sentiment expressions. Our methodology involves analyzing consistency across annotations using inter-rater agreement metrics such as Cohen’s Kappa, Pearson correlation, and class-specific agreement rates. The results reveal that while human annotations capture cultural and contextual subtleties, they suffer from inconsistency, particularly in ambiguous or dialect-specific cases. This study contributes to the growing body of work on annotation methodologies by highlighting the strengths and limitations of both human and AI-based annotators in Arabic NLP. Our findings suggest that annotations from the zero-shot use of domain-specific transformers such as CAMeLBERT-DA and general-purpose LLMs such as GPT-4o correlate moderately with those of human annotators.
The paper concludes with recommendations for building reliable ground truth datasets and integrating AI-assisted labeling into Arabic NLP tasks.
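The agreement metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the three-class label set, the toy label lists, the ordinal mapping used for Pearson correlation, and the function names `cohen_kappa`, `pearson`, and `class_agreement` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each annotator's marginal class rates.
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

def class_agreement(reference, other):
    """Per class in `reference`, the share of items where `other` agrees."""
    totals = Counter(reference)
    hits = Counter(r for r, o in zip(reference, other) if r == o)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical toy labels, NOT data from the paper.
human = ["pos", "neg", "neu", "pos", "neg", "neu", "pos", "neg"]
model = ["pos", "neg", "pos", "pos", "neu", "neu", "pos", "neg"]

# Assumed ordinal encoding so that Pearson correlation is defined.
ORDINAL = {"neg": -1, "neu": 0, "pos": 1}

kappa = cohen_kappa(human, model)
r = pearson([ORDINAL[l] for l in human], [ORDINAL[l] for l in model])
per_class = class_agreement(human, model)

print(f"kappa={kappa:.3f} pearson={r:.3f} per_class={per_class}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it is usually lower than the plain match rate. The per-class rates show which sentiment classes account for the disagreement.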
Dimah Alahmadi. “Human Versus AI: A Comparative Study of Zero-Shot LLMs and Transformer Models Against Human Annotations for Arabic Sentiment Analysis”. International Journal of Advanced Computer Science and Applications (IJACSA) 16.8 (2025). http://dx.doi.org/10.14569/IJACSA.2025.0160882
@article{Alahmadi2025,
title = {Human Versus AI: A Comparative Study of Zero-Shot LLMs and Transformer Models Against Human Annotations for Arabic Sentiment Analysis},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2025.0160882},
url = {http://dx.doi.org/10.14569/IJACSA.2025.0160882},
year = {2025},
publisher = {The Science and Information Organization},
volume = {16},
number = {8},
author = {Dimah Alahmadi}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.