International Journal of Advanced Computer Science and Applications (IJACSA), Volume 11, Issue 10, 2020.
Abstract: With the proliferation of social media and Internet accessibility, a massive amount of data has been produced. In most cases, the textual data available on the web comes mainly from people expressing their views in informal language. Arabic is one of the hardest Semitic languages to deal with because of its complex morphology. This paper presents a new contribution to Arabic resources: a large Moroccan dataset retrieved from Twitter and carefully annotated by native speakers. To the best of our knowledge, this dataset is the largest Moroccan dataset for sentiment analysis. It is distinguished by its size, its quality, ensured by the commitment of the annotators, and its accessibility to the research community. Furthermore, the MSTD (Moroccan Sentiment Twitter Dataset) is benchmarked through experiments on 4-way classification as well as polarity classification (positive, negative). Various machine-learning algorithms are combined with feature extraction techniques to reach optimal settings. This work also presents the effect of stemming and lemmatization on the obtained accuracies.
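To make the idea of pairing a feature extraction technique with a machine-learning algorithm for polarity classification concrete, here is a minimal sketch. It is not the authors' pipeline: the abstract does not name the algorithms or features used, so bag-of-words counts and multinomial Naive Bayes are assumptions chosen for illustration. The `tokenize` function, the `NaiveBayes` class, and the toy English sentences are all hypothetical and are not taken from MSTD.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real Arabic preprocessing)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per class, used for the class prior.
        self.class_counts = Counter(labels)
        # Per-class word frequencies: the bag-of-words features.
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # ignore out-of-vocabulary words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up English examples (NOT taken from MSTD).
train_texts = ["great match today", "love this song",
               "terrible service", "hate the traffic"]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict(...)` on a new sentence returns the label with the highest smoothed log probability. A stemming or lemmatization step, whose effect the abstract says was studied, would slot into `tokenize` so that inflected forms of a word share one count.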
Soukaina MIHI, Brahim AIT BEN ALI, Ismail EL BAZI, Sara AREZKI and Nabil LAACHFOUBI, “MSTD: Moroccan Sentiment Twitter Dataset,” International Journal of Advanced Computer Science and Applications (IJACSA), 11(10), 2020. http://dx.doi.org/10.14569/IJACSA.2020.0111045
@article{MIHI2020,
title = {MSTD: Moroccan Sentiment Twitter Dataset},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2020.0111045},
url = {http://dx.doi.org/10.14569/IJACSA.2020.0111045},
year = {2020},
publisher = {The Science and Information Organization},
volume = {11},
number = {10},
author = {Soukaina MIHI and Brahim AIT BEN ALI and Ismail EL BAZI and Sara AREZKI and Nabil LAACHFOUBI}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.