Future of Information and Communication Conference (FICC) 2025
28-29 April 2025
Publication Links
IJACSA
Special Issues
Future of Information and Communication Conference (FICC)
Computing Conference
Intelligent Systems Conference (IntelliSys)
Future Technologies Conference (FTC)
International Journal of Advanced Computer Science and Applications(IJACSA), Volume 10 Issue 6, 2019.
Abstract: Missing text prediction is one of the major concerns of Natural Language Processing deep learning community’s at-tention. However, the majority of text prediction related research is performed in other languages but not Arabic. In this paper, we take a first step in training a deep learning language model on Arabic language. Our contribution is the prediction of missing text from text documents while applying Convolutional Neural Networks (CNN) on Arabic Language Models. We have built CNN-based Language Models responding to specific settings in relation with Arabic language. We have prepared our dataset of a large quantity of text documents freely downloaded from Arab World Books, Hindawi foundation, and Shamela datasets. To calculate the accuracy of prediction, we have compared documents with complete text and same documents with missing text. We realized training, validation and test steps at three different stages aiming to increase the performance of prediction. The model had been trained at first stage on documents of the same author, then at the second stage, it had been trained on documents of the same dataset, and finally, at the third stage, the model had been trained on all document confused. Steps of training, validation and test have been repeated many times by changing each time the author, dataset, and the combination author-dataset, respectively. Also we have used the technique of enlarging training data by feeding the CNN-model each time by a larger quantity of text. The model gave a high performance of Arabic text prediction using Convolutional Neural Networks with an accuracy that have reached 97.8% in best case.
Adnan Souri, Mohamed Alachhab, Badr Eddine Elmohajir and Abdelali Zbakh, “Convolutional Neural Networks in Predicting Missing Text in Arabic” International Journal of Advanced Computer Science and Applications(IJACSA), 10(6), 2019. http://dx.doi.org/10.14569/IJACSA.2019.0100668
@article{Souri2019,
title = {Convolutional Neural Networks in Predicting Missing Text in Arabic},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2019.0100668},
url = {http://dx.doi.org/10.14569/IJACSA.2019.0100668},
year = {2019},
publisher = {The Science and Information Organization},
volume = {10},
number = {6},
author = {Adnan Souri and Mohamed Alachhab and Badr Eddine Elmohajir and Abdelali Zbakh}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.