Future of Information and Communication Conference (FICC) 2024
4-5 April 2024
Publication Links
IJACSA
Special Issues
Future of Information and Communication Conference (FICC)
Computing Conference
Intelligent Systems Conference (IntelliSys)
Future Technologies Conference (FTC)
International Journal of Advanced Computer Science and Applications(IJACSA), Volume 14 Issue 11, 2023.
Abstract: In recent times, Retrieval Augmented Generation (RAG) models have garnered considerable attention, primarily due to the impressive capabilities exhibited by Large Language Models (LLMs). Nevertheless, the Arabic language, despite its significance and widespread use, has received relatively less research emphasis in this field. A critical element within RAG systems is the Information Retrieval component, and at its core lies the vector embedding process commonly referred to as “semantic embedding”. This study encompasses an array of multilingual semantic embedding models, intending to enhance the model’s ability to comprehend and generate Arabic text effec-tively. We conducted an extensive evaluation of the performance of ten cutting-edge Multilingual Semantic embedding models, employing a publicly available ARCD dataset as a benchmark and assessing their performance using the average Recall@k metric. The results showed that the Microsoft E5 sentence embedding model outperformed all other models on the ARCD dataset, with Recall@10 exceeding 90%.
Hazem Abdelazim, Mohamed Tharwat and Ammar Mohamed, “Semantic Embeddings for Arabic Retrieval Augmented Generation (ARAG)” International Journal of Advanced Computer Science and Applications(IJACSA), 14(11), 2023. http://dx.doi.org/10.14569/IJACSA.2023.01411135
@article{Abdelazim2023,
title = {Semantic Embeddings for Arabic Retrieval Augmented Generation (ARAG)},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.01411135},
url = {http://dx.doi.org/10.14569/IJACSA.2023.01411135},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {11},
author = {Hazem Abdelazim and Mohamed Tharwat and Ammar Mohamed}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.