International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 2, 2023.
Abstract: Image captioning is widely used in real-world applications. The task first requires understanding the image using computer vision methods; natural language processing methods are then used to produce a description of the image. Different approaches have been proposed to solve this task, and deep learning attention-based models have proven to be the state of the art. This paper presents a survey of attention-based models for image captioning, including new categories not covered in other survey papers. The attention-based approaches are classified into four main categories, each further divided into subcategories, and all categories and subcategories are discussed in detail. Furthermore, the state-of-the-art approaches are compared and their accuracy improvements are reported, especially for the transformer-based models, and a summary of the benchmark datasets and the main performance metrics is presented.
Asmaa A. E. Osman, Mohamed A. Wahby Shalaby, Mona M. Soliman and Khaled M. Elsayed, “A Survey on Attention-Based Models for Image Captioning,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(2), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0140249
@article{Osman2023,
title = {A Survey on Attention-Based Models for Image Captioning},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0140249},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0140249},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {2},
author = {Asmaa A. E. Osman and Mohamed A. Wahby Shalaby and Mona M. Soliman and Khaled M. Elsayed}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.