International Journal of Advanced Computer Science and Applications (IJACSA), Volume 13 Issue 12, 2022.
Abstract: Cross-lingual summarization (CLS) is the task of generating a summary in a target language from a source document written in another language. CLS is challenging because it involves two different languages. Traditionally, CLS is carried out as a two-step pipeline of summarization and translation, in which errors made in the first step propagate to the second. To address this problem, we present a novel end-to-end abstractive CLS model that does not explicitly use machine translation. The architecture is based on the Transformer, which has been shown to perform well on text generation. The model is trained jointly on the CLS task and the monolingual summarization (MS) task: a second decoder handles the MS task, while the first decoder handles the CLS task. We also incorporate multilingual word embeddings (MWE) into the architecture to further improve the performance of the CLS models. Both English and Bahasa Indonesia are represented by MWE whose embeddings have already been mapped into the same vector space, which helps the model relate input and output that are in different languages. Experiments show that the proposed model improves by up to +0.2981 ROUGE-1, +0.2084 ROUGE-2, and +0.2771 ROUGE-L over the pipeline baselines and by up to +0.1288 ROUGE-1, +0.1185 ROUGE-2, and +0.1413 ROUGE-L over the end-to-end baselines.
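To make the reported ROUGE gains easier to read, the following is a minimal sketch of how a ROUGE-N score can be computed from n-gram overlap. It is a simplified illustration, not the evaluation code used in the paper: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is plain whitespace splitting, no stemming is applied, and the two Bahasa Indonesia sentences are invented for the example. Scores lie between 0 and 1, so a gain such as +0.2981 is presumably an absolute difference on that scale.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap as precision, recall, F1."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical summaries, made up purely for illustration.
reference = "pemerintah menaikkan harga bahan bakar minyak"
candidate = "pemerintah menaikkan harga minyak"

scores = rouge_n(candidate, reference, n=1)
print(scores)
```

Here all four candidate unigrams appear in the six-word reference, so unigram precision is 1.0 and recall is 4/6. ROUGE-L, also reported in the abstract, is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.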
Achmad F. Abka, Kurniawati Azizah and Wisnu Jatmiko, “Transformer-based Cross-Lingual Summarization using Multilingual Word Embeddings for English - Bahasa Indonesia,” International Journal of Advanced Computer Science and Applications (IJACSA), 13(12), 2022. http://dx.doi.org/10.14569/IJACSA.2022.0131276
@article{Abka2022,
title = {Transformer-based Cross-Lingual Summarization using Multilingual Word Embeddings for English - Bahasa Indonesia},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2022.0131276},
url = {http://dx.doi.org/10.14569/IJACSA.2022.0131276},
year = {2022},
publisher = {The Science and Information Organization},
volume = {13},
number = {12},
author = {Achmad F. Abka and Kurniawati Azizah and Wisnu Jatmiko}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.