International Journal of Advanced Computer Science and Applications (IJACSA), Volume 12, Issue 11, 2021.
Abstract: Usage of code-mixed text has increased in recent years among Indonesian internet users, who often mix Indonesian-language with English-language text. Normalisation of this code-mixed text into Indonesian needs to be performed to capture the meaning of English parts of the text and process them effectively. We improve a state-of-the-art code-mixed Indonesian-English normalisation system by modifying its pipeline modules. We further analyse the effect of code-mixed normalisation on emotion classification tasks. Our approach significantly improved on a state-of-the-art Indonesian-English code-mixed text normalisation system in both the individual pipeline modules and the overall system. The new feature set in the language identification module showed an improvement of 4.26% in terms of F1 score. The combination of machine translation and ruleset in the lexical normalisation module improved BLEU score by 25.22% and lowered WER by 62.49%. The use of context in the translation module improved BLEU score by 2.5% and lowered WER by 8.84%. The effectiveness of the overall pipeline normalisation system increased by 32.11% and 33.82%, in terms of BLEU score and WER, respectively. Code-mixed normalisation also improved the accuracy of emotion classification by up to 37.74% in terms of F1 score.
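The abstract reports normalisation quality partly as word error rate (WER). As a reading aid, the following is a minimal sketch of the conventional WER definition: the word-level edit distance between a reference and a system output, divided by the number of reference words. The abstract does not say how the authors computed WER, so the whitespace tokenisation, the function name `word_error_rate`, and the example sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are obtained by splitting on whitespace; the
    paper may tokenise differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences, not taken from the paper's data.
reference = "aku suka makan nasi"
system_output = "aku suka nasi"
print(word_error_rate(reference, system_output))
```

Lower values are better, which is why the abstract describes WER as being "lowered" while BLEU and F1 are "improved".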
Evi Yulianti, Ajmal Kurnia, Mirna Adriani and Yoppy Setyo Duto, “Normalisation of Indonesian-English Code-Mixed Text and its Effect on Emotion Classification,” International Journal of Advanced Computer Science and Applications (IJACSA), 12(11), 2021. http://dx.doi.org/10.14569/IJACSA.2021.0121177
@article{Yulianti2021,
title = {Normalisation of Indonesian-English Code-Mixed Text and its Effect on Emotion Classification},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2021.0121177},
url = {http://dx.doi.org/10.14569/IJACSA.2021.0121177},
year = {2021},
publisher = {The Science and Information Organization},
volume = {12},
number = {11},
author = {Evi Yulianti and Ajmal Kurnia and Mirna Adriani and Yoppy Setyo Duto}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.