International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14 Issue 10, 2023.
Abstract: Text Simplification (TS) is an emerging field in Natural Language Processing (NLP) that aims to make complex text more accessible. However, research on TS for the Malay language (Bahasa Malaysia), which is widely spoken in Southeast Asia, remains limited. The main challenges in this domain are data availability, feature engineering, and the suitability of existing methods for text simplification. Previous studies predominantly employed a single method, such as semantic compression or machine learning, with the Support Vector Machine (SVM) classifier consistently achieving an accuracy of approximately 70% in identifying troll sentences, that is, statements containing threats from online trolls notorious for their disruptive online behavior. This study combines semantic compression and machine learning methods across the lexical, syntactic, and semantic levels, using frequency dictionaries as semantic features. SVM and Decision Tree classifiers are applied and tested on 6,836 datasets, divided into training and testing sets. When SVM and Decision Tree are compared with and without semantic features, SVM with semantic features achieves an average accuracy of 92.37%, while Decision Tree with semantic features reaches 91.21%. The proposed TS method is evaluated on troll sentences, which are often associated with cyberbullying. Cyberbullying has been reported to be a significant issue, with Malaysia ranking as the second worst of the 28 countries surveyed in Asia. The outcomes of the study could therefore potentially offer means, such as machine translation and relation extraction, to help prevent cyberbullying in Malaysia.
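The comparison described in the abstract (a classifier trained with and without a frequency-dictionary semantic feature) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the English toy sentences are invented, the two lexical features and the form of the semantic feature are hypothetical, and a depth-1 decision tree (a decision stump) stands in for the SVM and Decision Tree classifiers used in the paper. It is not the authors' implementation and does not reproduce their results.

```python
from collections import Counter

# Invented toy data (label 1 = troll threat, 0 = ordinary sentence).
TRAIN = [
    ("i will hurt you", 1), ("you will regret this", 1), ("i will find you", 1),
    ("see you at lunch", 0), ("thanks for the help", 0), ("the weather is nice", 0),
]
TEST = [
    ("i will hurt them", 1), ("you will pay", 1),
    ("lunch is at noon", 0), ("thanks for the nice weather", 0),
]

def build_frequency_dictionary(sentences):
    """Count how often each lowercase token occurs in the given sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts

def extract_features(sentence, troll_freq, normal_freq, use_semantics):
    """Return two lexical features, plus one optional semantic feature."""
    tokens = sentence.lower().split()
    n = max(len(tokens), 1)
    # Lexical level: token count and mean token length.
    features = [float(len(tokens)), sum(len(t) for t in tokens) / n]
    if use_semantics:
        # Semantic level (assumed form): share of tokens that are more
        # frequent in the troll dictionary than in the ordinary one.
        hits = sum(1 for t in tokens if troll_freq[t] > normal_freq[t])
        features.append(hits / n)
    return features

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def predict_stump(model, X):
    """Predict `polarity` when the chosen feature exceeds the threshold."""
    j, threshold, polarity = model
    return [polarity if row[j] > threshold else 1 - polarity for row in X]

def fit_stump(X, y):
    """Fit a depth-1 decision tree by exhaustive search over midpoint splits."""
    best_score, best_model = -1.0, None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for polarity in (1, 0):
                model = (j, threshold, polarity)
                score = accuracy(y, predict_stump(model, X))
                if score > best_score:
                    best_score, best_model = score, model
    if best_model is None:
        raise ValueError("every feature is constant; nothing to split on")
    return best_model

def run_experiment(use_semantics):
    """Train on TRAIN and return test accuracy on TEST."""
    train_sentences, y_train = zip(*TRAIN)
    test_sentences, y_test = zip(*TEST)
    # Frequency dictionaries are built from the training split only.
    troll_freq = build_frequency_dictionary(s for s, label in TRAIN if label == 1)
    normal_freq = build_frequency_dictionary(s for s, label in TRAIN if label == 0)
    X_train = [extract_features(s, troll_freq, normal_freq, use_semantics)
               for s in train_sentences]
    X_test = [extract_features(s, troll_freq, normal_freq, use_semantics)
              for s in test_sentences]
    model = fit_stump(X_train, y_train)
    return accuracy(y_test, predict_stump(model, X_test))

print("accuracy without semantic feature:", run_experiment(False))
print("accuracy with semantic feature:", run_experiment(True))
```

The toy data is deliberately small and favors the semantic feature, so the printed accuracies only show the shape of the comparison and say nothing about the 92.37% and 91.21% figures reported above.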
Juhaida Abu Bakar, Nooraini Yusoff, Nor Hazlyna Harun, Maslinda Mohd Nadzir and Salehah Omar, “Text Simplification using Hybrid Semantic Compression and Support Vector Machine for Troll Threat Sentences,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(10), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0141035
@article{Bakar2023,
title = {Text Simplification using Hybrid Semantic Compression and Support Vector Machine for Troll Threat Sentences},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0141035},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0141035},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {10},
author = {Juhaida Abu Bakar and Nooraini Yusoff and Nor Hazlyna Harun and Maslinda Mohd Nadzir and Salehah Omar}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.