The Science and Information (SAI) Organization

DOI: 10.14569/IJACSA.2023.0140687

Offensive Language Identification in Low Resource Languages using Bidirectional Long-Short-Term Memory Network

Author 1: Aigerim Toktarova
Author 2: Aktore Abushakhma
Author 3: Elvira Adylbekova
Author 4: Ainur Manapova
Author 5: Bolganay Kaldarova
Author 6: Yerzhan Atayev
Author 7: Bakhyt Kassenova
Author 8: Ainash Aidarkhanova

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 6, 2023.

Abstract: Offensive language identification is a critical task in today's digital era, enabling the development of effective content moderation systems. However, it poses unique challenges in low-resource languages, where little annotated data is available. This paper addresses offensive language identification in one such low-resource language, Kazakh. We propose a novel approach based on Bidirectional Long-Short-Term Memory (BiLSTM) networks, which have demonstrated strong performance in natural language processing tasks. By leveraging the bidirectional nature of the BiLSTM architecture, we capture both contextual and long-term dependencies in the input text, enabling more accurate offensive language identification. Our approach further uses transfer learning to mitigate the scarcity of annotated data in the low-resource setting. Through extensive experiments on a Kazakh offensive language dataset, we demonstrate the effectiveness of the proposed approach, achieving state-of-the-art results in offensive language identification for Kazakh. Moreover, we analyze the impact of different model configurations and training strategies on performance. The findings provide insights into offensive language identification techniques in low-resource languages and pave the way for more robust content moderation systems tailored to specific linguistic contexts.

Keywords: Offensive language; natural language processing; low resource language; machine learning; deep learning; classification
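
The bidirectional reading described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weights are random and untrained, and every name here (`make_model`, `bilstm_offensive_probability`, the vocabulary, embedding, and hidden sizes) is a hypothetical choice made for illustration. It only shows the general BiLSTM idea: one LSTM reads the token sequence left to right, a second reads it right to left, and their final hidden states are concatenated before a sigmoid output scores the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_size, hidden_size, rng):
    """Random weights and zero biases for the four LSTM gates (i, f, g, o)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {name: (mat(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for name in ("i", "f", "g", "o")}


def lstm_step(params, x, h, c):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = x + h  # list concatenation: current input followed by previous hidden state

    def gate(name, activation):
        weights, bias = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_size):
    """Run an LSTM over a sequence of vectors and return the final hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
    return h


def make_model(vocab_size, embed_size, hidden_size, seed=0):
    """Build an untrained toy model (all sizes are illustrative assumptions)."""
    rng = random.Random(seed)
    return {
        "hidden_size": hidden_size,
        "embedding": [[rng.uniform(-0.1, 0.1) for _ in range(embed_size)]
                      for _ in range(vocab_size)],
        "fwd": make_lstm_params(embed_size, hidden_size, rng),
        "bwd": make_lstm_params(embed_size, hidden_size, rng),
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_size)],
        "out_b": 0.0,
    }


def bilstm_offensive_probability(model, token_ids):
    """Score a token-id sequence with a forward and a backward LSTM pass."""
    embedded = [model["embedding"][t] for t in token_ids]
    h_fwd = run_lstm(model["fwd"], embedded, model["hidden_size"])
    h_bwd = run_lstm(model["bwd"], embedded[::-1], model["hidden_size"])
    features = h_fwd + h_bwd  # concatenate both directions
    logit = sum(w * v for w, v in zip(model["out_w"], features)) + model["out_b"]
    return sigmoid(logit)


model = make_model(vocab_size=50, embed_size=8, hidden_size=4)
probability = bilstm_offensive_probability(model, [3, 17, 42, 5])
print(probability)
```

Because the weights are untrained, the printed score carries no meaning. In practice the parameters would be learned from labeled data, and a deep learning framework would normally replace this hand-written loop.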

Aigerim Toktarova, Aktore Abushakhma, Elvira Adylbekova, Ainur Manapova, Bolganay Kaldarova, Yerzhan Atayev, Bakhyt Kassenova and Ainash Aidarkhanova, “Offensive Language Identification in Low Resource Languages using Bidirectional Long-Short-Term Memory Network,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(6), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0140687

@article{Toktarova2023,
title = {Offensive Language Identification in Low Resource Languages using Bidirectional Long-Short-Term Memory Network},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0140687},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0140687},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {6},
author = {Aigerim Toktarova and Aktore Abushakhma and Elvira Adylbekova and Ainur Manapova and Bolganay Kaldarova and Yerzhan Atayev and Bakhyt Kassenova and Ainash Aidarkhanova}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org