DOI: 10.14569/IJACSA.2023.0140332

An Automated Text Document Classification Framework using BERT

Author 1: Momna Ali Shah
Author 2: Muhammad Javed Iqbal
Author 3: Neelum Noreen
Author 4: Iftikhar Ahmed

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 3, 2023.

Abstract: Due to the rapid advancement of technology, the volume of online text data from many different disciplines is increasing significantly over time. More work is therefore needed to create systems that can effectively classify text data according to its content, facilitating processing and the extraction of crucial information. Traditional, non-automated systems rely on manual feature extraction and classification, which requires choosing the most appropriate algorithms for each step and is error-prone and time-consuming. As a result, traditional procedures are typically resource-intensive (computational, human, etc.) and are not a viable solution. To address the shortcomings of traditional approaches, we propose a novel text categorization strategy based on BERT, a well-known deep learning (DL) model. The proposed framework is trained and tested on text datasets such as the UCI email dataset, which includes spam and non-spam emails, and the BBC News dataset, which includes multiple categories such as tech, sports, politics, business, and entertainment. The system achieved a highest accuracy of 91.4% and can be used by different organizations to classify text-based data with high performance. The effectiveness of the proposed framework is evaluated using multiple evaluation metrics such as Accuracy, Precision, and Recall.

Keywords: Deep learning; text classification; BERT
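
The abstract names Accuracy, Precision, and Recall as the evaluation metrics. As a minimal sketch of how such metrics can be computed for a multi-class task like the BBC News categories, the following uses plain Python. The function name `classification_metrics`, the macro-averaging choice, and the toy labels are assumptions made for illustration; the abstract does not state how the authors averaged per-class scores, and none of the values below come from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall.

    Hypothetical helper for illustration; macro-averaging is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # how often each class actually occurs
    pred_count = Counter(y_pred)   # how often each class was predicted
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / len(y_true)
    # Per-class precision = TP / predicted; recall = TP / actual.
    # A class with no predictions (or no true samples) scores 0.0.
    precisions = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ]
    recalls = [
        true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ]
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


# Toy labels only, using the BBC News category names from the abstract.
y_true = ["tech", "sports", "politics", "business", "entertainment", "tech"]
y_pred = ["tech", "sports", "business", "business", "entertainment", "sports"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For a binary task such as spam versus non-spam email, the same function applies with two labels, although precision and recall are then often reported for the positive (spam) class alone rather than macro-averaged.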

Momna Ali Shah, Muhammad Javed Iqbal, Neelum Noreen and Iftikhar Ahmed, “An Automated Text Document Classification Framework using BERT,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(3), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0140332

@article{Shah2023,
title = {An Automated Text Document Classification Framework using BERT},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0140332},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0140332},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {3},
author = {Momna Ali Shah and Muhammad Javed Iqbal and Neelum Noreen and Iftikhar Ahmed}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.
