The Science and Information (SAI) Organization

Article Details

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.

Improving the Diabetes Diagnosis Prediction Rate Using Data Preprocessing, Data Augmentation and Recursive Feature Elimination Method

Author 1: E. Sabitha
Author 2: M. Durgadevi

Digital Object Identifier (DOI): 10.14569/IJACSA.2022.01309107

Article published in International Journal of Advanced Computer Science and Applications (IJACSA), Volume 13, Issue 9, 2022.

Abstract: Hyperglycemia is a symptom of diabetes mellitus, a metabolic condition brought on by the body's inability to produce enough insulin or to respond to it. Diabetes can damage body organs if it is not detected in time or managed adequately. Many years of research into diabetes diagnosis have led to suitable methods for diabetes prediction, but there is still scope for improving precision. The primary objective of this paper is to emphasize the value of data preprocessing, feature selection, and data augmentation in disease prediction, since these techniques can help classification algorithms perform more effectively in the diagnosis and prediction of diabetes. The proposed method is applied to diabetes diagnosis and prediction on the PIMA Indian Diabetes dataset. The study provides a systematic framework for a comparative analysis of model effectiveness in three categories. The first category compares the model's performance with and without data preprocessing. The second category compares the performance of five alternative algorithms employing the Recursive Feature Elimination (RFE) feature selection method. The third category is data augmentation: augmentation is performed with SMOTE oversampling, and performance is compared with and without it. On the PIMA Indian Diabetes dataset, the experiments showed that data preprocessing, RFE with Random Forest Regression feature selection, and SMOTE oversampling can produce accuracy scores of 81.25% with RF, 81.16% with DT, and 82.5% with SVC. Of the six classifiers (LR, RF, DT, SVC, GNB, and KNN), RF, DT, and SVC achieved the highest accuracy. The comparative study clarifies the value of data preprocessing, feature selection, and data augmentation in the disease prediction process and how they affect performance.
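
The SMOTE oversampling step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which implementation the authors used, so the code below is only an assumption-laden illustration of the general technique: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function names `smote` and `oversample`, the parameter defaults, and the toy data are all hypothetical and are not taken from the paper or from the PIMA dataset.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority-class vectors.

    Illustrative sketch only; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample(X, y, minority_label=1, k=5, seed=0):
    """Add synthetic minority samples until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max((len(y) - len(minority)) - len(minority), 0)
    new = smote(minority, n_new, k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Toy data (hypothetical): four majority samples and two minority samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [1.0, 1.0], [1.2, 0.9]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = oversample(X, y)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In a comparison like the one the abstract describes, oversampling would normally be applied to the training split only, so that the synthetic samples do not leak into the data used to measure accuracy.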

Keywords: Artificial Intelligence (AI); Machine Learning (ML); Deep Learning (DL); Neural Network; Diabetes Mellitus; Recursive Feature Elimination (RFE); Synthetic Minority Over-sampling Technique (SMOTE)

E. Sabitha and M. Durgadevi, “Improving the Diabetes Diagnosis Prediction Rate Using Data Preprocessing, Data Augmentation and Recursive Feature Elimination Method,” International Journal of Advanced Computer Science and Applications (IJACSA), 13(9), 2022. http://dx.doi.org/10.14569/IJACSA.2022.01309107

@article{Sabitha2022,
title = {Improving the Diabetes Diagnosis Prediction Rate Using Data Preprocessing, Data Augmentation and Recursive Feature Elimination Method},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2022.01309107},
url = {http://dx.doi.org/10.14569/IJACSA.2022.01309107},
year = {2022},
publisher = {The Science and Information Organization},
volume = {13},
number = {9},
author = {E. Sabitha and M. Durgadevi}
}


© The Science and Information (SAI) Organization Limited. Registered in England and Wales. Company Number 8933205. All rights reserved. thesai.org