The Science and Information (SAI) Organization

DOI: 10.14569/IJACSA.2024.0150818

Diabetes Prediction Using Machine Learning with Feature Engineering and Hyperparameter Tuning

Author 1: Hakim El Massari
Author 2: Noreddine Gherabi
Author 3: Fatima Qanouni
Author 4: Sajida Mhammedi

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 15, Issue 8, 2024.

Abstract: Diabetes, a chronic illness, has seen an increase in prevalence over the years, posing several health challenges. This study aims to predict diabetes onset using the Pima Indians Diabetes dataset. We implemented several machine learning algorithms, namely Random Forest, Gradient Boosting, XGBoost, LightGBM, and CatBoost. To enhance model performance, we applied a variety of feature engineering techniques, including SelectKBest, Recursive Feature Elimination (RFE), Recursive Feature Elimination with Cross-Validation (RFECV), Forward Feature Selection, and Backward Feature Elimination. RFECV proved to be the most effective method and was used to select the final feature set. In addition, hyperparameter tuning techniques were used to determine the optimal parameters for each model. When trained with the optimized parameters, XGBoost outperformed the others with an accuracy of 94%, while Random Forest and CatBoost both achieved 92.5%. These results highlight XGBoost's superior predictive power and the significance of thorough feature engineering and model tuning in diabetes prediction.
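
The two-stage workflow described in the abstract (RFECV feature selection followed by hyperparameter tuning) can be sketched roughly as below. This is an illustrative sketch using scikit-learn, not the authors' code. Several things here are assumptions: the data is a synthetic stand-in with the same shape as the Pima Indians Diabetes dataset (768 rows, 8 features), Random Forest is used in place of XGBoost to avoid an extra dependency, the function name `select_and_tune` is hypothetical, and the grid values are arbitrary. The accuracy it prints says nothing about the figures reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split


def select_and_tune(X, y, seed=0):
    """Run RFECV feature selection, then grid-search a Random Forest."""
    # Hold out a test split that neither selection nor tuning ever sees.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)

    # Step 1: RFECV removes the weakest feature each round and keeps the
    # feature count with the best cross-validated accuracy.
    selector = RFECV(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        step=1,
        cv=cv,
        scoring="accuracy",
        min_features_to_select=1,
    )
    selector.fit(X_train, y_train)

    # Step 2: tune hyperparameters using only the selected features.
    # The grid values are arbitrary placeholders, not the paper's settings.
    grid = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
        cv=cv,
        scoring="accuracy",
    )
    grid.fit(selector.transform(X_train), y_train)

    test_accuracy = grid.score(selector.transform(X_test), y_test)
    return selector.support_, grid.best_params_, test_accuracy


# Synthetic stand-in shaped like the Pima data (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=4, random_state=0
)
mask, best_params, test_accuracy = select_and_tune(X, y)
print(mask, best_params, round(test_accuracy, 3))
```

Fitting the selector on the training split only keeps the held-out test accuracy free of selection bias. To try a boosted model instead, the Random Forest estimator could be swapped for any classifier that exposes feature importances.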

Keywords: Machine learning; feature engineering; hyperparameter tuning; diabetes prediction; healthcare

Hakim El Massari, Noreddine Gherabi, Fatima Qanouni and Sajida Mhammedi, “Diabetes Prediction Using Machine Learning with Feature Engineering and Hyperparameter Tuning,” International Journal of Advanced Computer Science and Applications (IJACSA), 15(8), 2024. http://dx.doi.org/10.14569/IJACSA.2024.0150818

@article{Massari2024,
title = {Diabetes Prediction Using Machine Learning with Feature Engineering and Hyperparameter Tuning},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2024.0150818},
url = {http://dx.doi.org/10.14569/IJACSA.2024.0150818},
year = {2024},
publisher = {The Science and Information Organization},
volume = {15},
number = {8},
author = {Hakim El Massari and Noreddine Gherabi and Fatima Qanouni and Sajida Mhammedi}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org