The Science and Information (SAI) Organization
  • Home
  • About Us
  • Journals
  • Conferences
  • Contact Us

Publication Links

  • IJACSA
  • Author Guidelines
  • Publication Policies
  • Digital Archiving Policy
  • Promote your Publication
  • Metadata Harvesting (OAI2)

IJACSA

  • About the Journal
  • Call for Papers
  • Editorial Board
  • Author Guidelines
  • Submit your Paper
  • Current Issue
  • Archives
  • Indexing
  • Fees/ APC
  • Reviewers
  • Apply as a Reviewer

IJARAI

  • About the Journal
  • Archives
  • Indexing & Archiving

Special Issues

  • Home
  • Archives
  • Proposals
  • Guest Editors
  • SUSAI-EE 2025
  • ICONS-BA 2025
  • IoT-BLOCK 2025

Future of Information and Communication Conference (FICC)

  • Home
  • Call for Papers
  • Submit your Paper/Poster
  • Register
  • Venue
  • Contact

Computing Conference

  • Home
  • Call for Papers
  • Submit your Paper/Poster
  • Register
  • Venue
  • Contact

Intelligent Systems Conference (IntelliSys)

  • Home
  • Call for Papers
  • Submit your Paper/Poster
  • Register
  • Venue
  • Contact

Future Technologies Conference (FTC)

  • Home
  • Call for Papers
  • Submit your Paper/Poster
  • Register
  • Venue
  • Contact
  • Home
  • Call for Papers
  • Editorial Board
  • Guidelines
  • Submit
  • Current Issue
  • Archives
  • Indexing
  • Fees
  • Reviewers
  • Subscribe

DOI: 10.14569/IJACSA.2021.0120110
PDF

Predicting the Depression of the South Korean Elderly using SMOTE and an Imbalanced Binary Dataset

Author 1: Haewon Byeon

International Journal of Advanced Computer Science and Applications(IJACSA), Volume 12 Issue 1, 2021.

  • Abstract and Keywords
  • How to Cite this Article
  • {} BibTeX Source

Abstract: Since the number of healthy people is much more than that of ill people, it is highly likely that the problem of imbalanced data will occur when predicting the depression of the elderly living in the community using big data. When raw data are directly analyzed without using supplementary techniques such as a sample algorithm for datasets, which have imbalanced class ratios, it can decrease the performance of machine learning by causing prediction errors in the analysis process. Therefore, it is necessary to use a data sampling technique for overcoming this imbalanced data issue. As a result, this study tried to identify an effective way for processing imbalanced data to develop ensemble-based machine learning by comparing the performance of sampling methods using the depression data of the elderly living in South Korean communities, which had quite imbalanced class ratios. This study developed a model for predicting the depression of the elderly living in the community using a logistic regression model, gradient boosting machine (GBM), and random forest, and compared the accuracy, sensitivity, and specificity of them to evaluate the prediction performance of them. This study analyzed 4,085 elderly people (≥60 years old) living in the community. The depression data of the elderly in the community used in this study had an unbalance issue: the result of the depression screening test showed that 87.5% of subjects did not have depression, while 12.5% of them had depression. This study used oversampling, undersampling, and SMOTE methods to overcome the unbalance problem of the binary dataset, and the prediction performance (accuracy, sensitivity, and specificity) of each sampling method was compared. The results of this study confirmed that the SMOTE-based random forest algorithm showing the highest accuracy (a sensitivity ≥ 0.6 and a specificity ≥ 0.6) was best prediction performance among random forest, GBM, and logistic regression analysis. Further studies are needed to compare the accuracy of SMOTE, undersampling, and oversampling for imbalanced data with high dimensional y-variables.

Keywords: Random forests; gradient boosting machine; SMOTE; undersampling; imbalanced data; oversampling

Haewon Byeon, “Predicting the Depression of the South Korean Elderly using SMOTE and an Imbalanced Binary Dataset” International Journal of Advanced Computer Science and Applications(IJACSA), 12(1), 2021. http://dx.doi.org/10.14569/IJACSA.2021.0120110

@article{Byeon2021,
title = {Predicting the Depression of the South Korean Elderly using SMOTE and an Imbalanced Binary Dataset},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2021.0120110},
url = {http://dx.doi.org/10.14569/IJACSA.2021.0120110},
year = {2021},
publisher = {The Science and Information Organization},
volume = {12},
number = {1},
author = {Haewon Byeon}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.

IJACSA

Upcoming Conferences

IntelliSys 2025

28-29 August 2025

  • Amsterdam, The Netherlands

Future Technologies Conference 2025

6-7 November 2025

  • Munich, Germany

Healthcare Conference 2026

21-22 May 2026

  • Amsterdam, The Netherlands

Computing Conference 2026

9-10 July 2026

  • London, United Kingdom

IntelliSys 2026

3-4 September 2026

  • Amsterdam, The Netherlands

Computer Vision Conference 2026

15-16 October 2026

  • Berlin, Germany
The Science and Information (SAI) Organization
BACK TO TOP

Computer Science Journal

  • About the Journal
  • Call for Papers
  • Submit Paper
  • Indexing

Our Conferences

  • Computing Conference
  • Intelligent Systems Conference
  • Future Technologies Conference
  • Communication Conference

Help & Support

  • Contact Us
  • About Us
  • Terms and Conditions
  • Privacy Policy

© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org