The Science and Information (SAI) Organization

DOI: 10.14569/IJACSA.2025.0161296

Evaluating CTGAN-Generated Synthetic Data for Heart Disease Prediction: Fidelity, Predictive Utility, and Feature Preservation

Author 1: Wan Aezwani Wan Abu Bakar
Author 2: Nur Laila Najwa Josdi
Author 3: Mustafa Man
Author 4: Evizal Abdul Kadir

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 16 Issue 12, 2025.

Abstract: The increasing scarcity and sensitivity of clinical data necessitate the development of high-quality synthetic datasets. This study evaluated the ability of Conditional Tabular GAN (CTGAN) to generate synthetic heart disease data that preserves the statistical properties and predictive patterns of the Cleveland Heart Disease dataset. It assessed the fidelity of numerical and categorical features, the preservation of pairwise correlations, and predictive utility using Logistic Regression and Random Forest classifiers. Dimensionality reduction with PCA and t-SNE was further used to assess the global similarity between the real and synthetic datasets. The results show that CTGAN reproduces the overall feature distributions and correlations, especially for key features such as age, thalach, and oldpeak. However, some discrepancies remain in categorical attributes. Predictive modeling shows moderate transferability, indicating that the synthetic data captures important patterns without completely replicating the original labels. These findings highlight the potential of CTGAN-generated synthetic data as a privacy-preserving alternative for benchmarking and early algorithm development, while emphasizing the importance of feature-level and predictive validation in synthetic data research.

Keywords: Conditional Tabular GAN (CTGAN); correlation analysis; dimensionality reduction; feature importance; heart disease prediction; predictive utility; synthetic data; tabular data fidelity
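
The abstract mentions checking whether pairwise correlations are preserved, but this page does not state which metric the authors used. As a minimal sketch only, one common approach is to compare the Pearson correlation matrices of the real and synthetic tables and average the absolute gap over all feature pairs. The function names and the toy numbers below are hypothetical illustrations, not the paper's code or data; the column names follow the usual Cleveland dataset naming (age, thalach, oldpeak).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption for this sketch: a constant column counts as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_abs_correlation_gap(real, synthetic, columns):
    """Average |corr_real - corr_synthetic| over all unordered column pairs.

    `real` and `synthetic` map column name -> list of numeric values.
    0.0 means identical pairwise correlations; the maximum possible value is 2.0.
    """
    pairs = [(a, b) for i, a in enumerate(columns) for b in columns[i + 1:]]
    gaps = [
        abs(pearson(real[a], real[b]) - pearson(synthetic[a], synthetic[b]))
        for a, b in pairs
    ]
    return sum(gaps) / len(gaps)


# Made-up toy values purely for illustration (NOT taken from the paper).
real_table = {
    "age": [63, 37, 41, 56, 57, 44],
    "thalach": [150, 187, 172, 178, 163, 173],
    "oldpeak": [2.3, 3.5, 1.4, 0.8, 0.6, 0.0],
}
synthetic_table = {
    "age": [60, 39, 45, 52, 58, 47],
    "thalach": [148, 180, 170, 169, 160, 175],
    "oldpeak": [2.0, 2.9, 1.1, 1.0, 0.9, 0.2],
}

features = ["age", "thalach", "oldpeak"]
gap = mean_abs_correlation_gap(real_table, synthetic_table, features)
print(f"mean absolute correlation gap: {gap:.3f}")
```

A smaller gap suggests that the synthetic table keeps the linear relationships between features. It says nothing about marginal distributions or categorical attributes, which need separate checks.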

Wan Aezwani Wan Abu Bakar, Nur Laila Najwa Josdi, Mustafa Man and Evizal Abdul Kadir. “Evaluating CTGAN-Generated Synthetic Data for Heart Disease Prediction: Fidelity, Predictive Utility, and Feature Preservation”. International Journal of Advanced Computer Science and Applications (IJACSA) 16.12 (2025). http://dx.doi.org/10.14569/IJACSA.2025.0161296

@article{Bakar2025,
title = {Evaluating CTGAN-Generated Synthetic Data for Heart Disease Prediction: Fidelity, Predictive Utility, and Feature Preservation},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2025.0161296},
url = {http://dx.doi.org/10.14569/IJACSA.2025.0161296},
year = {2025},
publisher = {The Science and Information Organization},
volume = {16},
number = {12},
author = {Wan Aezwani Wan Abu Bakar and Nur Laila Najwa Josdi and Mustafa Man and Evizal Abdul Kadir}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

The Science and Information (SAI) Organization Limited is a company registered in England and Wales under Company Number 8933205.