International Journal of Advanced Computer Science and Applications (IJACSA), Volume 16 Issue 12, 2025.
Abstract: The increasing scarcity and sensitivity of clinical data necessitate the development of high-quality synthetic datasets. This study evaluates the ability of Conditional Tabular GAN (CTGAN) to generate synthetic heart disease data that preserves the statistical properties and predictive patterns of the Cleveland Heart Disease dataset. It assesses the fidelity of numerical and categorical features, the preservation of pairwise correlations, and predictive utility using Logistic Regression and Random Forest classifiers. Dimensionality reduction with PCA and t-SNE is further used to compare the global structure of the real and synthetic datasets. The results show that CTGAN reproduces the general distributions and correlations, especially for key features such as age, thalach, and oldpeak, although some discrepancies remain in categorical attributes. Predictive modeling shows moderate transferability, indicating that the synthetic data captures important patterns without completely replicating the original labels. These findings highlight the potential of CTGAN-generated synthetic data as a privacy-preserving alternative for benchmarking and early algorithm development, while emphasizing the importance of feature-level and prediction-level validation in synthetic data research.
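The two feature-level checks named in the abstract, preservation of pairwise correlations and fidelity of categorical attributes, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the abstract does not state which metrics the paper uses, so the mean absolute correlation gap and the total variation distance below are assumed stand-ins, the helper names (`pearson`, `correlation_gap`, `total_variation`) are hypothetical, and the toy values are invented for illustration rather than taken from the Cleveland dataset.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_gap(real, synthetic):
    """Mean absolute difference in pairwise correlations over numeric columns.

    `real` and `synthetic` map column name -> list of values and must share
    the same column names. 0.0 means every pairwise correlation is reproduced.
    """
    pairs = list(combinations(sorted(real), 2))
    diffs = [
        abs(pearson(real[a], real[b]) - pearson(synthetic[a], synthetic[b]))
        for a, b in pairs
    ]
    return sum(diffs) / len(diffs)


def total_variation(real_col, synth_col):
    """Total variation distance between two categorical columns.

    0.0 means identical category frequencies, 1.0 means disjoint categories.
    """
    p, q = Counter(real_col), Counter(synth_col)
    n_p, n_q = len(real_col), len(synth_col)
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in set(p) | set(q))


# Toy tables (assumption: invented values, NOT real patient records).
real = {
    "age": [63, 37, 41, 56, 57, 44],
    "thalach": [150, 187, 172, 178, 163, 173],
    "oldpeak": [2.3, 3.5, 1.4, 0.8, 0.6, 0.0],
}
synthetic = {
    "age": [60, 39, 45, 54, 58, 47],
    "thalach": [148, 180, 170, 171, 160, 175],
    "oldpeak": [2.0, 3.1, 1.6, 1.0, 0.4, 0.2],
}

print("correlation gap:", round(correlation_gap(real, synthetic), 3))
print("TV distance (toy categorical column):",
      total_variation([0, 1, 1, 2, 2, 2], [0, 0, 1, 2, 2, 2]))
```

A small correlation gap alongside a large total variation distance on some column would match the pattern the abstract reports: overall correlations reproduced well, with residual discrepancies in categorical attributes.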
Wan Aezwani Wan Abu Bakar, Nur Laila Najwa Josdi, Mustafa Man and Evizal Abdul Kadir. “Evaluating CTGAN-Generated Synthetic Data for Heart Disease Prediction: Fidelity, Predictive Utility, and Feature Preservation”. International Journal of Advanced Computer Science and Applications (IJACSA) 16.12 (2025). http://dx.doi.org/10.14569/IJACSA.2025.0161296
@article{Bakar2025,
title = {Evaluating CTGAN-Generated Synthetic Data for Heart Disease Prediction: Fidelity, Predictive Utility, and Feature Preservation},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2025.0161296},
url = {http://dx.doi.org/10.14569/IJACSA.2025.0161296},
year = {2025},
publisher = {The Science and Information Organization},
volume = {16},
number = {12},
author = {Wan Aezwani Wan Abu Bakar and Nur Laila Najwa Josdi and Mustafa Man and Evizal Abdul Kadir}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.