International Journal of Advanced Computer Science and Applications (IJACSA), Volume 17 Issue 5, 2026.
Abstract: Pancreatic cancer remains a leading cause of cancer-related mortality due to its asymptomatic progression and late-stage diagnosis. Early detection is critical for improving patient prognosis and clinical outcomes. Traditional diagnostic approaches and previous computational models often struggle with molecular heterogeneity and technical variations across different genomic platforms. These batch effects limit the reliability and generalizability of predictive biomarkers when applied to diverse clinical settings. This research proposes a robust machine learning framework designed for platform-invariant pancreatic cancer prediction. Large-scale transcriptomic datasets, including microarray data from the Gene Expression Omnibus (GEO) and RNA-seq data from The Cancer Genome Atlas (TCGA), were integrated, and the ComBat algorithm was then applied to correct batch effects. This yielded a discovery cohort of 441 samples and an external validation set of 409 samples. An optimized XGBoost classifier was developed and benchmarked against several alternative learners, including Random Forest, LightGBM, Support Vector Machines (SVM), and Logistic Regression. The model demonstrated high predictive performance, achieving an internal test AUC of 0.923. External validation was performed across six independent cohorts, yielding a mean AUC of 0.761 ± 0.090 (95% CI: 0.689–0.833). These findings support the robustness and cross-platform generalizability of the proposed framework. To enhance model interpretability, SHapley Additive exPlanations (SHAP) analysis was employed to identify key molecular drivers. These drivers were further validated using biological enrichment analysis through Over-Representation Analysis (ORA) and log2FC-weighted Gene Set Enrichment Analysis (GSEA). The proposed framework provides a reliable and scalable solution for multi-platform integration. This approach facilitates accurate risk stratification and precision oncology in clinical practice.
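The reported external-validation interval can be reproduced with simple arithmetic. The sketch below is a reader's check, not the authors' stated method. It assumes that ± 0.090 is the standard deviation of the AUC across the six cohorts and that the interval uses the normal approximation with z = 1.96; the abstract confirms neither. The function name `mean_ci` is hypothetical.

```python
import math


def mean_ci(mean, sd, n, z=1.96):
    """Normal-approximation confidence interval for a mean.

    Assumption (not stated in the abstract): sd is the standard
    deviation across n cohorts and the interval is mean +/- z * sd / sqrt(n).
    """
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Values taken from the abstract: mean AUC 0.761, SD 0.090, six cohorts.
lower, upper = mean_ci(mean=0.761, sd=0.090, n=6)
print(f"{lower:.3f} to {upper:.3f}")  # prints "0.689 to 0.833"
```

Under these assumptions the result matches the reported 95% CI. A t-based interval with five degrees of freedom would be wider, so the match suggests, but does not prove, that a normal approximation was used.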
Nada Ahmed El-Gammal, Rania Ahmed Abdel Azeem Abul Seoud and Sayed T. Muhammad. “An Explainable XGBoost-Based Framework for Robust Multi-Cohort Prediction of Pancreatic Cancer”. International Journal of Advanced Computer Science and Applications (IJACSA) 17.5 (2026). http://dx.doi.org/10.14569/IJACSA.2026.0170547
@article{El-Gammal2026,
title = {An Explainable XGBoost-Based Framework for Robust Multi-Cohort Prediction of Pancreatic Cancer},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2026.0170547},
url = {http://dx.doi.org/10.14569/IJACSA.2026.0170547},
year = {2026},
publisher = {The Science and Information Organization},
volume = {17},
number = {5},
author = {Nada Ahmed El-Gammal and Rania Ahmed Abdel Azeem Abul Seoud and Sayed T. Muhammad}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.