Abstract: This paper presents a novel hybrid machine learning (ML) reduction approach, based on two ML gene ranking techniques (T-test and Class Separability (CS)), to enhance the accuracy of cancer classification from microarray data. The proposed approach is integrated with two ML classifiers, K-nearest neighbor (KNN) and support vector machine (SVM), for mining microarray gene expression profiles. Four public cancer microarray databases are used to evaluate the proposed approach and carry out the mining process: Lymphoma, Leukemia, SRBCT, and Lung Cancer. To obtain more accurate and better-validated results, genes are selected only from the training samples, and the testing samples are excluded entirely from the classifier-building process. The computational experiments are described in detail and compared with related results from the literature. The results show that the proposed reduction approach achieves promising results in both the number of genes supplied to the classifiers and the classification accuracy.
Keywords: Mining Microarray data; Cancer classification; SVM