Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2013
Volume: 47
Issue: 3
Publisher: Journal of Theoretical and Applied
Abstract: Gene expression data comprise a very large number of genes but only a few samples that can be used to address supervised classification problems. This paper aims to identify a small set of genes that efficiently distinguishes various types of biological samples; to that end, we propose a three-stage gene selection algorithm for genomic data. The proposed approach combines ReliefF, mRMR (Minimum Redundancy Maximum Relevance) and GA (Genetic Algorithm), and is denoted R-m-GA. In the first stage, a candidate gene set is identified by applying ReliefF. In the second stage, mRMR minimizes redundancy, which facilitates the selection of an effective gene subset from the candidate set. In the third stage, a GA that uses a classifier as its fitness function is applied to choose the most discriminating genes. The proposed method is validated on tumor datasets such as CNS, DLBCL and Prostate cancer, using the IB1 classifier. A comparative analysis of R-m-GA against GA and ReliefF-GA shows that the proposed method is capable of finding the smallest gene subset that offers the highest classification accuracy.
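The three stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the abstract: ReliefF is reduced to a simplified two-class form, mRMR uses absolute Pearson correlation as a stand-in for mutual information, IB1 is approximated by a plain 1-nearest-neighbour rule scored with leave-one-out accuracy, and the GA fitness carries a small subset-size penalty. All function names, parameter values and the toy data are hypothetical.

```python
import random
from statistics import mean

def relief_scores(X, y, n_neighbors=3):
    """Simplified two-class ReliefF: a gene scores high if it differs on the
    nearest other-class samples (misses) and agrees on same-class ones (hits)."""
    n, d = len(X), len(X[0])
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]
    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))
    w = [0.0] * d
    for i in range(n):
        others = sorted((k for k in range(n) if k != i), key=lambda k: dist(X[i], X[k]))
        hits = [k for k in others if y[k] == y[i]][:n_neighbors]
        misses = [k for k in others if y[k] != y[i]][:n_neighbors]
        for j in range(d):
            w[j] += (mean(diff(X[i], X[k], j) for k in misses)
                     - mean(diff(X[i], X[k], j) for k in hits))
    return [v / n for v in w]

def abs_corr(a, b):
    """Absolute Pearson correlation (assumed stand-in for mutual information)."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return abs(cov) / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def mrmr(X, y, candidates, k):
    """Greedy mRMR: maximise relevance to y minus mean redundancy with picks so far."""
    cols = {j: [row[j] for row in X] for j in candidates}
    relevance = {j: abs_corr(cols[j], y) for j in candidates}
    selected = [max(candidates, key=relevance.get)]
    while len(selected) < min(k, len(candidates)):
        def score(j):
            return relevance[j] - mean(abs_corr(cols[j], cols[s]) for s in selected)
        rest = [j for j in candidates if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def loo_1nn_accuracy(X, y, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the given genes."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in genes))
        correct += y[nearest] == y[i]
    return correct / len(X)

def ga_select(X, y, genes, rng, pop_size=12, generations=15, p_mut=0.1):
    """Tiny GA over bit masks; the classifier accuracy is the fitness."""
    def fitness(mask):
        chosen = [g for g, bit in zip(genes, mask) if bit]
        return loo_1nn_accuracy(X, y, chosen) - 0.01 * len(chosen)  # size penalty: assumption
    pop = [[rng.random() < 0.5 for _ in genes] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(genes))  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit != (rng.random() < p_mut) for bit in child])  # bit-flip mutation
        pop = parents + children
    best = max(pop, key=fitness)
    return [g for g, bit in zip(genes, best) if bit]

def r_m_ga(X, y, n_candidates=10, n_mrmr=5, seed=0):
    """Stage 1: ReliefF ranking. Stage 2: mRMR pruning. Stage 3: GA search."""
    scores = relief_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    reduced = mrmr(X, y, ranked[:n_candidates], n_mrmr)
    return ga_select(X, y, reduced, random.Random(seed))

def make_toy_data(n=30, d=20, seed=1):
    """Hypothetical data: genes 0 and 1 carry the class signal, the rest are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(d)] for label in y]
    return X, y

X, y = make_toy_data()
selected = r_m_ga(X, y)
print("selected gene indices:", selected)
```

The staging reflects the cost argument implicit in the abstract: the inexpensive filter steps shrink the search space before the GA, whose fitness evaluation requires running a classifier for every candidate subset.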