Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2012
Volume: 40
Issue: 2
Pages: 113-119
Publisher: Journal of Theoretical and Applied
Abstract: Gene expression profiling by microarray techniques has recently come to play a vital role in the classification and diagnosis of cancer nodules. Researchers have proposed a number of machine learning and data mining approaches for identifying cancerous nodules from gene expression data, but these techniques have limitations that leave the particular needs of gene microarray analysis unmet. First, microarray data is characterized by a high-dimensional feature space that often exceeds the sample-space dimensionality by a factor of 100 or more. Second, microarray data contains a high degree of noise. Most conventional approaches do not adequately handle these problems of dimensionality and noise. Gene ranking techniques such as T-Score and ANOVA were later proposed to overcome them, but they sometimes rank genes incorrectly when a large database is used. To address these issues, this paper proposes an efficient feature selection technique: a wrapper approach called GA-SVM is used to select genes, and the selected features are then given as input to a Support Vector Machine (SVM) classifier. Experiments on a lymphoma data set show better classification accuracy than the standard SVM with the T-Score method.
Keywords: Feature Subset Selection; GA-SVM; Support Vector Machine
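
Note: The T-Score gene ranking that the abstract names as the baseline can be illustrated with a minimal sketch. The abstract does not state which T-Score formula the paper uses, so this assumes a Welch-style two-sample t-statistic computed per gene; the function names (`t_score`, `rank_genes`) and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def t_score(class_a, class_b):
    """Welch-style t-statistic for one gene across two sample classes (assumed form)."""
    var_a, var_b = variance(class_a), variance(class_b)
    denom = sqrt(var_a / len(class_a) + var_b / len(class_b))
    if denom == 0.0:
        return 0.0  # no spread in either class: treat the gene as uninformative
    return (mean(class_a) - mean(class_b)) / denom


def rank_genes(expression, labels):
    """Return gene indices sorted by decreasing |t-score|.

    expression: list of samples, each a list of per-gene expression values.
    labels: list of 0/1 class labels, one per sample.
    """
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, y in zip(expression, labels) if y == 0]
        b = [row[g] for row, y in zip(expression, labels) if y == 1]
        scores.append(abs(t_score(a, b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


# Hypothetical toy data: 6 samples x 3 genes (not from the paper).
toy_expression = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 2.9],
    [0.9, 4.9, 2.1],
    [3.0, 5.0, 2.8],
    [3.1, 5.2, 2.0],
    [2.9, 4.8, 2.5],
]
toy_labels = [0, 0, 0, 1, 1, 1]
ranking = rank_genes(toy_expression, toy_labels)
```

In this toy data, gene 0 separates the two classes cleanly and is ranked first. A filter ranking of this kind scores each gene independently, whereas a wrapper approach such as the GA-SVM described in the abstract evaluates candidate gene subsets by the classifier's own performance.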