Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2018
Volume: 96
Issue: 16
Publisher: Journal of Theoretical and Applied
Abstract: Analysis of early cancer prognosis is necessary to determine the proper treatment for each patient. Because DNA microarray data are high-dimensional, however, this analysis is a challenging task. Several studies on dimensionality reduction have sought to identify the significant genes that give the least error in cancer classification. One such approach applies a mining process, namely feature selection using parametric and non-parametric statistical tests. Besides feature selection, data integration is also regarded as an effective way to improve cancer classification performance. In this paper, a dataset containing gene expression values and clinical parameters observed from 60 breast cancer patients is used for the experiment. The experiment integrates the data using an early kernel-based data integration model with a modified dimensionality reduction step: where existing related research uses kernel dimensionality reduction, this paper replaces it with a mining process based on several parametric and non-parametric statistical tests. The last step of kernel-based data integration is classification using a Support Vector Machine (SVM). A ten-fold cross-validation scheme is used in the experiment. The SVM with a linear kernel gives the best accuracy of the kernels compared.
Keywords: Recurrent Cancer; Data Integration; Kernel Method; Kernel Dimensionality; Gene Expressions
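
The pipeline outlined in the abstract (statistical-test feature selection, then early integration of gene and clinical data, then a kernel for the SVM) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' implementation. The abstract does not name the tests used, so Welch's t-statistic stands in for "a parametric test". Early integration is taken here to mean concatenating the two data sources before computing a single kernel. The function names, the toy data and the choice of `k` are all hypothetical.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic (assumed stand-in for a parametric test)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2 ** 0.5 if se2 > 0 else 0.0


def select_genes(expr, labels, k):
    """Return the indices of the k genes with the largest |t| between classes."""
    scores = []
    for j in range(len(expr[0])):
        pos = [row[j] for row, y in zip(expr, labels) if y == 1]
        neg = [row[j] for row, y in zip(expr, labels) if y == 0]
        scores.append((abs(welch_t(pos, neg)), j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


def linear_kernel(X):
    """Gram matrix with K[i][j] = <x_i, x_j>."""
    return [[sum(p * q for p, q in zip(xi, xj)) for xj in X] for xi in X]


def early_integration_kernel(expr, clinical, labels, k):
    """Select genes, concatenate them with clinical features, build one kernel."""
    keep = select_genes(expr, labels, k)
    fused = [[row[j] for j in keep] + list(c) for row, c in zip(expr, clinical)]
    return keep, linear_kernel(fused)


# Hypothetical toy data: 6 patients, 3 genes, 1 clinical parameter.
expr = [
    [5.1, 0.2, 1.0],
    [4.9, 0.1, 1.1],
    [5.3, 0.3, 0.9],
    [1.0, 0.2, 1.0],
    [1.2, 0.1, 1.2],
    [0.8, 0.3, 0.8],
]
clinical = [[1.0], [0.0], [1.0], [0.0], [1.0], [0.0]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = recurrence, 0 = no recurrence (assumed coding)

keep, K = early_integration_kernel(expr, clinical, labels, k=1)
print("selected gene indices:", keep)
print("kernel matrix size:", len(K), "x", len(K[0]))
```

The resulting matrix `K` is what a kernel classifier would consume; for instance, scikit-learn's `SVC(kernel="precomputed")` accepts such a Gram matrix. In a ten-fold cross-validation like the one the abstract mentions, the gene selection step would normally be repeated on the training portion of each fold so that test labels do not influence which genes are kept.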