Basic article information

  • Title: Integrating the Principal Component Analysis with Partial Decision Tree in Microarray Gene Data
  • Author: Mohammad Subhi Al-Batah
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Publication year: 2019
  • Volume: 19
  • Issue: 3
  • Pages: 24-29
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: In microarray cancer datasets, gene analysis and classification is an important task because gene expression data are high-dimensional and contain redundant information, irrelevant features, and noise. The main contribution of this paper is therefore the selection of a concise subset of informative genes to improve processing speed and prediction performance. A two-phase hybrid approach is proposed that combines the Principal Component Analysis (PCA) algorithm with Partial Decision Tree (PART) rules. PCA is applied to identify a small set of the most discriminating genes, while the PART rules are used to classify microarray data into two or more classes. Eleven datasets that differ in their classes and genes are used: Breast Cancer, CNS, Colon, Leukemia, Leukemia_3C, Leukemia_4C, Lung, Lymphoma, MLL, Ovarian, and SRBCT. The data analysis is conducted using the full training method and cross-validation with 2 to 10 folds. Experimental analysis shows that gene selection using PCA reduced the computational complexity and yielded the smallest subset of genes prior to classification. In addition, the PART classifier, when combined with PCA, ran faster and showed a remarkable improvement in classification accuracy.
  • Keywords: Principal Component Analysis (PCA) algorithm; Partial Decision Tree (PART) rules; Microarray data; Classification; Gene selection; Data mining
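
The two-phase idea summarized in the abstract (reduce dimensionality with PCA, then classify with rules, and evaluate with 2-fold to 10-fold cross-validation) can be illustrated with a minimal, dependency-free sketch. This is not the paper's implementation. Everything here is an assumption made for illustration: the synthetic data from `make_toy_data` stands in for a microarray dataset, only the first principal component is kept (found by power iteration), and a single threshold rule from the hypothetical helper `fit_rule` stands in for the PART rule learner, which is not reimplemented.

```python
import math
import random


def make_toy_data(n_samples=60, n_genes=30, seed=0):
    """Synthetic two-class matrix (illustrative only, not real expression data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        # The first 5 "genes" carry a class shift; the rest are pure noise.
        X.append([rng.gauss(2.0 * label if g < 5 else 0.0, 1.0)
                  for g in range(n_genes)])
        y.append(label)
    return X, y


def fit_pca_1d(X, n_iter=30):
    """Return (column means, first principal axis) using power iteration."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - mean[j] for j in range(p)] for row in X]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in Xc]  # Xc v
        w = [sum(scores[i] * Xc[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]  # normalized Xc^T (Xc v)
    return mean, v


def project(X, mean, v):
    """Project rows onto the principal axis after centering with `mean`."""
    return [sum((x - m) * c for x, m, c in zip(row, mean, v)) for row in X]


def fit_rule(scores, labels):
    """Stand-in for a rule learner: one threshold halfway between class means."""
    m0 = [s for s, l in zip(scores, labels) if l == 0]
    m1 = [s for s, l in zip(scores, labels) if l == 1]
    mean0, mean1 = sum(m0) / len(m0), sum(m1) / len(m1)
    return (mean0 + mean1) / 2, (1 if mean1 > mean0 else 0)


def predict(scores, rule):
    threshold, high_label = rule
    return [high_label if s > threshold else 1 - high_label for s in scores]


def cross_val_accuracy(X, y, k, seed=0):
    """k-fold accuracy; PCA and the rule are fitted on training folds only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for test_idx in (idx[i::k] for i in range(k)):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        mean, v = fit_pca_1d(X_tr)
        rule = fit_rule(project(X_tr, mean, v), y_tr)
        preds = predict(project([X[i] for i in test_idx], mean, v), rule)
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / len(X)


X, y = make_toy_data()
accuracy_by_folds = {k: cross_val_accuracy(X, y, k) for k in range(2, 11)}
for k, acc in accuracy_by_folds.items():
    print(f"{k}-fold accuracy: {acc:.3f}")
```

Fitting the projection inside each training fold, rather than once on the full data, keeps the held-out samples from influencing the reduction step. Whether the paper handled this the same way is not stated in the abstract.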