Article information

  • Title: Weighted random subspace method for high dimensional data classification
  • Authors: Xiaoye Li; Hongyu Zhao
  • Journal: Statistics and Its Interface
  • Print ISSN: 1938-7989
  • Electronic ISSN: 1938-7997
  • Year: 2009
  • Volume: 2
  • Issue: 2
  • Pages: 153-159
  • DOI: 10.4310/SII.2009.v2.n2.a5
  • Publisher: International Press
  • Abstract: High dimensional data, especially those emerging from genomics and proteomics studies, pose significant challenges to traditional classification algorithms because the performance of these algorithms may substantially deteriorate due to high dimensionality and existence of many noisy features in these data. To address these problems, pre-classification feature selection and aggregating algorithms have been proposed. However, most feature selection procedures either fail to consider potential interactions among the features or tend to overfit the data. The aggregating algorithms, e.g. the bagging predictor, the boosting algorithm, the random subspace method, and the Random Forests algorithm, are promising in handling high dimensional data. However, there is a lack of attention to optimal weight assignments to individual classifiers and this has prevented these algorithms from achieving better classification accuracy. In this article, we formulate the weight assignment problem and propose a heuristic optimization solution. We have applied the proposed weight assignment procedures to the random subspace method to develop a weighted random subspace method. Several public gene expression and mass spectrometry data sets at the Kent Ridge biomedical data repository have been analyzed by this novel method. We have found that significant improvement over the common equal weight assignment scheme may be achieved by our method.
  • Keywords: classification; aggregating algorithm; voting weight; random subspace projection
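
The abstract contrasts equal-weight voting with weighted voting in the random subspace method but does not spell out the heuristic used to choose the weights. The sketch below only illustrates the general idea and is not the authors' procedure. It assumes binary labels, a nearest-centroid base classifier, and a stand-in weighting rule (clipped log-odds of held-out accuracy, with classifiers at or below chance getting weight 0). All function names and parameters (`fit_weighted_rsm`, `subspace_size`, `holdout`, and so on) are hypothetical.

```python
import math
import random


def fit_centroids(X, y, feats):
    """Nearest-centroid base classifier restricted to the feature subset `feats`."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return centroids


def predict_one(centroids, feats, x):
    """Return the label whose centroid is closest to x in the projected subspace."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(feats, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def fit_weighted_rsm(X, y, n_classifiers=25, subspace_size=5, holdout=0.3, seed=0):
    """Train base classifiers on random feature subsets and attach a voting weight to each."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_val = max(1, int(holdout * len(X)))
    val, train = idx[:n_val], idx[n_val:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]

    ensemble = []
    for _ in range(n_classifiers):
        # Random subspace projection: each classifier sees only a few features.
        feats = rng.sample(range(len(X[0])), subspace_size)
        model = fit_centroids(X_tr, y_tr, feats)
        acc = sum(predict_one(model, feats, X[i]) == y[i] for i in val) / n_val
        # Assumed weighting rule (not from the paper): clipped log-odds of
        # held-out accuracy; classifiers at or below 50% accuracy get weight 0.
        acc = min(max(acc, 0.01), 0.99)
        weight = max(0.0, math.log(acc / (1.0 - acc)))
        ensemble.append((feats, model, weight))
    return ensemble


def predict(ensemble, x, weighted=True):
    """Aggregate base-classifier votes; weighted=False gives the equal-weight scheme."""
    votes = {}
    for feats, model, weight in ensemble:
        label = predict_one(model, feats, x)
        votes[label] = votes.get(label, 0.0) + (weight if weighted else 1.0)
    return max(votes, key=votes.get)
```

Calling `predict(ensemble, x, weighted=False)` on the same ensemble reproduces the equal-weight baseline, so the two schemes can be compared directly on held-out data. Estimating weights from a single held-out split is a simplification chosen for brevity. With very few samples, as is common in gene expression data, cross-validated or out-of-bag estimates would likely be more stable.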