Article Information

  • Title: Booster in High Dimensional Data Classification
  • Authors: Shruti Hiremath; Sheba Pari N; Dr. S Mohan Kumar
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Online ISSN: 2320-9801
  • Year: 2017
  • Volume: 5
  • Issue: 3
  • Page: 5984
  • DOI: 10.15680/IJIRCCE.2017.0503349
  • Publisher: S&S Publications
  • Abstract: Data mining is a technique used in various domains to give meaning to the available data. In classification tree modeling, the data is classified to make predictions about new data. Using old data to predict new data carries the danger of fitting the old data too closely. That problem can be addressed by pruning methods, which generalize the modeled tree. This paper describes the use of classification trees and shows two methods of pruning them. An experiment has been set up using different kinds of classification tree algorithms with different pruning methods to test the performance of the algorithms and pruning methods. This paper also analyzes data set properties to find relations between them and the classification algorithms and pruning methods. Classification problems in high-dimensional data with a small number of observations are becoming more common, especially in microarray data. During the last two decades, many efficient classification models and feature selection (FS) algorithms have been proposed to achieve higher prediction accuracy. However, the result of an FS algorithm based on prediction accuracy will be unstable over variations in the training set, especially in high-dimensional data. This paper proposes a new evaluation measure, the Q-statistic, that incorporates the stability of the selected feature subset in addition to the prediction accuracy.
  • Keywords: Q-statistic; Data Mining; Feature Selection (FS).