
Article Information

  • Title: Influence of Data Geometry in Random Subset Feature Selection
  • Authors: D. Lakshmi Padmaja; B. Vishnuvardhan
  • Journal: International Journal of Data Mining & Knowledge Management Process
  • Print ISSN: 2231-007X
  • Online ISSN: 2230-9608
  • Year: 2017
  • Volume: 7
  • Issue: 4
  • Pages: 33
  • DOI: 10.5121/ijdkp.2017.7403
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: The geometry of data, also known as its probability distribution, is an important consideration for the accurate computation of data mining tasks such as pre-processing, classification, and interpretation. Data geometry influences the outcome and accuracy of statistical analysis to a large extent. This paper focuses on understanding the influence of data geometry on the feature subset selection process using the random forest algorithm. In practice, data are assumed to follow a normal distribution, an assumption that often does not hold. The degree of dimensionality reduction varies with the distribution of the data. A comparison is made using three standard distributions: triangular, uniform, and normal. The results are discussed in the paper.
  • Keywords: Data Geometry; Gaussian Distribution; Uniform Distribution; Triangular Distribution; Dimensionality Reduction; Random Forest; Random Subset Feature Selection