Article Information

  • Title: Decision Trees for Uncertain Data
  • Authors: Rohit Sutar; Ashish Malunjkar; Amit Kadam
  • Journal: International Journal of Engineering and Computer Science
  • Print ISSN: 2319-7242
  • Year: 2015
  • Volume: 4
  • Issue: 2
  • Pages: 10321-10324
  • Publisher: IJECS
  • Abstract: Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include measurement/quantization errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by a single value but by multiple values forming a probability distribution. Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we find that the accuracy of a decision tree classifier can be much improved if the "complete information" of a data item (taking into account its probability density function (pdf)) is utilized. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments show that the resulting classifiers are more accurate than those using value averages. Since processing pdfs is computationally more costly than processing single values (e.g., averages), decision tree construction on uncertain data is more CPU-demanding than construction on certain data. To tackle this problem, we propose a series of pruning techniques that can greatly improve construction efficiency.
  • Keywords: Uncertain Data; Decision Tree; Classification; Data Mining
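
The contrast the abstract draws between collapsing an uncertain value to its average and using its full pdf can be illustrated with a small sketch. It is not taken from the paper: it assumes each pdf is approximated by discrete (value, probability) sample points, that a tuple is divided fractionally between the two branches of a split, and that entropy is the impurity measure. None of these details are stated in the abstract, and the names `entropy`, `split_entropy_pdf`, and `split_entropy_mean` are hypothetical.

```python
import math
from collections import defaultdict

# An uncertain tuple is (samples, label), where samples is a list of
# (value, probability) pairs whose probabilities sum to 1.
# This discrete representation is an assumption made for illustration.

def entropy(counts):
    """Shannon entropy (in bits) of a dict of possibly fractional class counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)

def split_entropy_pdf(tuples, z):
    """Weighted entropy of the split 'value <= z' using each tuple's full pdf.

    A tuple contributes its probability mass at or below z to the left
    branch and the remaining mass to the right branch.
    """
    left, right = defaultdict(float), defaultdict(float)
    for samples, label in tuples:
        p_left = sum(p for v, p in samples if v <= z)
        left[label] += p_left
        right[label] += 1.0 - p_left
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)

def split_entropy_mean(tuples, z):
    """Same split, but each pdf is first collapsed to its mean value."""
    collapsed = [([(sum(v * p for v, p in samples), 1.0)], label)
                 for samples, label in tuples]
    return split_entropy_pdf(collapsed, z)

# Toy data: one uncertain tuple of class "a" and one certain tuple of class "b".
data = [
    ([(1.0, 0.5), (3.0, 0.5)], "a"),
    ([(3.0, 1.0)], "b"),
]
print(split_entropy_mean(data, 2.0))
print(split_entropy_pdf(data, 2.0))
```

On this toy data the averaged version sees a perfectly clean split at 2.0, while the pdf-based version reports nonzero entropy because half of the first tuple's probability mass lies on the same side as the second tuple. The two representations can therefore rank candidate split points differently. The sketch also shows why the pdf-based approach costs more: every candidate split point requires a pass over every sample point of every tuple.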