Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Year: 2012
Volume: 3
Issue: 3
Pages: 4480-4485
Publisher: TechScience Publications
Abstract: Classification is one of the most efficient and widely used data mining techniques. Among classifiers, decision trees can handle high-dimensional data, and their representation is intuitive and generally easy for humans to understand. Conventional decision trees assume that data values are certain. We extend decision tree classifiers to handle uncertain information. Value uncertainty arises in many applications during data collection; example sources include data staleness and repeated measurements. Under uncertainty, a data item is often represented not by a single value but by multiple values forming a probability distribution function (pdf). Rather than abstracting uncertain data by summary statistics (such as the mean and median), we extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments show that the resulting classifiers are more accurate than those built on value averages.
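
The contrast the abstract draws between averaging and using the full distribution can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes each uncertain value is a discrete pdf given as (value, probability) pairs, and it assumes weighted entropy as the split criterion. The names `left_mass`, `split_entropy`, and the sample `tuples` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def left_mass(pdf, z):
    """Probability that the uncertain value is <= z.
    pdf is a list of (value, probability) pairs (assumed discrete)."""
    return sum(p for v, p in pdf if v <= z)

def mean(pdf):
    """Expected value of a discrete pdf."""
    return sum(v * p for v, p in pdf)

def entropy(weights):
    """Entropy (in bits) of a dict mapping class label -> fractional count."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)

def split_entropy(tuples, z, use_pdf=True):
    """Weighted entropy after splitting at z.
    tuples is a list of (pdf, label). With use_pdf=True each tuple is
    divided fractionally between the branches according to its pdf;
    otherwise it is sent whole to the branch chosen by its mean."""
    left, right = defaultdict(float), defaultdict(float)
    for pdf, label in tuples:
        if use_pdf:
            w = left_mass(pdf, z)
        else:
            w = 1.0 if mean(pdf) <= z else 0.0
        left[label] += w
        right[label] += 1.0 - w
    n = len(tuples)
    n_left, n_right = sum(left.values()), sum(right.values())
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)

# Hypothetical data: both tuples have mean 2.0 but different spreads.
tuples = [
    ([(0.0, 0.5), (4.0, 0.5)], "A"),
    ([(2.0, 1.0)], "B"),
]

averaged = split_entropy(tuples, 2.0, use_pdf=False)
distribution_based = split_entropy(tuples, 2.0, use_pdf=True)
print(averaged, distribution_based)
```

In this toy case the two tuples are indistinguishable once reduced to their means, so the averaged split leaves the classes fully mixed, whereas the distribution-based split sends half of the "A" tuple's probability mass to the right branch and yields a lower weighted entropy (about 0.69 bits versus 1.0).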