Journal: International Journal of Computer Science and Network
Print ISSN: 2277-5420
Publication year: 2012
Volume: 1
Issue: 6
Publisher: IJCSN publisher
Abstract: Data whose values are known precisely are called certain data, whereas uncertain data are data whose values are not precise, meaning that a data item is represented by multiple values rather than a single one. Traditional data mining algorithms, especially classifiers, work on certain data and cannot handle uncertain data. This paper extends traditional decision tree classifiers to handle such data. Summarizing the uncertain values by a simple mean or median does not give accurate results. For this reason, the paper uses the Probability Distribution Function (PDF) of the uncertain values to improve the accuracy of the decision tree classifier. It also proposes pruning techniques to improve the performance of the classifier. Empirical results show that the proposed algorithm is more accurate than algorithms that use averages of the uncertain values. However, it is computationally more expensive because it has to compute PDFs. The pruning techniques help reduce this computational cost.
Keywords: Data mining; uncertain data; decision tree; classifiers; pruning
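
The contrast the abstract draws between averaging an uncertain value and using its full distribution can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes each uncertain value is given as a sampled PDF, that is, a list of (value, probability) pairs summing to 1, and all function names (`mean_of`, `mass_left`, `entropy`, `split_entropy`) are hypothetical. The averaging variant sends each tuple wholly to one side of a split, while the distribution-based variant divides the tuple's weight according to the probability mass on each side.

```python
import math

def mean_of(pdf):
    """Expected value of a sampled PDF given as (value, probability) pairs."""
    return sum(v * p for v, p in pdf)

def mass_left(pdf, threshold):
    """Probability mass at or below the split threshold."""
    return sum(p for v, p in pdf if v <= threshold)

def entropy(weights):
    """Shannon entropy in bits of a dict mapping class label -> weight."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)

def split_entropy(data, threshold, use_pdf):
    """Weighted entropy of the two branches after splitting at threshold.

    data is a list of (pdf, label) pairs. With use_pdf=False each tuple is
    reduced to its mean (the averaging approach); with use_pdf=True its
    weight is divided between the branches (assumed fractional-tuple scheme).
    """
    left, right = {}, {}
    for pdf, label in data:
        if use_pdf:
            p_left = mass_left(pdf, threshold)
        else:
            p_left = 1.0 if mean_of(pdf) <= threshold else 0.0
        left[label] = left.get(label, 0.0) + p_left
        right[label] = right.get(label, 0.0) + (1.0 - p_left)
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)

# Toy data (invented for illustration): one attribute, two classes.
toy = [
    ([(1.0, 0.5), (5.0, 0.5)], "A"),
    ([(4.0, 0.5), (6.0, 0.5)], "B"),
]
by_mean = split_entropy(toy, 3.5, use_pdf=False)
by_pdf = split_entropy(toy, 3.5, use_pdf=True)
print(by_mean, by_pdf)
```

On this toy data the averaging variant reports a perfectly clean split at 3.5, whereas the distribution-based variant reports residual impurity, because half of the first tuple's probability mass lies on the far side of the threshold. The extra pass over every sample point for every candidate threshold also hints at why the abstract describes the PDF approach as more expensive.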