Abstract: Decision trees are among the most effective and widely used models for classification and ranking, and they have received a great deal of attention from researchers in data mining and machine learning. A critical problem in decision tree learning is how to estimate class-membership probabilities from a decision tree. In this paper, we first survey the main class probability estimation methods, including the maximum-likelihood estimate, the Laplace estimate, the m-estimate, the similarity-weighted estimate, and the naive Bayes-based estimate. We then present an empirical study of the classification and ranking performance of the decision trees that result from these different class probability estimation methods. Experimental results on a large number of UCI data sets support our conclusions.
Keywords: decision tree learning; probability estimation tree; class probability estimation; classification; ranking.
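
As a rough illustration of the first three estimates named in the abstract, the following minimal sketch computes class probabilities at a single leaf from the class labels of the training instances that reach it. It uses the standard textbook forms: n_c / n for maximum likelihood, (n_c + 1) / (n + k) for Laplace, and (n_c + m * p_c) / (n + m) for the m-estimate. The function name `leaf_class_probabilities`, its parameters, the uniform default priors, and the uniform fallback for an empty leaf are assumptions made for this illustration, not definitions taken from the paper.

```python
from collections import Counter


def leaf_class_probabilities(labels, classes, method="mle", m=1.0, priors=None):
    """Estimate class-membership probabilities at one leaf (illustrative sketch).

    labels  : class labels of the training instances that reach the leaf
    classes : all class values in the problem
    method  : "mle", "laplace", or "m"
    m       : equivalent sample size for the m-estimate (assumed default 1.0)
    priors  : prior class probabilities for the m-estimate
              (assumed uniform when not given)
    """
    n = len(labels)
    k = len(classes)
    counts = Counter(labels)
    if priors is None:
        priors = {c: 1.0 / k for c in classes}

    probs = {}
    for c in classes:
        n_c = counts.get(c, 0)
        if method == "mle":
            # Relative frequency; fall back to uniform for an empty leaf (assumption).
            probs[c] = n_c / n if n > 0 else 1.0 / k
        elif method == "laplace":
            # Add one pseudo-count per class.
            probs[c] = (n_c + 1) / (n + k)
        elif method == "m":
            # Shrink the relative frequency toward the prior p_c with weight m.
            probs[c] = (n_c + m * priors[c]) / (n + m)
        else:
            raise ValueError(f"unknown method: {method}")
    return probs


if __name__ == "__main__":
    leaf = ["pos", "pos", "pos", "neg"]
    for name in ("mle", "laplace", "m"):
        print(name, leaf_class_probabilities(leaf, ["pos", "neg"], method=name, m=2.0))
```

For a pure leaf, the maximum-likelihood estimate assigns probability 1 to the observed class regardless of how few instances the leaf holds, whereas the two smoothed estimates pull the probability toward the prior. This difference is one reason the choice of estimate can matter for ranking.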