Abstract: Data mining is an important part of information management technology. Simply put, it is a method for extracting and analyzing meaningful patterns and correlations in a large relational database. In data mining, decision trees are among the most widely used tools for decision support. In the emerging area of data mining applications, users of data mining tools face data sets that comprise large numbers of features and instances. Such data sets are not easy to mine, because the accuracy of a decision tree classification model generally depends on several parameters, including the data set used and the configuration of the tree itself. This work presents a novel hybrid classifier system that improves the accuracy of decision trees using clustering techniques. The system consists of a clustering algorithm, a decision tree, and an optional module for identifying appropriate parameters for the clustering algorithm. Working together, these three modules are capable of increasing the accuracy of the solutions. The results have been validated on several well-known data sets using two decision tree algorithms. The accuracy percentages are compared to show the improvement obtained by our proposal. Finally, two clustering algorithms have been used to compare the accuracy of the different proposals.
Keywords: accuracy improvement; clustering; decision tree