Journal: International Journal of Research in Computer Engineering & Electronics
Print ISSN: 2319-376x
Publication year: 2015
Volume: 4
Issue: 2
Language: English
Publisher: BHOPAL INSTITUTE OF PROFESSIONAL STUDIES
Abstract: Data mining is an essential tool in current technology. It is useful for applications such as business intelligence, cloud computing, and other research and science-based projects. These projects need accurate data analysis and problem-solving techniques in order to prevent faults and misuse of data. This work studies decision-tree-based data mining, specifically the C5.0 algorithm. C5.0 is an extension of the traditional ID3 algorithm. It represents data effectively, but its classification accuracy is limited. This work therefore introduces an improvement to the traditional C5.0 algorithm based on probability theory: a Bayesian classification algorithm is combined with C5.0. To combine the two classification techniques, the training samples are first analysed with C5.0 and a decision tree is built. This decision tree is then converted into IF-THEN-ELSE decision rules. A Bayesian classifier is trained on the rules extracted from C5.0, and the trained classifier is used to reduce the search time, or decision time, of the algorithm. The proposed hybrid classification technique is implemented in Java, and its performance is evaluated in terms of accuracy, error rate, and time and space complexity. According to the obtained results, the proposed data model gives more efficient results than the traditional data model but falls short in training time. In the near future, the proposed technique could be enhanced by reducing the training time of the algorithm.
Index Terms: Data Mining, Decision Trees, Classification, C5.0, Bayesian Classifier, Rules, Performance improvement.
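The pipeline described in the abstract (tree, then rules, then a Bayesian classifier trained on those rules) can be illustrated with a minimal sketch. This is not the authors' Java implementation. Python is used for brevity, the small `TREE` is a hand-written stand-in for C5.0 output, and the names `extract_rules` and `RuleNaiveBayes` are hypothetical. It also assumes that each extracted rule is treated as one training record for a naive Bayes model with Laplace smoothing, which the abstract does not specify.

```python
from collections import Counter, defaultdict

# Hypothetical decision tree standing in for the output of C5.0.
# Internal node: (attribute, {value: subtree}); leaf: a class label string.
TREE = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rainy": ("windy", {"true": "no", "false": "yes"}),
})


def extract_rules(node, conditions=()):
    """Walk every root-to-leaf path and return (conditions, label) rules."""
    if isinstance(node, str):  # leaf: emit one IF-THEN rule
        return [(dict(conditions), node)]
    attribute, branches = node
    rules = []
    for value, child in branches.items():
        rules.extend(extract_rules(child, conditions + ((attribute, value),)))
    return rules


class RuleNaiveBayes:
    """Naive Bayes trained on rules (assumption: one rule = one record)."""

    def fit(self, rules):
        self.class_counts = Counter(label for _, label in rules)
        self.value_counts = defaultdict(Counter)  # (label, attr) -> value counts
        self.values = defaultdict(set)            # attr -> values seen in rules
        for conditions, label in rules:
            for attr, value in conditions.items():
                self.value_counts[(label, attr)][value] += 1
                self.values[attr].add(value)
        return self

    def predict(self, record):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -1.0
        for label, count in self.class_counts.items():
            score = count / total  # prior P(class)
            for attr, value in record.items():
                if attr not in self.values:
                    continue  # attribute never used by any rule
                counts = self.value_counts[(label, attr)]
                # Laplace-smoothed estimate of P(value | class)
                score *= (counts[value] + 1) / (
                    sum(counts.values()) + len(self.values[attr])
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


rules = extract_rules(TREE)
model = RuleNaiveBayes().fit(rules)
print(model.predict({"outlook": "overcast"}))  # prints "yes"
```

In this sketch, prediction is a single pass over the stored counts rather than a tree traversal. Because naive Bayes treats attributes as independent, its answer on a full record can differ from the path the tree would follow.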