
Article Information

  • Title: Enhancement of C5.0 Classifier by Using Bayesian Probability Theory
  • Authors: Sonam Mehta; Deepak Shukla
  • Journal: International Journal of Research in Computer Engineering & Electronics
  • Print ISSN: 2319-376X
  • Year of publication: 2015
  • Volume: 4
  • Issue: 2
  • Language: English
  • Publisher: BHOPAL INSTITUTE OF PROFESSIONAL STUDIES
  • Abstract: Data mining is an essential tool in current technology, with applications in business intelligence, cloud computing, and other science and research projects. These projects need accurate data analysis and problem-solving techniques to prevent faults and misuse of data. This work studies decision-tree-based data mining, specifically the C5.0 algorithm, an extension of the traditional ID3 algorithm. The algorithm represents data effectively but is not sufficiently accurate for classification, so this work introduces an improvement to the traditional C5.0 algorithm based on probability theory: a Bayesian classification algorithm is combined with C5.0. To combine the two techniques, the training samples are first analysed with C5.0 to build a decision tree. The tree is then converted into IF-THEN-ELSE decision rules. The Bayesian classifier is trained on the rules extracted from C5.0, and the trained classifier is used to reduce the search time (decision time) of the algorithm. The proposed hybrid technique is implemented in Java, and its performance is evaluated in terms of accuracy, error rate, and time and space complexity. According to the results, the proposed data model gives more efficient results than the traditional data model but falls short in training time. Future work could enhance the technique by reducing its training time.
  • Index terms: Data Mining, Decision Trees, Classification, C5.0, Bayesian Classifier, Rules, Performance Improvement.
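
The pipeline described in the abstract (decision tree, then IF-THEN-ELSE rules, then a Bayesian classifier trained on those rules) can be illustrated with a minimal Java sketch. This is not the authors' implementation: the abstract does not say how the rules are fed to the Bayesian classifier, so the sketch assumes each rule is treated as one training instance for a naive Bayes model with Laplace smoothing. The class name `RuleBayesSketch`, the `Rule` type, and the weather-style rules in `demoRules()` are all hypothetical stand-ins for rules that a C5.0 tree might produce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RuleBayesSketch {

    /** One IF-THEN rule: attribute=value conditions plus the predicted class. */
    static final class Rule {
        final Map<String, String> conditions;
        final String label;
        Rule(Map<String, String> conditions, String label) {
            this.conditions = conditions;
            this.label = label;
        }
    }

    private final Map<String, Integer> classCounts = new HashMap<>();
    // label|attribute=value -> number of rules of that class with this condition
    private final Map<String, Integer> condCounts = new HashMap<>();
    // label|attribute -> number of rules of that class that test the attribute
    private final Map<String, Integer> attrCounts = new HashMap<>();
    // attribute -> distinct values seen across all rules (used for smoothing)
    private final Map<String, Set<String>> valuesSeen = new HashMap<>();
    private int total = 0;

    /** Assumption: every extracted rule counts as one training instance. */
    void train(List<Rule> rules) {
        for (Rule r : rules) {
            total++;
            classCounts.merge(r.label, 1, Integer::sum);
            for (Map.Entry<String, String> c : r.conditions.entrySet()) {
                String attr = c.getKey();
                condCounts.merge(r.label + "|" + attr + "=" + c.getValue(), 1, Integer::sum);
                attrCounts.merge(r.label + "|" + attr, 1, Integer::sum);
                valuesSeen.computeIfAbsent(attr, k -> new HashSet<>()).add(c.getValue());
            }
        }
    }

    /** Returns the class with the highest log-posterior under naive Bayes. */
    String classify(Map<String, String> instance) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> e : classCounts.entrySet()) {
            String label = e.getKey();
            double score = Math.log((double) e.getValue() / total); // class prior
            for (Map.Entry<String, String> a : instance.entrySet()) {
                Set<String> vals = valuesSeen.get(a.getKey());
                if (vals == null) {
                    continue; // attribute never appears in any rule, so ignore it
                }
                int num = condCounts.getOrDefault(
                        label + "|" + a.getKey() + "=" + a.getValue(), 0);
                int den = attrCounts.getOrDefault(label + "|" + a.getKey(), 0);
                // Laplace (add-one) smoothing avoids log(0) for unseen conditions
                score += Math.log((num + 1.0) / (den + vals.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    /** Hypothetical rules, standing in for the output of a C5.0 tree. */
    static List<Rule> demoRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule(Map.of("outlook", "sunny", "humidity", "high"), "no"));
        rules.add(new Rule(Map.of("outlook", "sunny", "humidity", "normal"), "yes"));
        rules.add(new Rule(Map.of("outlook", "overcast"), "yes"));
        rules.add(new Rule(Map.of("outlook", "rain", "wind", "strong"), "no"));
        rules.add(new Rule(Map.of("outlook", "rain", "wind", "weak"), "yes"));
        return rules;
    }

    public static void main(String[] args) {
        RuleBayesSketch nb = new RuleBayesSketch();
        nb.train(demoRules());
        System.out.println(nb.classify(
                Map.of("outlook", "overcast", "humidity", "high", "wind", "weak")));
    }
}
```

Classification here is a single pass over the instance's attributes per class rather than a walk down the tree, which is one plausible reading of the decision-time improvement the abstract reports. Because naive Bayes assumes the attributes are independent, its prediction can differ from the original rule set on some inputs, so any accuracy comparison would need to be measured rather than assumed.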