Journal: International Journal of Computer Science Issues
Print ISSN: 1694-0784
Online ISSN: 1694-0814
Year of publication: 2017
Volume: 14
Issue: 1
Publisher: IJCSI Press
Abstract: Classification of Big Data is an important research area today. Many traditional classification methods exist, but extending them to Big Data is challenging. The Decision Tree Classifier is one of the effective traditional classification techniques. The combination of Hadoop and MapReduce has been adopted by many researchers, both commercially and academically, to process Big Data. More recently, the Google Cloud Dataflow paradigm has entered the Big Data landscape, augmenting the earlier systems with stream processing. This paper presents two algorithms for implementing decision trees for classification, one based on MapReduce and one based on Google Cloud Dataflow. The performance of both algorithms is compared across various parameters.
Keywords: Decision Tree; Hadoop Distributed File System; Map Reduce Classifier; Pipeline Tree Classifier; Google Dataflow