Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2022
Volume: 13
Issue: 3
DOI: 10.14569/IJACSA.2022.0130330
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Taxi services are growing rapidly as a reliable mode of transport, and both demand and competition among service providers are high. A billion trip records need to be analyzed to sharpen competition, understand service users, and improve the business. Although decision tree classification is a common algorithm that generates easy-to-understand rules, there is no existing implementation of classification on a taxi dataset. This research applies a decision tree classification model to a taxi dataset to classify instances correctly, build a decision tree, and calculate accuracy. The experiment combines the decision tree algorithm with the Spark framework to demonstrate good performance and high accuracy when predicting payment type. Applying the decision tree algorithm with different aspects to the NYC taxi dataset results in high accuracy.
Keywords: Big data analytics; Apache Spark; decision tree classification; taxi trips; machine learning
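
Illustration: the abstract describes decision tree classification as producing rules that are easy to read, applied to predicting payment type. The sketch below is not the paper's Spark pipeline; it is a minimal pure-Python stand-in that finds a single Gini-based split (a one-level tree) and reports training accuracy. The field name `fare`, the labels `cash` and `card`, and the six toy trips are all made-up assumptions for illustration, not values from the NYC taxi dataset or the paper.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature):
    """Return (threshold, weighted_gini) for the best split on one numeric feature."""
    best = (None, float("inf"))
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= t]
        right = [y for row, y in zip(rows, labels) if row[feature] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy trips (invented values, for illustration only).
rows = [{"fare": 5.0}, {"fare": 6.5}, {"fare": 8.0},
        {"fare": 22.0}, {"fare": 30.0}, {"fare": 41.0}]
labels = ["cash", "cash", "cash", "card", "card", "card"]

threshold, impurity = best_split(rows, labels, "fare")
left_label = majority([y for r, y in zip(rows, labels) if r["fare"] <= threshold])
right_label = majority([y for r, y in zip(rows, labels) if r["fare"] > threshold])


def predict(row):
    """Apply the learned one-level rule to a single trip."""
    return left_label if row["fare"] <= threshold else right_label


# Accuracy on the same toy rows used for fitting (not a held-out estimate).
accuracy = sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)
print(f"if fare <= {threshold}: {left_label} else: {right_label} "
      f"(training accuracy {accuracy:.2f})")
```

A full decision tree repeats this split search recursively on each side and over every feature; a distributed framework such as Spark would additionally parallelize the search across partitions of the trip records.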