Publisher: Academy of Economic Studies - Bucharest, Romania
Abstract: This paper explores how two main classical classification models work and generate predictions within a commercial relational database management system (Microsoft SQL Server 2012). The aim of the paper is to accurately predict churn among a set of customers described by various discrete and continuous variables, derived from three main data sources: the commercial transaction history; the users’ behavior, or events occurring on their computers; and the identity information provided by the customers themselves. On the theoretical side, the paper presents the main concepts and ideas underlying the Decision Tree and Naïve Bayes classifiers and illustrates some of them with calculations worked by hand on the data being modeled by the software. On the analytical and practical side, the paper analyzes the graphs and tables generated by the classification models and reveals the main insights from the data. Finally, the classifiers’ accuracy is evaluated using the test data method, and the more accurate one is chosen to generate predictions on customer data for which the values of the response variable are not known.
Keywords: Data Mining; Predictive Analytics; Classification; Decision Tree; Naïve Bayes; Churn Analysis; Microsoft SQL Server