Article Information

Title: Implementing an Efficient Task to Build Data Sets for Datamining Analysis
Authors: B. K. Manasa; H. Venkateswara Reddy
Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Year of publication: 2014
Volume: 5
Issue: 5
Pages: 6198-6201
Publisher: TechScience Publications
Abstract: Data mining is the process of discovering actionable information from large sets of data. Preparing a data set for analysis is generally the most time-consuming task in a data mining project, requiring many complex SQL queries that join tables and aggregate columns. Existing SQL aggregations are of limited use for preparing data sets because they return one column per aggregated group. This paper presents a methodology for horizontal aggregations: a method that generates SQL code to return aggregated columns in a horizontal tabular layout, returning a set of numbers instead of one number per row. There are three fundamental methods for evaluating horizontal aggregations. CASE: exploiting the programming CASE construct; SPJ: based on standard relational algebra operators (SPJ queries); PIVOT: using the PIVOT operator, which is offered by some DBMSs. The proposed methodology shows that evaluating horizontal aggregations is a challenging and interesting problem, and it introduces alternative methods and optimizations for their efficient evaluation.
Keywords: aggregation; data preparation; pivoting; SQL
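
Illustration: the CASE variation named in the abstract can be sketched as follows. This is not the authors' code, only a minimal example of generating SQL that turns each distinct value of one column into its own aggregated output column. It assumes SQLite via Python's standard sqlite3 module; the function name horizontal_sum_case, the sales table, and its columns are hypothetical and chosen for illustration only.

```python
import sqlite3


def horizontal_sum_case(conn, table, group_col, pivot_col, value_col):
    """Generate and run a CASE-based horizontal SUM (illustrative sketch).

    Identifiers are interpolated into the SQL text, so they are assumed
    to come from trusted code, not from user input.
    Returns (column_names, rows).
    """
    cur = conn.cursor()
    # Each distinct pivot value becomes one output column.
    values = [row[0] for row in cur.execute(
        f'SELECT DISTINCT "{pivot_col}" FROM "{table}" ORDER BY 1')]
    # One SUM(CASE ...) term per pivot value; the value itself is bound
    # as a parameter, only the column alias is built from it.
    terms = ", ".join(
        f'SUM(CASE WHEN "{pivot_col}" = ? THEN "{value_col}" '
        f'ELSE NULL END) AS "{value_col}_{v}"'
        for v in values)
    sql = (f'SELECT "{group_col}" AS "{group_col}", {terms} '
           f'FROM "{table}" GROUP BY "{group_col}" ORDER BY "{group_col}"')
    cur.execute(sql, values)
    names = [d[0] for d in cur.description]
    return names, cur.fetchall()


# Hypothetical sample data: one row per (store, quarter) sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0),
    ("B", "Q1", 3.0),
])

names, rows = horizontal_sum_case(conn, "sales", "store", "quarter", "amount")
print(names)
for row in rows:
    print(row)
```

The result has one row per store and one column per quarter, which is the horizontal layout the abstract describes. A group with no rows for a given pivot value yields NULL in that column, because the CASE expression produces no non-NULL inputs for SUM.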