Article Information

Title: Datasets Preparation Using SQL Aggregation
Authors: Ch. Kalyani; V. Durga Prasad; P. Suresh Babu; et al.
Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Publication year: 2012
Volume: 3
Issue: 2
Pages: 1094-1097
Language: English
Publisher: Ayushmaan Technologies
Abstract: To analyze data efficiently, data mining systems widely use data sets with columns in a horizontal tabular layout. Preparing a data set is one of the more complex tasks in a data mining project, requiring many SQL queries, table joins, and column aggregations. Conventional RDBMSs usually manage tables in vertical form. Aggregated columns in a horizontal tabular layout return a set of numbers instead of one number per row. The system uses one parent table and several child tables; operations are then performed on the data loaded from these tables. In a relational database, especially one with normalized tables, significant effort is required to prepare a summary data set [16] that can be used as input for a data mining or statistical algorithm. Association rule mining searches for interesting relationships among items in a given data set. Preparing a data set for analysis is generally the most time-consuming task in a data mining project, requiring many complex SQL queries, joining tables and aggregating columns. Basically, a horizontal aggregation returns a set of numbers instead of a single number for each group, resembling a multi-dimensional vector. We propose an abstract, but minimal, extension to SQL standard aggregate functions to compute horizontal aggregations. We propose two fundamental methods to evaluate horizontal aggregations: CASE, which exploits the programming CASE construct; and SPJ, which is based on standard relational algebra operators (SPJ queries); which is offered by some DBMSs.
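
Illustrative note: the two evaluation methods named in the abstract can be sketched with ordinary SQL. The example below is a minimal sketch, not the authors' implementation. It assumes a hypothetical vertical table `sales(store, quarter, amount)` and uses Python's built-in `sqlite3` module only as a convenient engine. The table, column, and function names are all invented for illustration. Each query returns one row per store with one aggregated column per quarter, which is the horizontal layout the abstract describes.

```python
import sqlite3

# Hypothetical vertical (normalized) input: one row per individual sale.
DEMO_ROWS = [
    ("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0),
    ("B", "Q1", 3.0), ("B", "Q2", 4.0), ("B", "Q2", 6.0),
]

# CASE method: a single scan; each output column sums only the rows
# whose quarter matches, and every other row contributes NULL.
CASE_QUERY = """
SELECT store,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE NULL END) AS q1_total,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE NULL END) AS q2_total
FROM sales
GROUP BY store
ORDER BY store
"""

# SPJ method: one select-project aggregate per output column, joined
# back to the list of distinct groups with left outer joins.
SPJ_QUERY = """
SELECT s.store, q1.total AS q1_total, q2.total AS q2_total
FROM (SELECT DISTINCT store FROM sales) AS s
LEFT JOIN (SELECT store, SUM(amount) AS total
           FROM sales WHERE quarter = 'Q1' GROUP BY store) AS q1
       ON q1.store = s.store
LEFT JOIN (SELECT store, SUM(amount) AS total
           FROM sales WHERE quarter = 'Q2' GROUP BY store) AS q2
       ON q2.store = s.store
ORDER BY s.store
"""


def build_demo_db():
    """Create an in-memory database holding the hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", DEMO_ROWS)
    return conn


def horizontal_aggregation(conn, query):
    """Run one of the queries above and return its rows as a list of tuples."""
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(horizontal_aggregation(conn, CASE_QUERY))
    print(horizontal_aggregation(conn, SPJ_QUERY))
```

On this demo data both queries produce the same horizontal result. The CASE form reads the table once, while the SPJ form computes one aggregate per output column and joins them. In both forms the list of output columns (here `Q1` and `Q2`) is hard-coded, so a practical version would have to generate the query text from the distinct values of the transposing column.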