Article Information

Title: Prepare Datasets In SQL For Data Mining Analysis In An Optimized Manner
Authors: Srikanth Pasaragonda; G. Charles Babu
Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Year of publication: 2013
Volume: 4
Issue: 10-3
Publisher: Seventh Sense Research Group
Abstract: Collecting information from databases for analysis is generally time-consuming and complex. In data mining projects, analyzing data requires complex queries, aggregations, and table joins, as well as maintaining primary and foreign keys, which makes data analysis complicated and slow. Existing SQL aggregations are of limited use for preparing data sets because an aggregation query returns only a scalar value. As a result, considerable external effort usually goes into building data sets when a horizontal layout is required. In this paper we propose simple, efficient methods that make SQL code return multiple columns in a horizontal aggregation table, so that one aggregation query returns a set of values instead of a single value. This class of functions is called horizontal aggregations. Horizontal aggregations generate data sets in the standard layout required by most data mining projects: a horizontal, denormalized layout (point-dimension, instance-feature, or observation-variable). We propose three basic methods to evaluate horizontal aggregations. The first, CASE, relies on the CASE construct. The second, SPJ, is based on standard relational algebra operations. The third, PIVOT, uses the PIVOT operator offered by some DBMSs. The CASE and PIVOT methods scale linearly, whereas SPJ does not. We evaluated our proposed method experimentally against PIVOT and SPJ: it showed speed similar to the PIVOT operator and faster performance than the SPJ method.
Keywords: Aggregation; Data Preparation; Pivoting; SQL
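
The abstract names the CASE and SPJ methods without showing a query, so the following is a minimal sketch of what each might look like, not the authors' actual SQL. It assumes a hypothetical `sales` table with columns `store`, `quarter`, and `amount`, and runs the queries through Python's built-in `sqlite3` module. The PIVOT method is omitted because SQLite does not provide a PIVOT operator.

```python
import sqlite3

# Hypothetical fact table (not from the paper): one row per sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0), ("B", "Q2", 3.0)],
)

# CASE method (sketch): a single scan of the table; each CASE expression
# routes matching amounts into its own output column.
case_sql = """
SELECT store,
       SUM(CASE WHEN quarter = 'Q1' THEN amount END) AS q1_total,
       SUM(CASE WHEN quarter = 'Q2' THEN amount END) AS q2_total
FROM sales
GROUP BY store
ORDER BY store
"""

# SPJ method (sketch): one aggregated sub-query per quarter, each
# left-joined onto the list of distinct stores.
spj_sql = """
SELECT s.store, q1.total AS q1_total, q2.total AS q2_total
FROM (SELECT DISTINCT store FROM sales) AS s
LEFT JOIN (SELECT store, SUM(amount) AS total FROM sales
           WHERE quarter = 'Q1' GROUP BY store) AS q1 ON q1.store = s.store
LEFT JOIN (SELECT store, SUM(amount) AS total FROM sales
           WHERE quarter = 'Q2' GROUP BY store) AS q2 ON q2.store = s.store
ORDER BY s.store
"""

case_rows = conn.execute(case_sql).fetchall()
spj_rows = conn.execute(spj_sql).fetchall()

# Both queries yield one row per store with one column per quarter.
print(case_rows)
print(spj_rows)
```

In this sketch the SPJ form needs one extra sub-query and join per output column, whereas the CASE form adds only one expression per column to a single scan.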