Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Publication year: 2014
Volume: 10
Issue: 4
DOI: 10.14445/22312803/IJCTT-V10P136
Publisher: Seventh Sense Research Group
Abstract: Clustering is the process of grouping data, where the groups are established by finding similarities between data items based on their characteristics. Such groups are termed clusters. Clustering is an unsupervised learning problem that groups objects based on distance or similarity. While much work has been published on clustering data on storage media, little has been done on automating this process. There should be an automatic and dynamic database clustering technique that reclusters a database with little intervention from a database administrator (DBA) and maintains an acceptable query response time at all times. A good physical clustering of data on disk is essential to reducing the number of disk I/Os in response to a query, whether clustering is implemented by itself or coupled with indexing, parallelism, or buffering. In this paper we describe the issues faced when designing an automatic and dynamic database clustering technique for relational databases. A comparative study of clustering algorithms across two different data items is also performed. The performance of the clustering algorithms is compared based on the time taken to form the estimated clusters, and the experimental results are depicted as a graph.