Article Information

Title: Clustering Of Web Usage Data Using Chameleon Algorithm
Authors: T.Vijaya Kumar ; Dr. H.S.Guruprasad
Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Online ISSN: 2320-9801
Year: 2014
Volume: 2
Issue: 6
Publisher: S&S Publications
Abstract: Clustering is a discovery process in data mining that groups a set of data items so as to maximize the similarity within clusters and minimize the similarity between different clusters. These discovered clusters depict the characteristics of the underlying data distribution. Clustering is useful for characterizing customer groups based on purchasing patterns and for categorizing web documents that have similar functionality. In this work, graph-based clustering is proposed to form clusters based on web usage patterns. First, sessions are constructed using a time-oriented approach. Based on the constructed sessions and page requests, an adjacency matrix is created. Data points are then generated using the adjacency matrix as input. The Chameleon clustering algorithm takes the data points as input and forms clusters. Chameleon uses a two-phase approach to find the clusters. In the first phase, it uses a graph partitioning algorithm to cluster the data items into several relatively small sub-clusters. In the second phase, it uses an algorithm to form genuine clusters by repeatedly combining these sub-clusters. The clusters are then plotted on a plane using MATLAB, where different clusters are distinguished by distinct colours and distinct symbols. In this paper, the server log files of the website www.enggresources.com are considered for the overall study and analysis.
Keywords: Web usage mining; K Nearest Neighbor graph; Chameleon algorithm
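
Illustrative sketch: the abstract's first two preprocessing steps (time-oriented session construction, then an adjacency matrix built from sessions and page requests) might look roughly like the Python below. This is not the authors' implementation. The 30-minute inactivity threshold, the (visitor, timestamp, page) record layout, the function names build_sessions and adjacency_matrix, and the choice to count consecutive page-to-page transitions as matrix entries are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict

# Assumed inactivity threshold in seconds; the abstract does not state one.
SESSION_TIMEOUT = 30 * 60


def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Split each visitor's requests into sessions.

    `requests` is an iterable of (visitor_id, timestamp_seconds, page)
    tuples (a hypothetical, already-parsed form of a server log).
    A new session starts whenever the gap between two consecutive
    requests from the same visitor exceeds `timeout`.
    """
    by_visitor = defaultdict(list)
    for visitor, ts, page in requests:
        by_visitor[visitor].append((ts, page))

    sessions = []
    for visitor in sorted(by_visitor):
        hits = sorted(by_visitor[visitor])  # order by timestamp
        current = [hits[0][1]]
        for (prev_ts, _), (ts, page) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions


def adjacency_matrix(sessions):
    """Count page-to-page transitions across all sessions.

    Returns (pages, matrix) where matrix[i][j] is the number of times
    pages[j] was requested immediately after pages[i] in a session.
    This transition-count definition is an assumption.
    """
    pages = sorted({p for s in sessions for p in s})
    index = {p: i for i, p in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in pages]
    for s in sessions:
        for a, b in zip(s, s[1:]):
            matrix[index[a]][index[b]] += 1
    return pages, matrix


if __name__ == "__main__":
    sample = [
        ("u1", 0, "/a"), ("u1", 60, "/b"), ("u1", 5000, "/a"),
        ("u2", 10, "/b"), ("u2", 20, "/c"),
    ]
    sessions = build_sessions(sample)
    pages, matrix = adjacency_matrix(sessions)
    print(sessions)
    print(pages)
    print(matrix)
```

The rows of such a matrix could then serve as the data points from which a k-nearest-neighbor graph is built for Chameleon's partitioning phase, though the abstract does not say how the authors derive data points from the matrix.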