Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year of publication: 2015
Volume: 78
Issue: 2
Publisher: Journal of Theoretical and Applied
Abstract: Fast, high-quality document clustering techniques play a vital role in text mining applications by grouping large collections of text documents into meaningful clusters, with clustering accuracy enhanced through dimensionality reduction or query expansion. In distributed P2P networks, meaningful clusters and summaries are detected by applying single-document summarization techniques and peer relationships. Traditional cluster-based summarization methods usually suffer from limitations in computation speed, compression, peer selection, and sentence clustering when generating high-quality summaries. Traditional document clustering and summarization methods also assume that node adjacency and neighborhood information is available for building clusters and summaries. Because multilevel overlay P2P networks suffer from node adjacency problems and duplicate information, it has been difficult to generate optimal clusters and summaries within the peers. The proposed approach provides a better solution: it generates optimal document clusters using a probabilistic k-representative clustering algorithm and forms efficient summaries using phrase-rank-based summarization. Experimental results show better performance in terms of execution time, entropy, and cluster quality.