
Article Information

  • Title: An Optimized WebDocument Clustering Using Recurrent Set IGA & Confusion Matrix For Fact Retrieval
  • Authors: C. Josephine Christy; Dr. B. Nagarajan
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Online ISSN: 2320-9801
  • Publication year: 2013
  • Volume: 1
  • Issue: 10
  • Publisher: S&S Publications
  • Abstract: The first phase applies a Genetic Algorithm to the global clustering process to solve the optimization problem in both clustering and feature selection. The second phase follows the concept of a confusion matrix for derivative works, and an improved GA is included for the final classification. The third phase presents an optimization technique to evaluate cluster optimality for proficient document clustering based on the optimized conceptual feature words. The final phase introduces a join approach to clustering web pages that first finds the recurrent sets and then clusters the documents. These recurrent sets are generated using a recurrent pattern expansion technique. Applying the Fuzzy K-Means algorithm to Optimized Web document clustering using Recurrent Set then finds clusters whose documents are closely related and share related features. Experimental results show that our approach is more efficient than the above two join approaches and behaves more robustly. Performance evaluation shows benefits in terms of cluster optimality, true negative rate, and information retrieval on real datasets and the UCI repository bag-of-words dataset.
  • Keywords: Genetic Algorithm; Fuzzy K-Means Algorithm; Recurrent Pattern Expansion; Web document Clustering; Confusion Matrix; Optimization Technique; World Wide Web; Feature Selection