
Article Information

  • Title: Correlation Preserved Indexing Based Approach For Document Clustering
  • Authors: Meena.S.U ; P.Parthasarathi
  • Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
  • Print ISSN: 2278-1323
  • Publication year: 2013
  • Volume: 2
  • Issue: 2
  • Pages: 462-470
  • Publisher: Shri Pannalal Research Institute of Technology
  • Abstract: Document clustering groups similar documents into clusters, where similarity is measured by some function over the documents. The document clustering method 1) achieves high accuracy, 2) calculates document frequency, and 3) calculates term weights from the term frequency vector. Document clustering is closely related to data clustering; more specifically, it is a technique for unsupervised document organization, automatic topic extraction, and fast information retrieval or filtering. Clustering methods can be used to automatically group retrieved documents into a list of meaningful categories. The correlation preserving indexing method is used to find the correlation between documents. The Term Frequency-Inverse Document Frequency (TF-IDF) method is used to measure how frequently words occur in each document; its disadvantage is computational complexity. This paper introduces a Significant Score Calculation method, in which the similarity between words is calculated using the WordNet tool so that related words can be identified. The authors report 98% accuracy when significant score calculation is used for correlation preserving indexing.
  • Keywords: Correlation Preserving Indexing; Document Clustering; Significant Score; Term Frequency-Inverse Document Frequency
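
As a rough illustration of the two building blocks the abstract names, TF-IDF term weighting and correlation between documents, the following minimal sketch may help. It is not the authors' correlation preserving indexing or Significant Score Calculation method: it assumes plain TF-IDF (term frequency times the log of inverse document frequency) and uses Pearson correlation as the similarity measure. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Weight = relative term frequency * log(inverse document frequency).
        vectors.append([
            (counts[t] / len(doc)) * math.log(n_docs / df[t])
            for t in vocab
        ])
    return vocab, vectors


def correlation(x, y):
    """Pearson correlation between two equal-length vectors (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


# Toy corpus (illustrative only): two related documents and two unrelated ones.
docs = [
    "clustering groups similar documents".split(),
    "clustering groups similar documents automatically".split(),
    "stock prices fell sharply".split(),
    "rain is expected tomorrow".split(),
]
vocab, vectors = tf_idf_vectors(docs)
sim_related = correlation(vectors[0], vectors[1])
sim_unrelated = correlation(vectors[0], vectors[2])
print(f"related: {sim_related:.3f}, unrelated: {sim_unrelated:.3f}")
```

In this toy corpus the two documents that share vocabulary correlate positively, while documents with no terms in common correlate negatively; a clustering step would then group documents whose pairwise correlation is high.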