Article Information

  • Title: Document Topic Generation in Text Mining by using Cluster Analysis with EROCK
  • Authors: Mr. Rizwan Ahmad; Mr. Aasia Khanum
  • Journal: International Journal of Computer Science and Security (IJCSS)
  • Electronic ISSN: 1985-1553
  • Year of publication: 2010
  • Volume: 4
  • Issue: 2
  • Pages: 176-182
  • Publisher: Computer Science Journals
  • Abstract: Clustering is a useful technique in the field of textual data mining. Cluster analysis divides objects into meaningful groups based on the similarity between them. Copious material is available from the World Wide Web (WWW) in response to any user-provided query, and it becomes tedious for the user to manually extract the information actually required from this material. This paper proposes a scheme to effectively address this problem with the help of cluster analysis. In particular, the ROCK algorithm is studied with some modifications. ROCK generates better clusters than other clustering algorithms for data with categorical attributes. We present an enhanced version of ROCK, called Enhanced ROCK (EROCK), with an improved similarity measure as well as improved storage efficiency. Evaluation of the proposed algorithm on standard text documents shows improved performance.
  • Keywords: Text Mining; Cluster Analysis; Document Similarity
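
The abstract builds on ROCK, which groups categorical data by counting "links" (shared neighbors) between objects rather than relying on pairwise similarity alone. The following is a minimal Python sketch of that idea as it appears in the original ROCK formulation, not of EROCK itself: the abstract does not describe EROCK's improved similarity measure or storage scheme, so neither is reproduced here. The function names, the toy documents, the threshold `theta = 0.5`, and the choice not to count a document as its own neighbor are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_sets(docs, theta):
    """For each document, collect the indices of the other documents
    whose Jaccard similarity to it is at least theta.
    (Assumption: a document is not counted as its own neighbor.)"""
    nbrs = [set() for _ in docs]
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(docs[i], docs[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def link_counts(nbrs):
    """Number of common neighbors for every pair of documents
    that shares at least one neighbor."""
    links = {}
    for i, j in combinations(range(len(nbrs)), 2):
        common = len(nbrs[i] & nbrs[j])
        if common:
            links[(i, j)] = common
    return links


# Hypothetical toy documents, each reduced to a set of terms.
docs = [
    {"text", "mining", "cluster"},
    {"text", "mining", "document"},
    {"cluster", "document", "text"},
    {"protein", "gene", "cell"},
]

nbrs = neighbor_sets(docs, theta=0.5)
links = link_counts(nbrs)
print(nbrs)
print(links)
```

In this sketch the first three documents are mutual neighbors and each pair among them shares one common neighbor, while the fourth document has no neighbors and therefore no links. A full ROCK implementation would go on to merge clusters greedily using a link-based goodness measure, which is omitted here.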