
Basic article information

  • Title: Document Representation and Clustering with WordNet Based Similarity Rough Set Model
  • Authors: Nguyen Chi Thanh; Koichi Yamada
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Publication year: 2011
  • Volume: 8
  • Issue: 5
  • Publisher: IJCSI Press
  • Abstract: Most studies on document clustering to date use the Vector Space Model (VSM) to represent documents, where each document is denoted by a vector in a word vector space. The standard VSM does not take into account the semantic relatedness between terms, so terms with some semantic similarity are treated the same way as terms with no semantic relatedness. Since this disregard for semantics reduces the quality of clustering results, many studies have proposed approaches that introduce knowledge of semantic relatedness into the VSM. Those approaches give better results than the standard VSM. However, they still have their own issues. We propose a new approach that combines two approaches, one of which uses Rough Sets theory and co-occurrence of terms, and the other uses WordNet knowledge, to solve these issues. Experiments for its evaluation show the advantage of the proposed approach over the others.
  • Keywords: document clustering; document representation; rough