
Article Information

  • Title: Text Document Clustering Using Semantic Neighbors
  • Authors: Malihe Danesh; Hossein Shirgahi
  • Journal: Journal of Software Engineering
  • Print ISSN: 1819-4311
  • Electronic ISSN: 2152-0941
  • Publication year: 2011
  • Volume: 5
  • Issue: 4
  • Pages: 136-144
  • DOI: 10.3923/jse.2011.136.144
  • Publisher: Academic Journals Inc., USA
  • Abstract: Data clustering is a powerful technique for discovering knowledge in text documents. K-means family algorithms are widely used in this field because they are simple and fast when clustering large-scale data. In these algorithms, the cosine similarity criterion measures only the pairwise similarity of documents, so it performs poorly when clusters are not well separated. By contrast, the concepts of Neighbors and Link take global information into account, in addition to pairwise similarity, when calculating the closeness of two documents, and therefore perform better. That model, however, ignores semantic relations between words and clusters together only documents that share the same terms. This study uses the WordNet ontology to build a new document representation in which semantic relations between words are used to reweight term frequencies in the vector space model; the Neighbors and Link concepts are then applied to this representation. Results on real-world text data show that the proposed method (Semantic Neighbors) performs better than previous methods and is more efficient for text document clustering.
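
The abstract names the Neighbors and Link concepts without defining them. As a rough illustration only, the sketch below assumes the commonly used formulation: two documents are neighbors when their cosine similarity reaches a threshold, and the link between two documents is the number of neighbors they share. The threshold `theta`, the toy term-weight vectors, and the helper names `cosine`, `neighbor_matrix`, and `link` are all assumptions made for this example, not details taken from the paper, and the WordNet-based reweighting step is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def neighbor_matrix(docs, theta):
    """Binary matrix: entry [i][j] is 1 when docs i and j are neighbors.

    Assumption: neighbors are pairs with cosine similarity >= theta,
    so every non-zero document is also a neighbor of itself.
    """
    n = len(docs)
    return [
        [1 if cosine(docs[i], docs[j]) >= theta else 0 for j in range(n)]
        for i in range(n)
    ]

def link(nbr, i, j):
    """Number of common neighbors of documents i and j."""
    return sum(a * b for a, b in zip(nbr[i], nbr[j]))

# Toy term-weight vectors (hypothetical data): two groups with disjoint terms.
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]

nbr = neighbor_matrix(docs, theta=0.7)  # theta is an assumed threshold
same_group_link = link(nbr, 0, 1)       # documents sharing terms
cross_group_link = link(nbr, 0, 2)      # documents with no terms in common
```

In this toy setup the link value depends on the neighborhood around each pair rather than on the pair alone, which is the kind of global information the abstract contrasts with plain pairwise cosine similarity.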