Basic Article Information

  • Title: Semantics Based Clustering through Cover-Kmeans with OntoVsm for Information Retrieval
  • Authors: R. Lakshmana Kumar; N. Kannammal; Sujatha Krishnamoorthy
  • Journal: Public Policy And Administration
  • Print ISSN: 2029-2872
  • Year of publication: 2020
  • Volume: 49
  • Issue: 3
  • Pages: 370-380
  • DOI: 10.5755/j01.itc.49.3.25988
  • Publisher: Kaunas University of Technology
  • Abstract: Document clustering plays a significant role in information retrieval: it seeks to divide documents into groups automatically according to the similarity of their content. Documents within a cluster are related to one another (high intra-cluster similarity) and dissimilar to documents in other clusters (low inter-cluster similarity). Document clustering is an unsupervised process that groups documents by identifying underlying structure, so no correct output needs to be specified for any input. Earlier clustering methods do not capture the semantic associations between words, so the context of a document cannot be interpreted correctly. To address this problem, semantic ontology resources such as WordNet have been widely used to enhance text clustering consistency. This paper first proposes an OntoVSM model to reduce the dimensionality of the document representation efficiently. It then proposes the Cover K-means clustering algorithm for semantic document clustering, a hybrid of K-means and the cover coefficient-based clustering methodology (C3M) that is improved semantically using the WordNet ontology. The dimensionality reduction, based on the semantic knowledge of each term, preserves the information without loss. Experimental results show that the proposed approach gives improved results compared with other standard methods.
  • Keywords: semantic clustering; dimension reduction; WordNet; semantic features.
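
The abstract describes Cover K-means as a hybrid of K-means and C3M but does not say how the two are combined. The following is a minimal sketch under one plausible reading, not the authors' implementation: C3M's cover coefficients are assumed to supply both the number of clusters and the seed documents, and plain K-means then refines the assignment from those seeds. The WordNet/OntoVSM step is omitted; the columns of `D` are assumed to be already-reduced concept features. The function names `cover_coefficients`, `c3m_seeds`, and `cover_kmeans` are hypothetical.

```python
import numpy as np

def cover_coefficients(D):
    """C3M cover-coefficient matrix for a document-term matrix D (docs x terms).

    c_ij = alpha_i * sum_k d_ik * beta_k * d_jk, where alpha_i and beta_k are
    the reciprocals of the row and column sums of D.
    Assumes every row and every column of D has a nonzero sum.
    """
    D = np.asarray(D, dtype=float)
    alpha = 1.0 / D.sum(axis=1)
    beta = 1.0 / D.sum(axis=0)
    return (D * alpha[:, None]) @ (D * beta[None, :]).T

def c3m_seeds(D):
    """Pick the number of clusters and the seed documents as in C3M."""
    D = np.asarray(D, dtype=float)
    delta = np.diag(cover_coefficients(D))      # decoupling coefficients
    psi = 1.0 - delta                           # coupling coefficients
    n_clusters = max(1, int(round(delta.sum())))
    seed_power = delta * psi * D.sum(axis=1)
    seeds = np.argsort(-seed_power, kind="stable")[:n_clusters]
    return seeds, n_clusters

def cover_kmeans(D, n_iter=20):
    """Assumed hybrid: Lloyd's K-means started from the C3M seed documents."""
    D = np.asarray(D, dtype=float)
    seeds, k = c3m_seeds(D)
    centroids = D[seeds].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every document to every centroid.
        dists = ((D[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        updated = np.array([
            D[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

if __name__ == "__main__":
    # Toy matrix: documents 0-1 share one vocabulary, documents 2-3 another.
    D = np.array([
        [1, 1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 0, 0, 1, 1],
    ])
    labels, _ = cover_kmeans(D)
    print(labels)
```

Seeding from C3M removes the need to choose the number of clusters by hand, which is the main practical reason one might pair it with K-means. A fuller C3M implementation would also skip seed candidates that are near-duplicates of an already chosen seed, which this sketch does not do.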