
Article Information

  • Title: Clustering and Classification Augmented with Semantic Similarity for Text Mining
  • Authors: S. Revathi; T. Nalini
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Online ISSN: 1694-0814
  • Year of publication: 2013
  • Volume: 10
  • Issue: 2
  • Publisher: IJCSI Press
  • Abstract: Semantic similarity is a way of analyzing how close to perfect synonymy the words in a word-pair are. Such a measure is needed to quantify the degree of relationship between the words in a pair. To compute the semantic similarity of a word-pair, clustering and classification augmented with semantic similarity (CCASS) was developed. CCASS is a novel method that uses the page counts and text snippets returned by a search engine. Several similarity measures are defined using the page counts of word-pairs. Lexical pattern clustering is applied to the text snippets obtained from the search engine. Both are fed to a support vector machine (SVM), which computes the semantic similarity of the word-pair. Based on the value obtained from the SVM, the Simple KMeans clustering algorithm is used to form clusters. A new word-pair can be classified once its semantic similarity measure has been computed. If it does not match any existing cluster, a new cluster may be created.
  • Keywords: Semantic Similarity; Similarity measure; Clustering; Classification; Text mining
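
The abstract does not give formulas, so the following is only a minimal sketch of two of the steps it describes, under stated assumptions. It assumes a Jaccard-style co-occurrence ratio as one possible page-count similarity measure, and a simple distance threshold for deciding whether a new word-pair joins an existing cluster or starts a new one. The function names `web_jaccard` and `assign_or_create`, the `max_distance` parameter, and the sample page counts are all hypothetical; the SVM and lexical pattern clustering stages are not reproduced here.

```python
from typing import List


def web_jaccard(count_p: int, count_q: int, count_pq: int) -> float:
    """Jaccard-style similarity from page counts (an assumed measure).

    count_p  : pages returned for word P alone
    count_q  : pages returned for word Q alone
    count_pq : pages returned for the conjunctive query "P AND Q"
    """
    union = count_p + count_q - count_pq
    if count_pq <= 0 or union <= 0:
        return 0.0
    return count_pq / union


def assign_or_create(score: float, centroids: List[float],
                     max_distance: float) -> int:
    """Assign a similarity score to the nearest cluster centroid.

    If no centroid lies within max_distance (a hypothetical threshold),
    a new cluster is started at the score itself. Returns the cluster index.
    """
    if centroids:
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(centroids[i] - score))
        if abs(centroids[nearest] - score) <= max_distance:
            return nearest
    centroids.append(score)  # no match: create a new cluster
    return len(centroids) - 1


if __name__ == "__main__":
    centroids = [0.1, 0.8]                   # made-up existing cluster centres
    score = web_jaccard(100, 50, 25)         # made-up page counts
    print(score, assign_or_create(score, centroids, max_distance=0.15))
```

In the method as described, the score being clustered would come from the SVM rather than from a single page-count measure; the threshold rule stands in for whatever matching criterion the authors actually use.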