Basic article information

  • Title: A Web Search Engine based approach to Measure the Semantic Similarity between Words using Page Count and Snippets Method (PCSM)
  • Authors: Vaishali Nirgude; Rekha Sharma; R.R. Sedamkar
  • Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
  • Print ISSN: 2278-1323
  • Publication year: 2013
  • Volume: 2
  • Issue: 7
  • Pages: 2252-2257
  • Publisher: Shri Pannalal Research Institute of Technology
  • Abstract: Semantic similarity measures play an important role in Information Retrieval, Natural Language Processing and Web Mining applications such as community mining, relation detection, entity disambiguation and document clustering. This paper proposes the Page Count and Snippets Method (PCSM) to estimate the semantic similarity between any two words (or entities) based on page counts and text snippets retrieved from a web search engine. It defines five page-count-based co-occurrence measures and integrates them with lexical patterns extracted from text snippets. A lexical pattern extraction algorithm is proposed to identify the semantic relations that exist between the words of any query word pair. The similarity scores of both methods are integrated using a Support Vector Machine (SVM) to get optimal results. The proposed method is evaluated on the Miller and Charles (MC) benchmark data set, and performance is measured by the Pearson correlation with the benchmark ratings. The correlation value of the proposed method is 0.8960, which is higher than that of existing methods. PCSM also evaluates semantic relations between named entities to improve Precision, Recall and F-score.
  • Keywords: Community Mining; Information Retrieval; Lexical Patterns; Page Counts; Text Snippets; Correlation
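
The abstract mentions page-count-based co-occurrence measures and evaluation by Pearson correlation without giving formulas. As a rough illustration only, the sketch below computes a Jaccard-style score from three page counts and a Pearson correlation between two score lists. The function names (`web_jaccard`, `pearson`), the `threshold` parameter and the choice of the Jaccard coefficient are assumptions made for this example; the record does not say which five measures the paper defines, and the page counts would have to come from a search engine of the reader's choosing.

```python
from math import sqrt


def web_jaccard(count_p, count_q, count_pq, threshold=0):
    """Jaccard-style co-occurrence score from page counts (illustrative).

    count_p  : hits for word P alone
    count_q  : hits for word Q alone
    count_pq : hits for the conjunctive query "P AND Q"
    threshold: hypothetical noise cut-off; scores at or below it are zeroed
    """
    if count_pq <= threshold:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

In an evaluation of the kind the abstract describes, one list passed to `pearson` would hold the method's scores for the benchmark word pairs and the other the human ratings for the same pairs.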