Article Information

  • Title: A Web Search Engine Application to Compare Semantic Equivalence between Two Words
  • Authors: Mallela Haribabu; H. Devaraj; Y. Ramesh Kumar
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2014
  • Volume: 5
  • Issue: 5
  • Pages: 6572-6577
  • Publisher: TechScience Publications
  • Abstract: Measuring the semantic similarity between two words is an important component in various tasks on the web, such as relation extraction, community mining, document clustering, and automatic metadata extraction. Despite the usefulness of semantic similarity measures in these applications, accurately measuring semantic similarity between two words (or entities) remains a challenging task. We propose an empirical method that estimates the semantic similarity of two words using page counts and text snippets retrieved from a web search engine. Specifically, we define various word co-occurrence measures using page counts and integrate them with lexical patterns extracted from text snippets. To identify the numerous semantic relations that exist between two given words, we propose a novel pattern extraction algorithm and a pattern clustering algorithm. The optimal combination of page count-based co-occurrence measures and lexical pattern clusters is learned using support vector machines. The proposed method outperforms various baselines and previously proposed web-based semantic similarity measures on three benchmark data sets, showing a high correlation with human ratings. Moreover, the proposed method significantly improves accuracy in a community mining task.
  • Keywords: Web Mining; Information Extraction; Web Text Analysis
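
The abstract mentions word co-occurrence measures built from search-engine page counts but does not name them in this record. As a rough illustration only, the sketch below computes two measures that are commonly used for this purpose, a Jaccard coefficient and pointwise mutual information. The function names, the choice of these two measures, the zero-count handling, and the `index_size` parameter (an estimate of the number of pages the engine indexes) are all assumptions, not details taken from the paper.

```python
import math


def web_jaccard(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts (hypothetical helper).

    count_p, count_q: page counts for each word queried alone.
    count_pq: page count for the conjunctive query "P AND Q".
    """
    denom = count_p + count_q - count_pq
    if count_pq == 0 or denom <= 0:
        return 0.0  # assumption: treat no co-occurrence as zero similarity
    return count_pq / denom


def web_pmi(count_p, count_q, count_pq, index_size):
    """Pointwise mutual information (base 2) from page counts (hypothetical helper).

    index_size: assumed number of pages indexed by the search engine.
    """
    if min(count_p, count_q, count_pq) == 0:
        return 0.0  # assumption: undefined PMI is reported as zero
    p_joint = count_pq / index_size
    p_p = count_p / index_size
    p_q = count_q / index_size
    return math.log2(p_joint / (p_p * p_q))


if __name__ == "__main__":
    # Made-up page counts, used only to show the call shape.
    print(web_jaccard(100, 50, 25))
    print(web_pmi(100, 50, 25, 10_000))
```

Scores like these could serve as numeric features alongside pattern-based features when training a classifier such as a support vector machine, which is the kind of combination the abstract describes.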