Article information

  • Title: Keyword extraction from single documents using mean word intermediate distance
  • Authors: Sifatullah Siddiqi; Aditi Sharan
  • Journal: International Journal of Advanced Computer Research
  • Print ISSN: 2249-7277
  • Electronic ISSN: 2277-7970
  • Publication year: 2016
  • Volume: 6
  • Issue: 25
  • Pages: 138-145
  • Publisher: Association of Computer Communication Education for National Triumph (ACCENT)
  • Abstract: Keyword extraction is an important task in text mining. This paper proposes a novel, unsupervised, domain-independent and language-independent approach to automatic keyword extraction from single documents. The approach uses the word intermediate distance vector and its mean value to extract keywords. We compare it against the standard deviation of intermediate distances approach, taken as the baseline, and find heavy overlap between the results of the two approaches. Our approach has the advantage of being faster, especially for long documents, because it removes the need to compute the standard deviation of the word intermediate distance vector. Two well-known works, “Origin of Species” and “A Brief History of Time”, are used to demonstrate the experimental results. Experiments show that the proposed approach works almost as well as the standard deviation approach, and the overlap between the top 30 keywords extracted by the two approaches is more than 50%.
  • Keywords: Keyword extraction; Word means intermediate distance; Clustering; Standard deviation
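
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how words are tokenized, how rare words are filtered, or how the mean is turned into a ranking, so the tokenizer, the `min_occurrences` cutoff, and all function names below are assumptions made for illustration. The sketch only computes, for each word, the vector of gaps between its successive positions, the mean and standard deviation of that vector, and the percentage overlap between two top-k lists.

```python
import re
from statistics import mean, pstdev


def intermediate_distances(tokens):
    """Map each word to the gaps between its successive positions in `tokens`."""
    last_seen = {}
    gaps = {}
    for pos, tok in enumerate(tokens):
        if tok in last_seen:
            gaps.setdefault(tok, []).append(pos - last_seen[tok])
        last_seen[tok] = pos
    return gaps


def score_words(text, min_occurrences=3):
    """Return {word: (mean_gap, std_gap)} for sufficiently frequent words.

    The lowercase-letters tokenizer and the `min_occurrences` cutoff are
    assumptions of this sketch, not details taken from the paper.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {}
    for word, d in intermediate_distances(tokens).items():
        # A word with n occurrences has n - 1 gaps.
        if len(d) + 1 >= min_occurrences:
            scores[word] = (mean(d), pstdev(d))
    return scores


def top_k(scores, k, stat, descending):
    """Rank words by one statistic (stat=0 for mean, stat=1 for std).

    The ranking direction is left to the caller because the abstract
    does not specify it.
    """
    ranked = sorted(scores, key=lambda w: scores[w][stat], reverse=descending)
    return ranked[:k]


def overlap_percent(first, second):
    """Percentage of words in `first` that also appear in `second`."""
    if not first:
        return 0.0
    return 100.0 * len(set(first) & set(second)) / len(first)
```

Under these assumptions, a comparison like the one reported in the abstract would call `top_k` once per statistic with `k=30` and pass the two lists to `overlap_percent`. The speed argument in the abstract corresponds to skipping the `pstdev` call, which needs a second pass over each distance vector.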