Basic Article Information

  • Title: Domain Keyword Extraction Technique: A New Weighting Method Based on Frequency Analysis
  • Author: Rakhi Chakraborty
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Year of publication: 2013
  • Volume: 3
  • Issue: 2
  • Pages: 109-118
  • DOI: 10.5121/csit.2013.3211
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Online text documents are growing rapidly in volume with the growth of the World Wide Web. To manage such a huge amount of text, several text mining applications have come into existence. These applications, such as search engines, text categorization, summarization, and topic detection, are based on feature extraction. Extracting keywords or features manually is extremely time-consuming and difficult, so an automated process for extracting them is needed. This paper proposes a new domain keyword extraction technique that includes a new weighting method based on the conventional TF•IDF. Term frequency-inverse document frequency (TF•IDF) is widely used to express a document's feature weights, but it cannot reflect the distribution of terms in the document, and therefore cannot reflect their degree of significance or the differences between categories. The proposed weighting method adds a new weight to the original TF•IDF to express the differences between domains. The extracted features represent the content of the text better and have better discriminative ability.
  • Keywords: Text mining; Feature extraction; Weighting method; Term Frequency-Inverse Document Frequency (TF•IDF); Domain keyword extraction.
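
For orientation, the conventional TF•IDF baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of one common textbook variant (relative term frequency multiplied by the logarithm of the inverse document frequency), not the paper's method: the abstract does not specify the additional domain weight, so it is left out. The function name `tf_idf` and the sample documents are assumptions made for this sketch.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Conventional TF-IDF only; the paper's extra domain weight is not
    given in the abstract and is deliberately not modeled here.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["text", "mining", "keyword", "extraction"],
    ["text", "categorization", "keyword"],
    ["text", "summarization"],
]
weights = tf_idf(docs)
```

In this toy corpus a term that occurs in every document, such as "text", receives weight zero, while a term confined to a single document, such as "mining", receives the highest weight in that document. The plain scheme uses no information about which domain a document belongs to, which is the gap the abstract says the added weight is meant to address.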