Article Information

  • Title: Short Text Classification Based on Improved ITC
  • Authors: Liangliang Li; Shouning Qu
  • Journal: Journal of Computer and Communications
  • Print ISSN: 2327-5219
  • Electronic ISSN: 2327-5227
  • Year of publication: 2013
  • Volume: 01
  • Issue: 04
  • Pages: 22-27
  • DOI: 10.4236/jcc.2013.14004
  • Language: English
  • Publisher: Scientific Research Publishing
  • Abstract: Long text classification has achieved strong results, but short text classification still needs improvement. In this paper, we first explain why we select the ITC feature selection algorithm rather than the conventional TFIDF, and describe the advantages of ITC over TFIDF. We then summarize the flaws of the conventional ITC algorithm and present an improved ITC feature selection algorithm that is based on the characteristics of short text classification and combines the concepts of Documents Distribution Entropy and Position Distribution Weight. The improved ITC algorithm conforms to the actual conditions of short text classification. The experimental results show that performance with the new algorithm was much better than with the traditional TFIDF and ITC.
  • Keywords: ITC; Text Classification; Short Text