
Article Information

  • Title: AN IMPROVED TEXT CLASSIFICATION METHOD BASED ON GINI INDEX
  • Authors: XIAOQIANG JIA; JIANGYAN SUN
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2012
  • Volume: 43
  • Issue: 2
  • Pages: 267-273
  • Publisher: Journal of Theoretical and Applied
  • Abstract: In text classification, the purity form of the Gini index can be used for feature weighting. The greater the purity value, the more information the attribute carries and the stronger its ability to distinguish categories. However, applying the Gini purity formula directly to feature weighting gives unsatisfactory classification results. One of the main reasons is that rare words appearing in only one category and in no others cannot be strictly differentiated. To solve this problem, an improved feature weighting method based on the Gini index is proposed. The method introduces the approximation quality of a feature term within the categories and adjusts the term weight according to its category-distinguishing ability. Compared with weighting by the purity formula alone, it resolves the above problem and can effectively improve text classification performance. Experiments verify the feasibility of the proposed method.
  • Keywords: Gini Index; Approximation Quality; Term Weight; Text Classification
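
The rare-word problem described in the abstract can be illustrated with a minimal sketch. The abstract does not state the purity formula, so the code assumes the commonly used form, Gini(t) = sum over categories c of P(c|t)^2, estimated from document counts. The function name `gini_purity` and the toy corpus are hypothetical, and the paper's improved weighting (based on approximation quality) is not reproduced here.

```python
from collections import Counter

def gini_purity(term, docs):
    """Assumed purity form: sum over categories of P(category | term)^2.

    docs is a list of (category, set_of_terms) pairs.
    Returns 0.0 if the term occurs in no document.
    """
    # Count, per category, the documents that contain the term.
    counts = Counter(cat for cat, terms in docs if term in terms)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in counts.values())

# Hypothetical toy corpus: three "sports" and two "finance" documents.
docs = [
    ("sports", {"goal", "team"}),
    ("sports", {"goal", "match"}),
    ("sports", {"goal", "offside"}),
    ("finance", {"market", "team"}),
    ("finance", {"market", "stock"}),
]

frequent = gini_purity("goal", docs)     # in every sports document
rare = gini_purity("offside", docs)      # in a single sports document
shared = gini_purity("team", docs)       # split across both categories
```

Under this assumed formula, "goal" and "offside" both receive the maximum purity because each occurs in only one category, even though "offside" appears in a single document. Purity alone therefore cannot separate a rare word from a frequent, genuinely representative one, which is the weakness the abstract says the improved method addresses.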