
Basic Article Information

  • Title: Modified TF-Assoc Term Weighting Method for Text Classification on News Dataset from Twitter
  • Authors: Imroatul Khuluqi Izzah; Abba Suga Girsang
  • Journal: IAENG International Journal of Computer Science
  • Print ISSN: 1819-656X
  • Online ISSN: 1819-9224
  • Publication year: 2021
  • Volume: 48
  • Issue: 1
  • Language: English
  • Publisher: IAENG - International Association of Engineers
  • Abstract: Text classification is the process of automatically assigning text documents to categories based on their content. In text classification, term weighting is the stage that plays an important role by assigning importance values to the terms of each document. In the authors' previous study, a new supervised term weighting scheme (TF-Assoc) was introduced that uses the concept of association to optimize the term weighting distribution in multiclass classification. To improve text categorization performance, this paper proposes a term weighting scheme with a modified association concept, mTF-IDF-Assoc. The proposed scheme takes Document Length (DL) into account: DL is used to normalize the term frequency by dividing it by the length of the document's vector, and IDF and Assoc are then incorporated in calculating the weight of each word. The results showed that mTF-IDF-Assoc, implemented with an SVM classifier and 10-fold cross-validation, outperformed the TF-IDF, TF-ICF, and TF-Assoc weighting schemes, with an average accuracy of 82.322%.
  • Keywords: text classification; document length; supervised term weighting; association; confidence
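
The abstract describes the weighting only in words, so the following is a minimal sketch of one way such a scheme could be computed, not the paper's actual formula. It assumes that "length of the document's vector" means the Euclidean norm of the raw term counts, that IDF is the plain log(N / df), and that the association factor is the highest confidence P(class | term) across classes. All function names are hypothetical.

```python
import math
from collections import Counter


def length_normalized_tf(tokens):
    """Raw term counts divided by the Euclidean length of the count vector (assumed DL)."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def idf(docs):
    """Plain inverse document frequency, log(N / df); an assumed variant."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / df_t) for t, df_t in df.items()}


def confidence(docs, labels):
    """Assumed association factor: max over classes of P(class | term)."""
    df = Counter()
    df_class = Counter()
    for tokens, label in zip(docs, labels):
        for t in set(tokens):
            df[t] += 1
            df_class[(t, label)] += 1
    best = {}
    for (t, _label), c in df_class.items():
        best[t] = max(best.get(t, 0.0), c / df[t])
    return best


def weight_documents(docs, labels):
    """Combine the three factors as normalized TF * IDF * association."""
    idf_w = idf(docs)
    assoc = confidence(docs, labels)
    return [
        {t: tf * idf_w[t] * assoc[t] for t, tf in length_normalized_tf(d).items()}
        for d in docs
    ]


# Toy corpus (made up for illustration only).
docs = [["goal", "match", "goal"], ["match", "vote"], ["vote", "election"]]
labels = ["sport", "sport", "politics"]
weights = weight_documents(docs, labels)
```

Under these assumptions, a term that is frequent in a short document, rare across the corpus, and concentrated in a single class receives the largest weight. The resulting weight dictionaries could then be fed to any classifier, such as the SVM mentioned in the abstract.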