Basic Article Information

  • Title: A Novel Document Weighted Approach for Text Classification
  • Author: S. Sai Satyanarayana Reddy
  • Journal: Journal of Computers
  • Print ISSN: 1796-203X
  • Year of publication: 2020
  • Volume: 15
  • Issue: 3
  • Pages: 105-113
  • DOI: 10.17706/jcp.15.3.105-113
  • Publisher: Academy Publisher
  • Abstract: Textual data on the internet is increasing exponentially through blogs, Twitter, and various social media sites. Users do not specify the type of text they upload. Consequently, many researchers are seeking automated tools for classifying such data, that is, for assigning class labels to unknown documents. Text classification is the research area that addresses this task, and researchers have proposed several solutions. A text classification approach generally consists of collecting training data, preprocessing the text, extracting features, reducing features, representing documents, and finally applying classification algorithms to build a model that predicts the class label of a new document. Among these phases, document representation is an important step for improving classification accuracy. In this work, a new document representation approach is proposed. Experiments were conducted on the 20-Newsgroup and Reuters-21578 datasets with different types of classification algorithms. The proposed approach attained the best accuracy results in these experiments, and the results were observed to be more promising than those of most popular text classification approaches.
  • Other keywords: Accuracy, bag of words model, document representation, document weight measure, term weight measure, text classification.
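
The phases named in the abstract (preprocessing, feature extraction, document representation, classification) can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's method: the record does not describe the proposed document weight measure, so standard TF-IDF weighting stands in for the document representation, a nearest-centroid rule stands in for the classification algorithms, and feature reduction is limited to stopword removal. The stopword list, the toy training texts, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed, deliberately tiny stopword list (illustrative only).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "to", "and", "for", "on"}

def preprocess(text):
    """Lowercase, tokenize on runs of letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_idf(token_lists):
    """Compute a smoothed inverse document frequency for each training term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}

def represent(tokens, idf):
    """Map tokens to an L2-normalized TF-IDF vector (dict: term -> weight)."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {term: tf * idf[term] for term, tf in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}

def train_centroids(vectors, labels):
    """Average the document vectors of each class into one centroid."""
    sums, sizes = defaultdict(Counter), Counter(labels)
    for vec, label in zip(vectors, labels):
        for term, w in vec.items():
            sums[label][term] += w
    return {label: {t: w / sizes[label] for t, w in terms.items()}
            for label, terms in sums.items()}

def predict(vec, centroids):
    """Return the label whose centroid has the largest dot product with vec."""
    def score(centroid):
        return sum(w * centroid.get(term, 0.0) for term, w in vec.items())
    return max(centroids, key=lambda label: score(centroids[label]))

# Toy training data (invented for illustration, not from the paper's datasets).
train_texts = [
    "The team won the football match",
    "The striker scored a goal in the match",
    "The new processor improves laptop performance",
    "Software update fixes the laptop bug",
]
train_labels = ["sports", "sports", "tech", "tech"]

train_tokens = [preprocess(t) for t in train_texts]
idf = fit_idf(train_tokens)
centroids = train_centroids([represent(t, idf) for t in train_tokens], train_labels)

def classify(text):
    """Run a new document through the same pipeline and predict its label."""
    return predict(represent(preprocess(text), idf), centroids)
```

Calling `classify` on a new string runs it through the same preprocessing and representation steps as the training documents. Swapping `represent` for a different weighting scheme is the point at which an alternative document representation, such as the one the paper proposes, would plug in.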