Basic Article Information

Title: A Novel Document Weighted Approach for Text Classification
Author: S. Sai Satyanarayana Reddy
Journal: Journal of Computers
Print ISSN: 1796-203X
Publication year: 2020
Volume: 15
Issue: 3
Pages: 105-113
DOI: 10.17706/jcp.15.3.105-113
Publisher: Academy Publisher
Abstract: Textual data on the internet is increasing exponentially through blogs, Twitter, and various social media sites. Users do not specify the type of text they upload. Consequently, many researchers are looking for automated tools for classifying this data or assigning class labels to unknown documents. Text classification is one such area, and researchers have proposed several solutions for it. Text classification approaches generally consist of collecting training data, preprocessing the text, feature extraction, feature reduction, document representation, and finally applying classification algorithms to build a model that predicts the class label of a new textual document. Among these phases, document representation is an important step for increasing the accuracy of text classification. In this work, a new document representation approach is proposed. Experiments were conducted on the 20-Newsgroup and Reuters-21578 datasets with different types of classification algorithms. Our approach attained the best accuracy results for text classification, and the results were observed to be more promising than those of most popular approaches for text classification.
Other keywords: accuracy, bag-of-words model, document representation, document weight measure, term weight measure, text classification.