Basic article information

  • Title: Text Classification: Naive Bayes Classifier with Sentiment Lexicon
  • Authors: Cong-Cuong Le; P.W.C. Prasad; Abeer Alsadoon
  • Journal: IAENG International Journal of Computer Science
  • Print ISSN: 1819-656X
  • Electronic ISSN: 1819-9224
  • Publication year: 2019
  • Volume: 46
  • Issue: 2
  • Pages: 141-148
  • Publisher: IAENG - International Association of Engineers
  • Abstract: This paper proposes a method of linguistic classification based on the analysis of positive, negative and neutral sentiments expressed within text written in Vietnamese and English. It includes a process for document preparation and is based on the development of training data using Naïve Bayes classification in conjunction with a sentiment lexicon dictionary, thus reducing both the size of the training corpus and the limitations of the bag-of-words approach. Naïve Bayes, a machine learning and information mining algorithm, was chosen for its proven viability and its central role in data retrieval in general. The effectiveness of Naïve Bayes is further enhanced by using the dictionary as the input source, which reduces the size of the training corpus and consequently the training time. In addition, the document preparation process improves accuracy to 98.2%, compared with 96.1% for traditional Naïve Bayes and 87.3% for the lexical method.
  • Keywords: Vietnamese; Sentiment lexicon; Naïve Bayes; Machine learning; Classification document; Probability; Preparation; Tokenization; Stop-word
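
The general idea in the abstract (document preparation, followed by Naïve Bayes restricted to sentiment-lexicon terms) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny lexicon, the stop-word list, the training sentences, the regex tokenizer, and the function names `prepare`, `train`, and `classify` are all hypothetical, and add-one smoothing is an assumption. Vietnamese text in particular would likely need a dedicated word segmenter rather than the simple regex used here.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy resources; the paper's actual lexicon and stop-word list are not given here.
LEXICON = {"good", "great", "love", "bad", "awful", "hate", "okay"}
STOP_WORDS = {"the", "a", "is", "this", "i", "it", "was"}


def prepare(text):
    """Document preparation: lowercase, tokenize, drop stop-words, keep lexicon terms only."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and t in LEXICON]


def train(samples):
    """Collect per-class document counts and lexicon-term counts from (text, label) pairs."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    for text, label in samples:
        class_docs[label] += 1
        term_counts[label].update(prepare(text))
    return class_docs, term_counts


def classify(text, model):
    """Return the label with the highest log-posterior (multinomial NB, add-one smoothing)."""
    class_docs, term_counts = model
    total_docs = sum(class_docs.values())
    tokens = prepare(text)
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        # The vocabulary is the lexicon, so smoothing adds len(LEXICON) to the denominator.
        denom = sum(term_counts[label].values()) + len(LEXICON)
        for t in tokens:
            score += math.log((term_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
TRAIN = [
    ("I love this, it is great", "positive"),
    ("This was a good film", "positive"),
    ("I hate it, awful", "negative"),
    ("This is bad", "negative"),
    ("It was okay", "neutral"),
]
MODEL = train(TRAIN)
```

Because the feature space is limited to lexicon entries, the per-class count tables stay small regardless of how large the raw vocabulary is, which is one plausible reading of how a lexicon could reduce training cost compared with a full bag-of-words model.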