Journal: IAENG International Journal of Computer Science
Print ISSN: 1819-656X
Electronic ISSN: 1819-9224
Year: 2019
Volume: 46
Issue: 2
Pages: 141-148
Publisher: IAENG - International Association of Engineers
Abstract: This paper proposes a method for classifying text written in Vietnamese and English according to the positive, negative, and neutral sentiments it expresses. The method includes a document preparation process and builds its training data using Naïve Bayes classification in conjunction with a sentiment lexicon dictionary, which reduces the size of the training corpus and mitigates the limitations of the bag-of-words approach. Naïve Bayes, a machine learning and information mining algorithm, was chosen for its proven viability and its central role in data retrieval in general. Its effectiveness is further enhanced by using the dictionary as the input source, which reduces the size of the training corpus and consequently the training time. In addition, the document preparation process raises accuracy to 98.2%, compared with 96.1% for traditional Naïve Bayes and 87.3% for the lexical method.
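
The general idea in the abstract, Naïve Bayes trained only on words found in a sentiment lexicon, might be sketched as follows. This is a minimal illustration, not the authors' implementation: the `LEXICON` contents, the `preprocess` function, the `LexiconNaiveBayes` class, and the toy training sentences are all hypothetical, and the paper's actual document preparation steps and dictionary are not described in this record. The sketch uses multinomial Naïve Bayes with add-one (Laplace) smoothing over the lexicon vocabulary.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon: only these words are kept as features.
LEXICON = {"good", "great", "love", "bad", "terrible", "hate", "okay", "average"}


def preprocess(text):
    """Assumed preparation step: lowercase, split into letter runs,
    and discard every token that is not in the lexicon."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t in LEXICON]


class LexiconNaiveBayes:
    """Multinomial Naive Bayes restricted to lexicon words."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = preprocess(text)
        vocab_size = len(LEXICON)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus add-one smoothed log likelihood of each token.
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[label].values())
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_docs = [
    "I love this great phone",
    "good and great service",
    "I hate this terrible film",
    "bad, bad food",
    "it was okay",
    "an average day",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = LexiconNaiveBayes().fit(train_docs, train_labels)
print(model.predict("great, I love it"))
print(model.predict("terrible and bad"))
```

Restricting the vocabulary to the lexicon keeps the per-class count tables small, which is one plausible reading of how a dictionary could reduce corpus size and training time; the accuracy figures reported in the abstract come from the paper's own data and cannot be reproduced with this toy example.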