Basic article information

Title: Sentiment classification of Roman-Urdu opinions using Naive Bayesian, Decision Tree and KNN classification techniques
Authors: Muhammad Bilal ; Huma Israr ; Muhammad Shahid et al.
Journal: Journal of King Saud University - Computer and Information Sciences
Print ISSN: 1319-1578
Publication year: 2016
Volume: 28
Issue: 3
Pages: 330-344
DOI: 10.1016/j.jksuci.2015.11.003
Publisher: Elsevier
Abstract: Sentiment mining is a field of text mining that determines the attitude of people toward a particular product, topic, or politician in newsgroup posts, review sites, comments on Facebook posts, Twitter, etc. There are many issues involved in opinion mining. One important issue is that opinions could be in different languages (English, Urdu, Arabic, etc.). Tackling each language according to its orientation is a challenging task. Most of the research work in sentiment mining has been done in the English language. Currently, limited research is being carried out on sentiment classification of other languages like Arabic, Italian, Urdu and Hindi. In this paper, three classification models are used for text classification using the Waikato Environment for Knowledge Analysis (WEKA). Opinions written in Roman-Urdu and English are extracted from a blog. These extracted opinions are documented in text files to prepare a training dataset containing 150 positive and 150 negative opinions as labeled examples. A testing data set is supplied to the three different models and the results in each case are analyzed. The results show that Naive Bayesian outperformed Decision Tree and KNN in terms of accuracy, precision, recall and F-measure.
Keywords: Naive Bayes ; Decision Tree ; k-Nearest Neighbor ; Roman Urdu ; Opinion mining ; Bag of words