Article Information

  • Title: Text Categorization of Movie Reviews for Sentiment Analysis
  • Authors: Humera Shaziya; G. Kavitha; Raniah Zaheer
  • Journal: International Journal of Innovative Research in Science, Engineering and Technology
  • Print ISSN: 2347-6710
  • Online ISSN: 2319-8753
  • Year: 2015
  • Volume: 4
  • Issue: 11
  • Page: 11255
  • DOI: 10.15680/IJIRSET.2015.0411065
  • Publisher: S&S Publications
  • Abstract: Text classification is an important task in many text mining applications. Text data generated from reviews has been growing tremendously. People participate widely on the internet to give their opinions about various subjects and topics. A branch of text mining that deals with people's views about a subject is opinion mining, in which data in the form of reviews is mined in order to analyze its sentiment. This study of people's opinions is sentiment analysis, a popular research area in text mining. In this paper, movie reviews are classified for sentiment analysis in Weka. The dataset, obtained from the Cornell University dataset repository, contains 2000 movie reviews. The dataset is preprocessed and various filters are applied to reduce the feature set. Feature selection methods are widely used to gather the most valuable words for each category in text mining processes. They help to find the most distinctive words for each category by calculating some variables on the data. The most commonly employed methods are Chi-Square, Information Gain, and Gain Ratio. In this study, the information gain method was employed because of its simplicity, lower computational cost, and efficiency. The reduced feature set is shown to improve the performance of the classifier. Two popular classifiers, Naïve Bayes and SVM, were evaluated on the movie review dataset. The results show that Naïve Bayes performs better than SVM for classification of movie reviews.
  • Keywords: Text Classification; Opinion Mining; Sentiment Analysis; Feature Selection; Text Mining; Weka
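
The abstract names information gain as the feature selection method but does not spell out the computation. The following is a minimal Python sketch of information gain for a binary "term present / term absent" feature, not the authors' Weka workflow. The function names (`entropy`, `information_gain`, `top_terms`) and the four toy reviews are hypothetical and serve only to illustrate the idea.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs in a document.

    `docs` is a list of token sets; `labels` is the parallel list of classes.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus: each review is reduced to a set of tokens.
toy_docs = [
    {"great", "plot"},
    {"great", "acting"},
    {"boring", "plot"},
    {"boring", "acting"},
]
toy_labels = ["pos", "pos", "neg", "neg"]

selected = top_terms(toy_docs, toy_labels, k=2)
```

In this toy corpus, "great" and "boring" separate the two classes perfectly, while "plot" and "acting" appear equally often in both and carry no information, so only the first two would survive a reduction to two features.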