Journal: International Journal of Innovative Research in Science, Engineering and Technology
Print ISSN: 2347-6710
Online ISSN: 2319-8753
Publication year: 2015
Volume: 4
Issue: 11
Pages: 11255
DOI: 10.15680/IJIRSET.2015.0411065
Publisher: S&S Publications
Abstract: Text classification is an important task in many text mining applications. The volume of text data generated from reviews has grown tremendously, as people increasingly use the internet to give their opinions on various subjects and topics. Opinion mining is the branch of text mining that deals with people’s views about a subject: data in the form of reviews is mined in order to analyze its sentiment. This study of people’s opinions, known as sentiment analysis, is a popular research area in text mining. In this paper, movie reviews are classified for sentiment analysis in Weka. The dataset contains 2000 movie reviews and was obtained from the Cornell University dataset repository. The dataset is preprocessed, and various filters are applied to reduce the feature set. Feature selection methods are widely used in text mining to gather the most valuable words for each category. They help to find the most distinctive words for each category by computing statistics over the data. The most commonly employed methods are Chi-Square, Information Gain, and Gain Ratio. In this study, the Information Gain method was employed because of its simplicity, low computational cost, and efficiency. Reducing the feature set is shown to improve the performance of the classifier. Two popular classifiers, Naïve Bayes and SVM, were tested on the movie review dataset. The results show that Naïve Bayes performs better than SVM for the classification of movie reviews.
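To make the feature selection step concrete, the following is a minimal sketch of Information Gain scoring over word-presence features. It is not the paper's Weka pipeline: the function names (`entropy`, `information_gain`, `top_terms`), the choice to treat each review as a set of words, and the four-document toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from splitting docs on presence of `term`.

    `docs` is a list of word sets, one per review (an illustrative
    representation, not the one used in the paper).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        len(with_term) / total * entropy(with_term)
        + len(without_term) / total * entropy(without_term)
    )
    return entropy(labels) - conditional


def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus: two positive and two negative reviews.
docs = [
    {"great", "film"},
    {"great", "acting"},
    {"boring", "film"},
    {"boring", "plot"},
]
labels = ["pos", "pos", "neg", "neg"]

for term in top_terms(docs, labels, 3):
    print(term, round(information_gain(docs, labels, term), 3))
```

In this toy corpus, "great" and "boring" separate the two classes perfectly and receive the maximum score of 1 bit, while "film" appears equally in both classes and scores 0. Keeping only the top-ranked terms is the kind of feature set reduction the abstract describes before the classifiers are trained.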