Journal: International Journal of Software Engineering and Its Applications
Print ISSN: 1738-9984
Publication year: 2015
Volume: 9
Issue: 2
Pages: 251-260
DOI: 10.14257/ijseia.2015.9.2.22
Publisher: SERSC
Abstract: We present a method for automatically constructing a domain-specific Korean sentiment dictionary that can be used to classify the sentiment of online movie reviews. More than 1.18 million online movie reviews with movie ratings from 1 to 4 or from 7 to 10 were collected across fourteen movie genres to calculate the joint probability of a given word and the sentiment of movie reviews for each genre. Specifically, two joint probabilities were calculated for each movie genre: (1) that of a given word and the positive movie reviews (ratings 7 to 10), and (2) that of a given word and the negative movie reviews (ratings 1 to 4). The difference between the two joint probabilities (i.e., (1) – (2)) was obtained for each word in each genre, and each word's joint probability differences were averaged across the fourteen genres. Finally, the averaged joint probability differences were normalized to the range from -1 to 1. These normalized values serve as the sentiment values of the words in the final 135,082-word movie-domain Korean sentiment dictionary. The positive/negative binary sentiment classification performance of the constructed dictionary was evaluated on test data, and a balanced accuracy of 80.7% was achieved, confirming the effectiveness of the proposed sentiment dictionary construction method.
Keywords: Korean Sentiment Dictionary; Online Movie Reviews; Sentiment Classification
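
The construction steps summarized in the abstract can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions the abstract does not confirm: the joint probability is estimated as the fraction of a genre's retained reviews that carry the given sentiment and contain the word, a word missing from a genre contributes a difference of 0 to the average, and normalization divides by the largest absolute averaged difference. All function names and the toy data are hypothetical, and Korean tokenization is assumed to have been done beforehand.

```python
from collections import defaultdict


def label_from_rating(rating):
    """Ratings 7-10 count as positive, 1-4 as negative; 5-6 are excluded."""
    if 7 <= rating <= 10:
        return "pos"
    if 1 <= rating <= 4:
        return "neg"
    return None


def genre_differences(reviews):
    """reviews: list of (tokens, rating) pairs for a single genre.

    Returns {word: P(word, pos) - P(word, neg)}, where each joint
    probability is estimated (an assumption) as the share of retained
    reviews that have that label and contain the word.
    """
    counts = {"pos": defaultdict(int), "neg": defaultdict(int)}
    total = 0
    for tokens, rating in reviews:
        label = label_from_rating(rating)
        if label is None:
            continue
        total += 1
        for word in set(tokens):  # count each word once per review
            counts[label][word] += 1
    if total == 0:
        return {}
    vocab = set(counts["pos"]) | set(counts["neg"])
    return {
        w: (counts["pos"].get(w, 0) - counts["neg"].get(w, 0)) / total
        for w in vocab
    }


def build_dictionary(reviews_by_genre):
    """Average per-genre differences, then scale into [-1, 1]."""
    diffs = [genre_differences(r) for r in reviews_by_genre.values()]
    vocab = set().union(*diffs)
    n = len(diffs)
    # Assumption: a word absent from a genre has difference 0 there.
    averaged = {w: sum(d.get(w, 0.0) for d in diffs) / n for w in vocab}
    peak = max((abs(v) for v in averaged.values()), default=0.0)
    if peak == 0.0:
        return {w: 0.0 for w in averaged}
    # Assumption: normalize by the largest absolute value (keeps the sign).
    return {w: v / peak for w, v in averaged.items()}


# Toy data with two hypothetical genres; real input would be Korean tokens.
toy_reviews = {
    "drama": [(["great", "story"], 9), (["boring", "story"], 2)],
    "action": [(["great", "action"], 8), (["boring"], 3), (["okay"], 5)],
}
lexicon = build_dictionary(toy_reviews)
print(sorted(lexicon.items()))
```

In this toy run, "great" and "boring" land at the two ends of the scale, "story" is neutral because it appears equally in positive and negative reviews, and "okay" never enters the dictionary because its only review has a rating of 5.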