Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2015
Volume: 6
Issue: 3
Pages: 2923-2927
Publisher: TechScience Publications
Abstract: Social networking sites provide tremendous impetus for Big Data in mining people’s opinions. Public APIs offered by sites such as Twitter provide useful data for examining a writer’s attitude toward a particular topic, product, etc. To discern people’s opinions, tweets are tagged as positive, negative, or neutral. This paper presents an effective mechanism for opinion mining by designing an end-to-end pipeline with the help of Apache Flume, Apache HDFS, Apache Oozie, and Apache Hive. To make this process near real time, we study the workaround of ignoring Flume tmp files and removing the default wait condition from the Oozie job configuration. The underlying architecture employed here is not restricted to opinion mining but has a wide range of applications. This paper explores a few of the use cases that can be developed into actual working models.
Keywords: Opinion Mining; Big Data; Real time Tweet Analysis; Oozie workaround.