Article Information

Title: Detecting Polarizing Language in Twitter using Topic Models and ML Algorithms
Authors: Njagi Dennis Gitari; Zhang Zuping; Wandabwa Herman et al.
Journal: International Journal of Hybrid Information Technology
Print ISSN: 1738-9968
Publication year: 2016
Volume: 9
Issue: 9
Pages: 211-222
Publisher: SERSC
Abstract: The upsurge in the use of social media in public discourse has opened up emerging and interesting areas of research for social scientists. Public debates tend to assume polar positions along political, social, or ideological lines. In such debates, polarity in the language used generally takes the form of blaming the opposing group. In this paper, we investigate the detection of polarizing language in tweets in the event of a disaster. Our approach combines topic modeling and Machine Learning (ML) algorithms to generate topics that we consider to be polarized, and thereby to classify a given tweet as polar or not. Our latent Dirichlet allocation (LDA)-based model incorporates external resources, in the form of a lexicon of blame-oriented words, to induce the generation of polar topics. Collapsed Gibbs sampling is used for inference on new documents and to estimate the values of the parameters employed in our model. For evaluation, we computed the log likelihood (LL) ratios using our model and two other state-of-the-art LDA-based models. Furthermore, we compared the classification accuracy of polarization detection using features extracted from polarized topics, bag of words (BOW) features, and part of speech (POS)-based features. Preliminary experiments returned a higher overall accuracy of 87.67% using topic-based features than using BOW or POS-based features.
Keywords: LDA topic modeling; blame topics; ML algorithms; multilingual sentiments