Article Information

  • Title: Tb-SAC: Topic-based and Sentiment Classification for Saudi Dialects Tweets
  • Authors: Sara Alzahrani; Fatimah Alruwaili; Dimah Alahmadi
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Publication year: 2020
  • Volume: 20
  • Issue: 9
  • Pages: 41-49
  • DOI: 10.22937/IJCSNS.2020.20.09.6
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: Sentiment analysis has recently received considerable attention from researchers in text mining and data analysis. Studies have expanded to cover many languages and sources, producing corpora of varying forms, sizes, and purposes. Locally, much effort has gone into sentiment analysis of Arabic text, in both Modern Standard Arabic (MSA) and vernacular dialects. However, relatively few studies have built topic-based corpora. In this paper, we present Tb-SAC, a corpus extracted from Twitter that focuses on Saudi dialects. The corpus contains 4301 tweets, each labeled by sentiment on a three-point scale: positive, negative, and neutral. The tweets are also classified into five main topics, derived from analyzing a gold set of 200 tweets: Personal, Religion, Coronavirus, Entertainment, and Other (Education, Economy, Sport, Food, Health, Social Media, Distance Working, Technology, Comedy, and Politics). We performed the annotation manually, then applied eleven different classification models and validated the corpus using cross-validation.
  • Keywords: Natural language processing (NLP); Sentiment analysis (SA); Topic-based; Saudi Dialects; Twitter; Annotation