Article Information

  • Title: Offensive language identification in Dravidian code mixed social media text
  • Authors: Sunil Saumya; Abhinav Kumar; Jyoti Prakash Singh
  • Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2021
  • Volume: 2021
  • Pages: 36-45
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Hate speech and offensive language recognition on social media platforms has been an active field of research in recent years. In countries where English is not the native language, social media texts are mostly in code-mixed or script-mixed/switched form. The current study presents extensive experiments using multiple machine learning, deep learning, and transfer learning models to detect offensive content on Twitter. The datasets used in this study are Tanglish (Tamil and English) code-mixed, Manglish (Malayalam and English) code-mixed, and Malayalam script-mixed. The experimental results showed that character TF-IDF features over 1- to 6-grams work better for this task. The best-performing models were naive Bayes, logistic regression, and a vanilla neural network for the Tamil code-mixed, Malayalam code-mixed, and Malayalam script-mixed datasets, respectively, rather than more popular transfer learning models such as BERT and ULMFiT or hybrid deep models.
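
The abstract names character-level TF-IDF over 1- to 6-grams combined with naive Bayes as one of the best-performing setups. The record gives no implementation details, so the following is only a minimal standard-library sketch of that general idea, not the authors' code. The class name `CharTfidfNaiveBayes`, the smoothed IDF variant, the Laplace smoothing constant, and the toy English sentences are all assumptions made for illustration; they are not the paper's data or settings.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=6):
    """Return all character n-grams of length n_min..n_max (lowercased)."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class CharTfidfNaiveBayes:
    """Hypothetical helper: multinomial naive Bayes over char n-gram TF-IDF."""

    def fit(self, texts, labels):
        labels = list(labels)
        docs = [Counter(char_ngrams(t)) for t in texts]
        n_docs = len(docs)
        # Document frequency: number of documents containing each n-gram.
        df = Counter(g for d in docs for g in d)
        # Smoothed IDF (an assumed, commonly used variant).
        self.idf = {g: math.log((1 + n_docs) / (1 + c)) + 1
                    for g, c in df.items()}
        self.classes = sorted(set(labels))
        self.log_prior = {c: math.log(labels.count(c) / n_docs)
                          for c in self.classes}
        # Accumulate TF-IDF mass of every n-gram per class.
        mass = {c: defaultdict(float) for c in self.classes}
        for d, y in zip(docs, labels):
            for g, tf in d.items():
                mass[y][g] += tf * self.idf[g]
        vocab_size = len(self.idf)
        self.log_lik = {}
        for c in self.classes:
            # Laplace smoothing with alpha = 1 (assumed).
            total = sum(mass[c].values()) + vocab_size
            self.log_lik[c] = {g: math.log((mass[c].get(g, 0.0) + 1.0) / total)
                               for g in self.idf}
        return self

    def predict(self, text):
        counts = Counter(char_ngrams(text))
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for g, tf in counts.items():
                if g in self.idf:  # n-grams unseen in training are ignored
                    score += tf * self.idf[g] * self.log_lik[c][g]
            scores[c] = score
        return max(scores, key=scores.get)


# Toy, made-up training examples (not from the paper's datasets).
texts = ["you are an idiot", "what a stupid fool",
         "have a nice day", "great movie, loved it"]
labels = ["offensive", "offensive", "not_offensive", "not_offensive"]

model = CharTfidfNaiveBayes().fit(texts, labels)
print(model.predict("stupid idiot"))
print(model.predict("nice movie"))
```

Character n-grams are a plausible fit for code-mixed text because romanized spellings vary widely, and overlapping substrings still match when whole words do not. In practice a library vectorizer and classifier would replace this hand-rolled version.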