Article information

  • Title: OffTamil@DravideanLangTech-EASL2021: Offensive Language Identification in Tamil Text
  • Authors: Disne Sivalingam; Sajeetha Thavareesan
  • Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2021
  • Volume: 2021
  • Pages: 346-351
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: In the last few decades, code-mixed offensive texts have become pervasive in social media posts. Social media platforms and online communities have shown much interest in offensive text identification in recent years. Consequently, the research community is also interested in identifying such content and has contributed to the development of corpora. Many publicly available corpora exist for research on identifying offensive text written in English, but they are rare for low-resourced languages like Tamil. The first code-mixed offensive text corpus for Dravidian languages was developed by the shared task organizers and is used in this study. This study focuses on offensive language identification in the code-mixed, low-resourced Dravidian language Tamil using four classifiers (Support Vector Machine, random forest, k-Nearest Neighbour and Naive Bayes) with the chi-square (χ²) feature selection technique, along with BoW and TF-IDF feature representations over different combinations of n-grams. The proposed model achieved an accuracy of 76.96% when using a linear SVM with the TF-IDF feature representation.
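
The abstract names chi-square feature selection over n-gram features, but this record does not include the authors' implementation. As a rough illustration only, the sketch below scores word n-grams with the textbook 2x2 chi-square statistic (presence or absence of an n-gram against a binary offensive/not-offensive label) and keeps the highest-scoring ones. The function names (`ngrams`, `chi2_scores`, `select_top_k`), the whitespace tokenizer, and the four English toy sentences are all assumptions made for this example; they are not taken from the paper or its dataset, and the statistic may differ from the variant the authors actually used.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) present in a text."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenizer
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def chi2_scores(texts, labels, n_max=2):
    """Chi-square score of each n-gram's presence against a 0/1 label."""
    docs = [ngrams(t, n_max) for t in texts]
    total = len(docs)
    n_pos = sum(labels)
    # Document frequency of each n-gram overall and within the positive class.
    df_all = Counter(g for d in docs for g in d)
    df_pos = Counter(g for d, y in zip(docs, labels) if y == 1 for g in d)
    scores = {}
    for g, n_g in df_all.items():
        a = df_pos[g]            # positive documents containing g
        b = n_g - a              # negative documents containing g
        c = n_pos - a            # positive documents lacking g
        d = total - n_pos - b    # negative documents lacking g
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means g (or the label) is constant: no signal.
        scores[g] = total * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring n-grams; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Hypothetical toy data: 1 = offensive, 0 = not offensive.
texts = ["you are stupid", "have a nice day", "stupid idea", "nice work today"]
labels = [1, 0, 1, 0]

scores = chi2_scores(texts, labels)
top = select_top_k(scores, 2)
print(top)
```

In a full pipeline of the kind the abstract describes, the selected n-grams would then be weighted (for example with TF-IDF) and passed to a classifier such as a linear SVM; that step is omitted here to keep the sketch dependency-free.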