Article information

  • Title: IRNLP_DAIICT@DravidianLangTech-EACL2021: Offensive Language identification in Dravidian Languages using TF-IDF Char N-grams and MuRIL
  • Authors: Bhargav Dave; Shripad Bhat; Prasenjit Majumder
  • Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Year: 2021
  • Volume: 2021
  • Pages: 266-269
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: This paper presents the participation of the IRNLP_DAIICT team from the Information Retrieval and Natural Language Processing lab at DA-IICT, India, in the DravidianLangTech-EACL2021 shared task on Offensive Language Identification in Dravidian Languages. The aim of this shared task is to identify offensive language in a code-mixed dataset of YouTube comments. The task is to classify comments into Not Offensive (NO), Offensive Untargeted (OU), Offensive Targeted Individual (OTI), Offensive Targeted Group (OTG), Offensive Targeted Others (OTO), and Other Language (OL) for three Dravidian languages: Kannada, Malayalam and Tamil. We use TF-IDF character n-grams and pretrained MuRIL embeddings for text representation, and Logistic Regression and Linear SVM for classification. Our best approach ranked ninth, third and eighth, with weighted F1 scores of 0.64, 0.95 and 0.71 on the Kannada, Malayalam and Tamil test sets, respectively.
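
The two quantities the abstract relies on, TF-IDF weights over character n-grams and the weighted F1 score, can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the authors' implementation: the n-gram range (1 to 3), the smoothed IDF form, the L2 normalization, and the function names `char_ngrams`, `tfidf_vectors` and `weighted_f1` are all hypothetical choices that the abstract does not specify.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return every character n-gram of length n_min..n_max (range is an assumption)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n_min=1, n_max=3):
    """Map each document to a sparse, L2-normalized TF-IDF dict over char n-grams."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(gram for c in counts for gram in c)
    # Smoothed IDF, a common convention (assumed here): ln((1 + N) / (1 + df)) + 1.
    idf = {g: math.log((1 + n_docs) / (1 + d)) + 1.0 for g, d in df.items()}
    vectors = []
    for c in counts:
        vec = {g: tf * idf[g] for g, tf in c.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({g: v / norm for g, v in vec.items()})
    return vectors


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of true labels."""
    total = len(y_true)
    score = 0.0
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (tp + fn) / total * f1  # (tp + fn) is the class support
    return score


if __name__ == "__main__":
    # Toy inputs, not taken from the shared-task data.
    vecs = tfidf_vectors(["nalla padam", "mosam padam"])
    print(len(vecs), "vectors;", len(vecs[0]), "n-grams in the first")
    print(weighted_f1(["NO", "NO", "OU", "OTG"], ["NO", "OU", "OU", "OTG"]))
```

Character n-grams are a natural fit for code-mixed comments because they need no tokenizer and tolerate inconsistent transliteration; the resulting sparse vectors would then be passed to a linear classifier such as Logistic Regression or a Linear SVM, as the abstract describes.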