Article Information

Title: JUNLP@DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
Authors: Avishek Garain; Atanu Mandal; Sudip Kumar Naskar et al.
Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Year of publication: 2021
Volume: 2021
Pages: 319-322
Language: English
Publisher: ACL Anthology
Abstract: Offensive language identification has been an active area of research in natural language processing. With the emergence of multiple social media platforms, offensive language identification has become a pressing need. Traditional offensive language identification models fail to deliver acceptable results because social media content is largely multilingual and code-mixed in nature. This paper addresses the problem by using IndicBERT and BERT architectures to identify offensive language in Kannada-English, Malayalam-English, and Tamil-English code-mixed language pairs extracted from social media. When evaluated on the test corpus, the presented approach achieved precision, recall, and F1 scores of 0.62, 0.71, and 0.66 for Kannada-English; 0.77, 0.43, and 0.53 for Malayalam-English; and 0.71, 0.74, and 0.72 for Tamil-English.