期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度:2021
卷号:2021
页码:323-329
语种:English
出版社:ACL Anthology
摘要:This paper describes the models submitted by the team MUCS for Offensive Language Identification in Dravidian Languages-EACL 2021 shared task that aims at identifying and classifying code-mixed texts of three language pairs namely, Kannada-English (Kn-En), Malayalam-English (Ma-En), and Tamil-English (Ta-En) into six predefined categories (5 categories in Ma-En language pair). Two models, namely, COOLI-Ensemble and COOLI-Keras are trained with the char sequences extracted from the sentences combined with words as features. Out of the two proposed models, COOLI-Ensemble model (best among our models) obtained first rank for Ma-En language pair with 0.97 weighted F1-score and fourth and sixth ranks with 0.75 and 0.69 weighted F1-score for Ta-En and Kn-En language pairs respectively.