Article information

  • Title: Finite-state script normalization and processing utilities: The Nisaba Brahmic library
  • Authors: Cibu Johny; Lawrence Wolf-Sonkin; Alexander Gutkin
  • Journal: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2021
  • Volume: 2021
  • Pages: 14-23
  • DOI: 10.18653/v1/2021.eacl-demos.3
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: This paper presents an open-source library for efficient low-level processing of ten major South Asian Brahmic scripts. The library provides a flexible and extensible framework for supporting crucial operations on Brahmic scripts, such as NFC, visual normalization, reversible transliteration, and validity checks, implemented in Python within a finite-state transducer formalism. We survey some common Brahmic script issues that may adversely affect the performance of downstream NLP tasks, and provide the rationale for finite-state design and system implementation details.