Basic article information

Title: Creating and Weighting Hunspell Dictionaries as Finite-State Automata
Authors: Tommi Pirinen; Krister Lindén
Journal: Investigationes Linguisticae (Online)
Print ISSN: 1426-188X
Electronic ISSN: 1733-1757
Publication year: 2017
Volume: 21
Pages: 1
DOI: 10.14746/il.2010.21.1
Publisher: Adam Mickiewicz University
Abstract: There are numerous formats for writing spell-checkers for open-source systems and there are many lexical descriptions for natural languages written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spell-checking suggestion mechanism using weighted finite-state technology. What we propose is a generic and efficient language-independent framework of weighted finite-state automata for spell checking in typical open-source software, e.g. Mozilla Firefox, OpenOffice and the Gnome desktop.