Article Information

  • Title: Retrieval of Spelling Variants in Nonstandard Texts – Automated Support and Visualization
  • Authors: Thomas Pilz; Wolfram Luther; Ulrich Ammon
  • Journal: SKY Journal of Linguistics
  • Print ISSN: 1456-8438
  • Electronic ISSN: 1796-279X
  • Publication year: 2008
  • Volume: 21
  • Publisher: The Linguistic Association of Finland
  • Abstract: This article describes ongoing research in the RSNSR (Regelbasierte Suche in Textdatenbanken mit nichtstandardisierter Rechtschreibung, “Rule-based search in text databases with nonstandard orthography”) project. The focus of this project is making historical text documents digitally available; consequently, it examines the challenges for digitization procedures and subsequent retrieval operations, like fuzzy full-text search. Difficulties are posed by scans of low-quality facsimiles, old font types, inconsistent transcriptions and especially typical optical character recognition (OCR) errors and spelling variation. This article discusses recent solutions to such problems, concentrating on stochastic string edit distance measures, so-called evidences and the avoidance of static dictionaries. By presenting visualization approaches for retrieval in and browsing of historical databases and nonstandard text documents, as well as a prototype for visual evaluation of distance measures, it proposes a progression of information visualization in linguistics.