Basic Article Information

  • Title: LUCIDAH Ligative and Unligative Characters in a Dataset for Arabic Handwriting
  • Authors: Yousef Elarian; Irfan Ahmad; Abdelmalek Zidouri
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2019
  • Volume: 10
  • Issue: 8
  • Pages: 406-415
  • Publisher: Science and Information Society (SAI)
  • Abstract: Arabic script is inherently cursive, even when machine-printed. When connected to other characters, some Arabic characters may be optionally written in compact aesthetic forms known as ligatures. It is useful to distinguish ligatures from ordinary characters for several applications, especially automatic text recognition. Datasets that do not annotate these ligatures may confuse the recognition system training. Some popular datasets manually annotate ligatures, but no dataset (prior to this work) took ligatures into consideration from the design phase. In this paper, a detailed study of Arabic ligatures and a design for a dataset that considers the representation of ligative and unligative characters are presented. Then, pilot data collection and recognition experiments are conducted on the presented dataset and on another popular dataset of handwritten Arabic words. These experiments show the benefit of annotating ligatures in datasets by reducing error-rates in character recognition tasks.
  • Keywords: Arabic ligatures; automatic text recognition; handwriting datasets; Hidden Markov Models