
Article Information

  • Title: Lexicon-Driven Word Recognition Based on Levenshtein Distance
  • Authors: Perdana Adhitama; Soo Hyung Kim; In Seop Na
  • Journal: International Journal of Software Engineering and Its Applications
  • Print ISSN: 1738-9984
  • Year: 2014
  • Volume: 8
  • Issue: 2
  • Pages: 11-20
  • DOI: 10.14257/ijseia.2014.8.2.02
  • Publisher: SERSC
  • Abstract: In this paper, we propose a word recognition method for printed Arabic word images using an HMM and Levenshtein distance. Existing algorithms for Arabic text recognition have difficulty handling various fonts and sizes, because Arabic script is cursive and each character may take up to four different shapes depending on its position in a word. Our method begins by segmenting a word into characters. Each character is then recognized individually by an HMM classifier. Since HMM recognition alone is not accurate enough, we apply Levenshtein distance to correct misclassified and missegmented characters in a word: the recognized word is compared with every word in a dictionary. We tested the proposed system on the APTI dataset, and it achieved average recognition rates of more than 95% for six different fonts.
  • Keywords: Hidden Markov Model; Printed Arabic Word Recognition; OCR; Levenshtein Distance; Segmentation
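
The lexicon-based correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `levenshtein` and `closest_word`, the unit edit costs, the first-match tie-breaking, and the transliterated toy lexicon are all assumptions made for illustration. The sketch computes the standard edit distance between a recognized string and each dictionary entry, then returns the nearest entry.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if ca == cb else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def closest_word(recognized: str, lexicon: Iterable[str]) -> str:
    """Return the lexicon entry nearest to the recognized string.

    Ties go to the first such entry in iteration order (an assumption;
    the source does not say how ties are handled).
    """
    return min(lexicon, key=lambda word: levenshtein(recognized, word))


if __name__ == "__main__":
    # Hypothetical transliterated lexicon, for illustration only.
    toy_lexicon = ["kitab", "katib", "maktab"]
    # "kitb" stands in for a classifier output with one missing character.
    print(closest_word("kitb", toy_lexicon))
```

A linear scan over the dictionary, as here, matches the abstract's description of comparing the recognized word with every dictionary word. For a large lexicon an index or a distance cutoff would likely be needed, but the source does not discuss that.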