Journal: International Journal of Software Engineering and Its Applications
Print ISSN: 1738-9984
Year: 2014
Volume: 8
Issue: 2
Pages: 11-20
DOI: 10.14257/ijseia.2014.8.2.02
Publisher: SERSC
Abstract: In this paper, we propose a word recognition method for printed Arabic word images using a hidden Markov model (HMM) and Levenshtein distance. Existing algorithms have difficulty handling the variety of fonts and sizes found in Arabic text. This is because Arabic script is cursive and each character may take up to four different shapes depending on its position in a word. Our method begins by segmenting a word into characters. Each character is then recognized individually by an HMM classifier. Since HMM recognition alone is not accurate enough, we apply Levenshtein distance to correct misclassified and mis-segmented characters within a word. The Levenshtein distance is computed between the recognized word and every word in a dictionary. We tested the proposed system on the APTI dataset and achieved average recognition rates of more than 95% for six different fonts.
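
The dictionary-based correction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the HMM stage outputs one label per segmented character, joined into a string, and that the corrected word is simply the dictionary entry with the smallest edit distance. The function names `levenshtein` and `correct_word` and the tie-breaking rule (the first closest entry wins) are hypothetical choices made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def correct_word(recognized: str, dictionary: list[str]) -> str:
    """Return the dictionary word closest to the recognized string.

    Assumption: ties are broken by dictionary order (min keeps the
    first minimal entry); the paper may resolve ties differently.
    """
    return min(dictionary, key=lambda word: levenshtein(recognized, word))
```

In this sketch a substitution models a misclassified character, while an insertion or deletion models a mis-segmented one (a character split in two, or two characters merged). For example, `correct_word("recogmition", ["recognition", "segmentation", "dictionary"])` selects `"recognition"`, which is one substitution away. Python compares strings by Unicode code point, so Arabic input would need a consistent representation of character forms before comparison.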