Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2017
Volume: 95
Issue: 19
Page: 5047
Publisher: Journal of Theoretical and Applied
Abstract: Because Arabic text appears in a large volume of generated and historical documents, computers are needed to make such text editable; this is the main task of Arabic Optical Character Recognition (OCR) systems. Automatically converting documents to typed text with close-to-human performance remains an open research problem. In this paper, we propose an Arabic OCR system based on the Dynamic Time Warping (DTW) algorithm that is designed to recognize Arabic words properly. Rather than using the usual practice of character segmentation, this paper proposes a segmentation of Arabic texts into lines and characters. The proposed Arabic OCR algorithm overlaps the segmentation and recognition processes (an online segmentation-recognition approach) in order to overcome the challenges of segmenting highly cursive Arabic text into isolated characters. The accuracy of the proposed algorithm is tested on randomly selected articles from Jordanian newspapers. The results demonstrate the robustness of the proposed algorithm, which achieves 96.2% character recognition accuracy in the worst case.
Keywords: Object Character Recognition; Dynamic Time Warping; Online Arabic OCR; Typed Arabic OCR
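
For readers unfamiliar with the Dynamic Time Warping algorithm named in the abstract, the following is a minimal sketch of the classic textbook DTW distance together with nearest-template matching. It is not the paper's implementation: the record does not describe the features, templates, or segmentation logic used, so the function names `dtw_distance` and `classify`, the one-dimensional feature sequences (for example, per-column ink counts of a glyph image), and the template labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Return the label of the template with the smallest DTW distance to sample."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


if __name__ == "__main__":
    # Hypothetical feature sequences, assumed here to be per-column ink counts.
    templates = {"alef": [0, 5, 5, 0], "ba": [3, 3, 1, 3]}
    stretched = [0, 5, 5, 5, 0]  # a wider rendering of the "alef" template
    print(classify(stretched, templates))
```

Because DTW allows one sequence to be stretched or compressed against the other, a glyph rendered wider or narrower than its template can still match it closely, which is one reason the technique is attractive for cursive scripts where character widths vary.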