期刊名称:International Journal of Advanced Computer Science and Applications(IJACSA)
印刷版ISSN:2158-107X
电子版ISSN:2156-5570
出版年度:2016
卷号:7
期号:1
DOI:10.14569/IJACSA.2016.070164
出版社:Science and Information Society (SAI)
摘要:India is a multilingual country with 22 official languages and more than 1600 languages in existence. Kannada is one of the official languages and widely used in the state of Karnataka whose population is over 65 million. Kannada is one of the south Indian languages and it stands in the 33rd position among the list of widely spoken languages across the world. However, the survey reveals that much more effort is required to develop a complete Optical Character Recognition (OCR) system. In this direction the present research work throws light on the development of suitable methodology to achieve the goal of developing an OCR. It is noted that the overall accuracy of the OCR system largely depends on the accuracy of the segmentation phase. So it is desirable to have a robust and efficient segmentation method. In this paper, a method has been proposed for proper segmentation of the text to improve the performance of OCR at the later stages. In the proposed method, the segmentation has been done using horizontal projection profile and windowing. The result obtained is passed to the recognition module. The Histogram of Oriented Gradient (HoG) is used for the recognition in combination with the support vector machine (SVM). The result is taken as the feedback and fed to the segmentation module to improve the accuracy. The experimentation is delivered promising results.
关键词:thesai; IJACSA; thesai.org; journal; IJACSA papers; Optical character recognition; Histogram of oriented gradients; relevance feedback; segmentation; Support Vector Machine; handwritten Kannada documents