Journal title: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 6
Publisher: S.S. Mishra
Abstract: Optical Character Recognition (OCR) is the process of converting an image of text into a machine-editable format. This paper proposes an OCR system for complex printed Kannada characters. The input to the system is a scanned image of a page of text containing complex Kannada characters, and the output is a machine-editable file. The system first pre-processes the input document and converts it into binary form. It then extracts the lines from the document image and segments them into character and sub-character level pieces. The histogram technique and the connected component method are used for character segmentation, and the correlation method is used to recognize the characters. Sample characters are first collected, pre-processed, and stored in a file. The input image is segmented into character-level pieces, and each piece is compared with the stored sample characters. The comparison returns the corresponding target ID, and each target ID has a corresponding character class name. The class name is then displayed in machine-editable format.
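The pipeline described in the abstract (binarization, histogram-based segmentation, correlation against stored samples) could be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function names, the fixed threshold of 128, the nearest-neighbour resizing, and the use of the Pearson correlation coefficient are all assumptions, and the connected component step for sub-character pieces is omitted.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Return a boolean image where True marks ink (assumed: dark text on a light page)."""
    return np.asarray(gray) < threshold

def runs(profile):
    """Return (start, end) pairs, end exclusive, for each run of non-zero profile entries."""
    on = np.concatenate(([0], (np.asarray(profile) > 0).astype(int), [0]))
    edges = np.diff(on)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

def segment(binary):
    """Split a binary page into (top, bottom, left, right) boxes using projection histograms."""
    boxes = []
    for top, bottom in runs(binary.sum(axis=1)):      # row histogram separates text lines
        line = binary[top:bottom]
        for left, right in runs(line.sum(axis=0)):    # column histogram separates characters
            boxes.append((top, bottom, left, right))
    return boxes

def resize(img, shape):
    """Nearest-neighbour resize so a segmented piece matches the sample size."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]

def correlation(a, b):
    """Pearson correlation coefficient between two equally sized images."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def recognize(piece, samples, class_names):
    """Return (target ID, class name) of the stored sample that correlates best with the piece."""
    scores = {tid: correlation(resize(piece, s.shape), s) for tid, s in samples.items()}
    best_id = max(scores, key=scores.get)
    return best_id, class_names[best_id]
```

In this sketch `samples` maps a hypothetical target ID to a pre-processed binary sample image, and `class_names` maps the same ID to the text that would be written to the output file. Projection histograms alone only separate pieces divided by blank rows or columns, which is presumably why the paper pairs them with connected component analysis for the sub-character pieces of Kannada script.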