Journal: International Journal of Advanced Computer Research
Print ISSN: 2249-7277
Electronic ISSN: 2277-7970
Year of publication: 2013
Volume: 2013
Publisher: Association of Computer Communication Education for National Triumph (ACCENT)
Abstract: A handwritten character is represented as a sequence of strokes whose features are extracted and classified. Although off-line and on-line character recognition techniques take different approaches, they share many common problems and solutions. Printed documents available as books, papers, magazines, etc. are scanned using standard scanners, which produce an image of the scanned document. The preprocessed image is segmented by an algorithm that decomposes the scanned text into paragraphs using a special space detection technique, then paragraphs into lines using vertical histograms, lines into words using horizontal histograms, and words into character image glyphs using horizontal histograms. Each image glyph comprises 24x24 pixels. The segmentation phase thus yields a database of character image glyphs. The features considered for classification are the character height, the character width, the number of horizontal lines (long and short), the image centroid, and special dots. The extracted features are passed to a Support Vector Machine (SVM), where the characters are classified by a supervised learning algorithm. These classes are then mapped for recognition, and the text is reconstructed using fonts.
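The histogram-based segmentation and the simpler features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the binary image representation (1 = ink, 0 = background), and the rule of splitting wherever a histogram bin is empty are all assumptions made for illustration. Naming conventions for "horizontal" and "vertical" histograms vary, so the code simply uses row sums to separate text lines and column sums to separate glyphs. Paragraph and word splitting (which need a gap-width threshold), resizing to 24x24 pixels, and the SVM classifier are left out.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed representation: 1 = ink, 0 = background


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the histogram is non-zero."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # an empty bin ends the run
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def row_profile(img: Image) -> List[int]:
    """Ink count per row."""
    return [sum(row) for row in img]


def col_profile(img: Image) -> List[int]:
    """Ink count per column."""
    return [sum(col) for col in zip(*img)]


def split_lines(page: Image) -> List[Image]:
    """Cut the page into text lines at rows that contain no ink."""
    return [page[top:bottom] for top, bottom in runs(row_profile(page))]


def split_glyphs(line: Image) -> List[Image]:
    """Cut a text line into glyphs at columns that contain no ink."""
    return [[row[left:right] for row in line]
            for left, right in runs(col_profile(line))]


def glyph_features(glyph: Image) -> Tuple[int, int, float, float]:
    """Return (height, width, centroid_row, centroid_col) of the ink pixels."""
    ink = [(r, c) for r, row in enumerate(glyph)
           for c, v in enumerate(row) if v]
    if not ink:
        return (0, 0, 0.0, 0.0)
    rows = [r for r, _ in ink]
    cols = [c for _, c in ink]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return (height, width, sum(rows) / len(ink), sum(cols) / len(ink))
```

Feature tuples like these, extended with the stroke and dot counts the abstract mentions, would form the input vectors for a supervised classifier such as an SVM.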
Keywords: OCR; Features; Support Vector Machine (SVM); Artificial Neural Networks; Handwritten Character; Recognition; Stroke; Printed Characters