Article Information

Title: Handwritten Numeral/Mixed Numerals Recognition of South-Indian Scripts: The Zone-Based Feature Extraction Method
Authors: S.V. Rajashekararadhya; Dr P. Vanaja Ranjan
Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2009
Volume: 7
Issue: 01
Publisher: Journal of Theoretical and Applied
Abstract:
Handwriting recognition has always been a challenging task in image processing and pattern recognition. There are five major stages in the handwritten character recognition problem: image preprocessing, segmentation, feature extraction, training and recognition, and post-processing. India is a multi-lingual, multi-script country with eighteen accepted official scripts and over a hundred regional languages. In this paper we propose a zone-based feature extraction scheme for the recognition of off-line handwritten numerals of four popular South Indian scripts. The character centroid is computed and the 50x50 character/numeral image is divided into 25 equal zones of 10x10 pixels each. Within each zone, the average distance from the character centroid to the pixels present in a zone column is computed, and this is repeated sequentially for every column in the zone (10 features). Similarly, the average distance from the character centroid to the pixels present in a zone row is computed and repeated for every row in the zone (10 features). If a zone column or row contains no foreground pixels, its feature value in the feature vector is zero. This procedure is repeated sequentially for all 25 zones in the numeral image, so that 500 features (25 zones x 20 features per zone) are extracted for classification and recognition. Nearest neighbor, feed-forward back-propagation neural network, and support vector machine classifiers are used for the subsequent classification and recognition. Using the support vector machine, we obtained recognition rates of 98.65% for Kannada numerals, 96.1% for Tamil numerals, 98.6% for Telugu numerals, and 96.5% for Malayalam numerals.
Keywords: Handwriting Recognition; Extraction Algorithms; Indian Scripts
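
The zone-based feature extraction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `zone_features`, the use of NumPy, the choice of Euclidean distance, and the ordering of features (column features before row features, zones visited in row-major order) are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def zone_features(img, zone=10):
    """Sketch of zone-based centroid-distance features.

    img  : 2-D array, nonzero = foreground; both sides divisible by `zone`.
    zone : side length of each square zone (10 for a 50x50 image).
    Returns a 1-D vector with 2 * zone features per zone.
    """
    fg = np.asarray(img) > 0
    n_zones = (fg.shape[0] // zone) * (fg.shape[1] // zone)
    rows, cols = np.nonzero(fg)
    if rows.size == 0:
        # No foreground at all: every feature is zero.
        return np.zeros(n_zones * 2 * zone)

    # Character centroid = mean position of the foreground pixels.
    cy, cx = rows.mean(), cols.mean()
    yy, xx = np.indices(fg.shape)
    # Assumption: "distance" is taken to be Euclidean.
    dist = np.hypot(yy - cy, xx - cx)

    feats = []
    for r0 in range(0, fg.shape[0], zone):
        for c0 in range(0, fg.shape[1], zone):
            mask = fg[r0:r0 + zone, c0:c0 + zone]
            d = dist[r0:r0 + zone, c0:c0 + zone] * mask
            col_cnt = mask.sum(axis=0)
            row_cnt = mask.sum(axis=1)
            # Average distance per zone column / row; empty ones stay 0.
            col_avg = np.divide(d.sum(axis=0), col_cnt,
                                out=np.zeros(zone), where=col_cnt > 0)
            row_avg = np.divide(d.sum(axis=1), row_cnt,
                                out=np.zeros(zone), where=row_cnt > 0)
            feats.extend(col_avg)
            feats.extend(row_avg)
    return np.asarray(feats)
```

For a 50x50 binary numeral image, `zone_features(img)` returns a vector of length 500 (25 zones, each contributing 10 column and 10 row features), which could then be passed to any of the classifiers named in the abstract.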