Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year of publication: 2013
Volume: 52
Issue: 2
Publisher: Journal of Theoretical and Applied
Abstract: Optical character recognition (OCR) is one of the most important fields in pattern recognition; it is able to recognize handwritten, irregular and machine-printed characters. An OCR system consists of five major tasks: pre-processing, segmentation, feature extraction, classification and recognition. In global feature approaches, less discriminative features generally lead to a lower recognition rate. This problem is addressed by proposing a global approach that produces more discriminative features with lower data dimensionality. Two feature extraction methods are studied, namely the Gray Level Co-occurrence Matrix (GLCM) and the edge direction matrix (EDMS), and a combination of these two popular methods is proposed. The main problem of EDMS is that it produces only 18 features, which is not enough for feature extraction and reduces the recognition rate. The aim of this research is to improve the recognition rate of EDMS by combining it with a global feature extraction method in order to increase the number of extracted features. The proposed method is a combination of GLCM and EDMS, evaluated with and without a feature selection step (gain ratio with ranker search) that is applied to reduce the dimensionality of the data. The methods were tested on four datasets of manually and automatically cropped licence plate and font style images, amounting to 3520 images of the digits 0 to 9 and the capital letters A to Z in various sizes and shapes. Another dataset consists of large binary shape images, comprising 300 images of objects. Gain ratio and ranker search are then used to select discriminative features, reducing the number of features from 58 to 34. The proposed combined method, EDMS and GLCM are classified using a neural network, a Bayes network and a decision tree.
The experimental results for character recognition indicate that the proposed combined method obtains a better average accuracy of about 85.99%, whereas EDMS, GLCM and the combination without feature selection achieved 80.19%, 38.84% and 58.78%, respectively. For object recognition, the proposed method before and after feature selection outperformed the other methods, EDMS and GLCM, with accuracy rates of 90.83% and 92.5%, respectively, using a neural network (NN) as the classifier. Global and spatial approaches are then compared for the recognition of objects and characters. The results show that the proposed method performs better as a global feature extraction method for object recognition, with 92.5% accuracy using NN, while the Robinson filter, a spatial feature extraction method, outperformed the global feature extraction methods for character recognition, with 100% accuracy using NN. The global approach also requires less processing time than the spatial approach.
Keywords: Feature extraction; Character Recognition; Image Processing; OCR; Global Feature Extraction
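
The feature selection step named in the abstract (gain ratio scoring followed by a ranker that keeps the top-scoring features) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: it assumes the feature values are already discrete (the abstract does not say how continuous GLCM or EDMS values were handled), the function names `entropy`, `gain_ratio` and `rank_features` are hypothetical, and the toy data is invented purely to show the ranking.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of a feature about the labels, divided by the
    entropy of the feature itself (its split information)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels, keep):
    """Score every feature column and return the indices of the `keep`
    best ones (highest gain ratio first) together with all scores."""
    scores = [gain_ratio(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return order[:keep], scores


# Invented toy data: three discrete features for four labelled samples.
labels = ["A", "A", "B", "B"]
columns = [
    [0, 0, 1, 1],  # separates the two classes perfectly
    [0, 1, 0, 1],  # carries no class information
    [0, 0, 0, 1],  # partially informative
]
selected, scores = rank_features(columns, labels, keep=2)
print(selected)
```

In the setting described by the abstract, `columns` would hold the 58 combined GLCM and EDMS features and `keep` would be 34; the cut-off used here is only an assumption for the toy example.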