Publisher: International Institute for Science, Technology and Education
Abstract: Automated recognition of handwritten digits has applications in several industries, such as postal services and banking, for reading addressed packages and cheques respectively. This paper compares four machine learning classifiers, namely Naive Bayes, Instance-Based Learner, Decision Tree and Neural Network, for single-digit recognition. Our experiments were conducted using the WEKA machine learning tool on two datasets: the MNIST offline handwritten digits and a collection of online ISGL handwritten digits acquired with a pen digitiser. The experiments were designed to allow comparison within each dataset using cross-validation, and across datasets, where the online dataset is used for training and the offline dataset for testing, and vice versa. We also compared classification accuracy at different levels of downsampling. Results indicate that the lazy-learning instance-based classifier performed slightly better than the neural network, with a maximal accuracy of 97.86%, and that both outperformed the other two classifiers, Naive Bayes and Decision Tree. The decision tree gave the worst performance of the four classifiers. We also found that the online digits gave better results in the cross-validation experiments. However, in the cross-dataset experiments, training on the pre-processed MNIST offline digits and testing on the online ISGL digits gave higher accuracies than the reverse. In addition, a downsampled size of 14x14 gave the best results for most of the four classifiers, although these were not significantly different from the results for the other sizes of 7x7, 21x21 and 28x28. We intend to investigate the performance of these classifiers in recognising other characters (letters, punctuation and other symbols), and to extend the recognition task to other levels of text granularity such as words, sentences and paragraphs.
Keywords: Digits recognition; machine learning; classifiers; handwritten character recognition; Weka