Journal title: Advanced Computing : an International Journal
Print ISSN: 2229-726X
Electronic ISSN: 2229-6727
Publication year: 2011
Volume: 2
Issue: 2
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Problem statement: Historical documents such as old books and manuscripts have high aesthetic value and are highly appreciated. Unfortunately, some documents cannot be read because of quality problems such as faded paper, ink spread, uneven color tone, torn paper, and other disruptions such as small spots. The study aims to produce copies of manuscripts that show clear wording, so that they can be read easily and the copies can also be displayed for visitors. Approach: Sixteen samples of Jawi historical manuscripts with different quality problems were obtained from The Royal Museum of Pahang, Malaysia. We applied three binarization techniques: Otsu’s method, which represents the global thresholding technique, and Sauvola’s and Niblack’s methods, which are categorized as local thresholding techniques. A preprocessing step involving histogram equalization, morphological functions, and filtering was also applied. Finally, we compared the binarized images with the original manuscripts through visual inspection by the museum’s curator. The unclear features were marked and analyzed. Results: Most of the examined images show that, with optimal parameters and effective preprocessing, the local thresholding methods work well compared with the global one. Although the global thresholding method costs less computational time, its results were not satisfactory. Niblack’s and Sauvola’s techniques seem to be suitable approaches for these types of images. Most images binarized with these two methods show improved readability and character recognition. In this research, although the differences between the resulting images were hard to distinguish by eye, after comparing the time cost and the overall rate of recognized symbols, Niblack’s method performed better than Sauvola’s.
Conclusion: No single algorithm works well for all types of images, but some work better than others for particular types of images, suggesting that improved performance can be obtained by automatic selection or combination of appropriate algorithms for the type of document image under investigation. The post-processing step could be improved by adding edge detection techniques and further enhanced by an innovative image refinement technique and a formulation of a class proper method.
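Illustrative sketch: the three thresholding rules named in the abstract can be written compactly with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the 15-pixel window, the values k = -0.2 (Niblack) and k = 0.5, R = 128 (Sauvola), and the synthetic test page are all assumptions chosen for illustration; the abstract does not report the parameters actually used, and the preprocessing step is omitted.

```python
import numpy as np

def otsu_threshold(gray):
    """Global threshold that maximizes the between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of levels <= t
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                        # skip levels with an empty class
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

def local_mean_std(gray, window):
    """Per-pixel mean and standard deviation over an odd window (edge-padded)."""
    pad = window // 2
    padded = np.pad(gray.astype(float), pad, mode="edge")
    views = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return views.mean(axis=(2, 3)), views.std(axis=(2, 3))

def niblack_threshold(gray, window=15, k=-0.2):
    """Local threshold T = m + k * s (Niblack); window and k are assumed values."""
    mean, std = local_mean_std(gray, window)
    return mean + k * std

def sauvola_threshold(gray, window=15, k=0.5, r=128.0):
    """Local threshold T = m * (1 + k * (s / R - 1)) (Sauvola); assumed values."""
    mean, std = local_mean_std(gray, window)
    return mean * (1.0 + k * (std / r - 1.0))

# Hypothetical synthetic "page": light background (200), dark vertical stroke (50).
page = np.full((40, 40), 200, dtype=np.uint8)
page[:, 18:21] = 50

# A pixel counts as ink when it is not brighter than its threshold.
ink_otsu = page <= otsu_threshold(page)
ink_niblack = page <= niblack_threshold(page)
ink_sauvola = page <= sauvola_threshold(page)
```

Otsu's rule computes one threshold for the whole image, whereas the Niblack and Sauvola rules compute a threshold per pixel from the local mean and standard deviation, which is consistent with the higher computational cost the abstract attributes to the local methods. Niblack's rule is known to produce noise in flat background regions where the local standard deviation is near zero, so it is usually combined with filtering of the kind the abstract mentions.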