Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Year of publication: 2013
Volume: 4
Issue: 6-1
Pages: 114-119
Publisher: Seventh Sense Research Group
Abstract: Many approaches to script recognition are currently available in OCR systems. In this report we address script identification in multilingual documents. The proposed script identification system considers four Indian languages: Hindi (Devanagari), Bangla, Telugu, and Kannada. The system is intended to let document images be scanned with higher accuracy. In this context, we model script identification for multilingual documents using horizontal projection profile analysis with headline features. A database of 450 text words each of Hindi, Bangla, Telugu, and Kannada is used for experimentation. The proposed system yields 97.83% accuracy on the four specified languages. Script identification plays an important role in analyzing printed documents.
Keywords: OCR; Multi-script recognition; Binarization; Line Segmentation; Horizontal projection profile
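
Illustrative note: the abstract names horizontal projection profile analysis with headline features but gives no implementation details. The following is a minimal sketch of how such a feature might be computed, not the authors' method. It assumes a binarized word image stored as a list of rows with 1 for ink and 0 for background. The function names and the thresholds `min_fill` and `upper_fraction` are hypothetical. The sketch relies on the fact that Devanagari and Bangla words typically carry a continuous headline across the top, while Telugu and Kannada words do not.

```python
def horizontal_projection_profile(binary_word):
    """Return the number of ink pixels (value 1) in each row of the image."""
    return [sum(row) for row in binary_word]


def has_headline(binary_word, min_fill=0.8, upper_fraction=0.5):
    """Heuristic headline test (thresholds are illustrative assumptions).

    A word is flagged as having a headline when the densest row lies in
    the upper part of the image and covers most of the image width.
    """
    profile = horizontal_projection_profile(binary_word)
    height = len(binary_word)
    width = len(binary_word[0])
    # Index of the row with the most ink; the first such row wins on ties.
    peak_row = max(range(height), key=lambda r: profile[r])
    in_upper_part = peak_row < upper_fraction * height
    spans_width = profile[peak_row] >= min_fill * width
    return in_upper_part and spans_width


# Synthetic 10x20 word with a full-width stroke on row 2 and two stems.
with_headline = [[0] * 20 for _ in range(10)]
with_headline[2] = [1] * 20
for r in range(2, 9):
    with_headline[r][5] = 1
    with_headline[r][14] = 1

# Synthetic 10x20 word made of two separate blobs and no headline.
without_headline = [[0] * 20 for _ in range(10)]
for r in range(3, 8):
    for c in list(range(4, 8)) + list(range(12, 16)):
        without_headline[r][c] = 1

print(has_headline(with_headline))
print(has_headline(without_headline))
```

A test like this would only separate headline scripts from non-headline scripts. Telling Hindi from Bangla, or Telugu from Kannada, would need further features that the abstract does not describe.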