
Article Information

  • Title: IMPROVING OCR BY EFFECTIVE PRE-PROCESSING AND SEGMENTATION FOR DEVANAGIRI SCRIPT: A QUANTIFIED STUDY
  • Authors: Dr. DEEPA GUPTA ; LEEMA MADHU NAIR
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2013
  • Volume: 52
  • Issue: 2
  • Publisher: Journal of Theoretical and Applied
  • Abstract: An Optical Character Recognition (OCR) system aims to convert an optically scanned text image into machine-editable text. Multiple approaches to pre-processing and segmentation exist for various scripts. However, only a restricted set of combinations of these approaches has been tried on Devanagari script. This paper presents a study that explores an alternative, efficient strategy for pre-processing and segmentation in OCR for Devanagari script. The efficiency of the proposed alternative has been evaluated on documents with varying degrees of noise severity and border artifacts. The experimental results confirm that the proposed approach is superior to other conventional methodologies for implementing OCR systems for Devanagari script. The paper also describes in detail the conventional pre-processing involved in the initial stage of OCR, including noise removal techniques, along with other conventional approaches to segmentation. The proposed alternative has been deployed to reach the character and top character segmentation level.
  • Keywords: Optical Character Recognition (OCR); Pre-processing; Segmentation; Morphological operators; Connected component; Projection profile; Noise removal; Devanagari