Article Information

  • Title: An Efficient Algorithm to Find the Height of a Text Line and Overcome Overlapped and Broken Line Problem during Segmentation
  • Authors: Sanjibani Sudha Pattanayak; Sateesh Kumar Pradhan; Ramesh Chandra Mallik
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Year: 2019
  • Volume: 10
  • Issue: 12
  • Pages: 537-541
  • Publisher: Science and Information Society (SAI)
  • Abstract: Line segmentation is a critical phase of Optical Character Recognition (OCR) that separates individual text lines from document images. The accuracy of an OCR tool depends directly on the accuracy of line segmentation and of the word and character segmentation that follows it. In this context, an algorithm named height_based_segmentation is proposed for text line segmentation of printed Odia documents. The algorithm finds the average height of a text line, which helps to minimize cases of overlapped text lines. It also includes post-processing steps that combine the modifier zone with the base zone. Its performance is evaluated against ground truth and by comparison with existing segmentation approaches.
  • Keywords: Document image analysis; line segmentation; word segmentation; database creation; printed Odia document
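
The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea and not the authors' height_based_segmentation algorithm. It assumes a binarized page given as a list of rows with 1 for ink and 0 for background, uses a horizontal projection profile to find candidate bands, and takes the median band height as the typical line height (the abstract speaks of an average height; the median is a substitution made here for robustness). The function names and the `short_ratio` and `tall_ratio` thresholds are hypothetical.

```python
from statistics import median


def row_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def find_bands(profile):
    """Return (start, end) row ranges, end exclusive, where the profile is non-zero."""
    bands, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def split_tall(bands, profile, typical, tall_ratio):
    """Split each overly tall band once, at its sparsest interior row.

    A band much taller than the typical height is treated as two
    overlapped lines (an assumption of this sketch).
    """
    out = []
    for start, end in bands:
        height = end - start
        if height > tall_ratio * typical and height >= 3:
            cut = min(range(start + 1, end - 1), key=lambda y: profile[y])
            out.extend([(start, cut), (cut, end)])
        else:
            out.append((start, end))
    return out


def merge_short(bands, typical, short_ratio):
    """Merge each short band (e.g. a detached modifier zone) into its nearest neighbour."""
    bands = list(bands)
    i = 0
    while i < len(bands) and len(bands) > 1:
        start, end = bands[i]
        if end - start >= short_ratio * typical:
            i += 1
            continue
        gap_prev = start - bands[i - 1][1] if i > 0 else float("inf")
        gap_next = bands[i + 1][0] - end if i + 1 < len(bands) else float("inf")
        if gap_prev <= gap_next:
            bands[i - 1:i + 1] = [(bands[i - 1][0], end)]
        else:
            bands[i:i + 2] = [(start, bands[i + 1][1])]
    return bands


def segment_lines(image, short_ratio=0.5, tall_ratio=1.5):
    """Return (start, end) row ranges, end exclusive, one per estimated text line."""
    profile = row_profile(image)
    bands = find_bands(profile)
    if not bands:
        return []
    typical = median(end - start for start, end in bands)
    bands = split_tall(bands, profile, typical, tall_ratio)
    return merge_short(bands, typical, short_ratio)
```

Calling `segment_lines(image)` on a binarized page returns one row range per estimated line. Splitting happens before merging so that a sparse row inside an overlapped band is used as a cut point rather than being absorbed, and both thresholds would need tuning for real scans.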