Article Information

  • Title: An Approach to Skew Detection of Printed Documents
  • Authors: Darko Brodić; Carlos A. B. Mello; Čedomir A. Maluckov
  • Journal: Journal of Universal Computer Science
  • Print ISSN: 0948-6968
  • Year: 2014
  • Volume: 20
  • Issue: 4
  • Pages: 488-506
  • Publisher: Graz University of Technology and Know-Center
  • Abstract: In this paper, we propose an approach to estimating the text skew of printed documents. Skew estimation is an important step for preventing errors in later stages of an automatic document processing system, such as text segmentation. Our approach is based on the statistical analysis of the height of the connected components. In a nutshell, the algorithm comprises four steps: (i) removal of redundant data; (ii) establishment of the connected components, which represent filled convex hulls around each text element; (iii) enlargement of these components using morphological dilation; (iv) extraction of the largest connected component to obtain a first estimate of the text skew. Based on this first estimate, the connected components are enlarged by oriented morphological dilation and the longest of them is extracted. Statistical moments are applied to this longest component to evaluate its orientation, which identifies the global text skew of the document. At the end of the process, the original document is rotated back by the calculated angle. The performance of the proposed algorithm is examined by testing on a custom dataset. The results support the robustness of our approach.
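
The moment-based orientation step mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `orientation_from_moments` and `synthetic_line` are hypothetical, and the formula used is the standard orientation estimate from second-order central moments, which is assumed here to be the kind of statistical-moment analysis the abstract refers to. The angle is measured in array coordinates (x along columns, y along rows), so a positive value appears as a clockwise tilt when the image is displayed with the row axis pointing down.

```python
import numpy as np


def orientation_from_moments(mask):
    """Estimate the orientation (in degrees) of the foreground of a binary mask.

    Uses second-order central moments: theta = 0.5 * atan2(2*mu11, mu20 - mu02).
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask has no foreground pixels")
    # Centre the pixel coordinates on the component's centroid.
    x = cols - cols.mean()
    y = rows - rows.mean()
    mu20 = np.sum(x * x)
    mu02 = np.sum(y * y)
    mu11 = np.sum(x * y)
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return float(np.degrees(theta))


def synthetic_line(angle_deg, size=201, half_length=80):
    """Draw a thin line through the image centre (a stand-in for a text line)."""
    mask = np.zeros((size, size), dtype=bool)
    centre = size // 2
    t = np.linspace(-half_length, half_length, 4 * half_length + 1)
    a = np.radians(angle_deg)
    cols = np.rint(centre + t * np.cos(a)).astype(int)
    rows = np.rint(centre + t * np.sin(a)).astype(int)
    mask[rows, cols] = True
    return mask


if __name__ == "__main__":
    for true_angle in (-7.0, 0.0, 10.0):
        estimate = orientation_from_moments(synthetic_line(true_angle))
        print(f"true skew {true_angle:+.1f} deg, estimated {estimate:+.2f} deg")
```

In a full pipeline of the kind the abstract describes, the mask would be the longest connected component obtained after dilation rather than a synthetic line, and the document would then be rotated by the negative of the estimated angle.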