Basic Article Information

  • Title: AN UNSUPERVISED CLASSIFICATION TECHNIQUE FOR RECOGNITION OF SCRATCHED AND NON-SCRATCHED WORDS IN PRE-PRINTED DOCUMENTS
  • Authors: N. SHOBHA RANI; VASUDEV T; VINEETH P
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2016
  • Volume: 86
  • Issue: 2
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Pre-processing requirements vary more from one type of document image to another than any other stage of processing. Document images generally require more intensive pre-processing than other types of images, and pre-printed form images are one such category. Pre-processing these documents differs from pre-processing images that contain only simple text and no graphical components. This paper proposes a generic pre-processing algorithm adaptable to pre-printed application form images. The work focuses specifically on detecting and removing scratched-out words in the text, since these elements can be interpreted neither by humans nor by machines. The algorithm uses features such as Euler's number, the number of connected components, and the area covered by holes within a text block to detect scratched-out text blocks. The algorithm has yielded reasonably good results, with an overall efficacy of around 96.5%.
  • Keywords: Irrelevant Information; scratched words; non-scratched words; Morphological Operations; Pre-printed forms; unsupervised learning.
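
The three features named in the abstract (Euler's number, number of connected components, and area covered by holes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `label_regions` and `block_features`, the list-of-lists image representation (1 = ink, 0 = background), and the choice of 8-connectivity for ink and 4-connectivity for background are all assumptions made for illustration. The abstract does not say how the features are combined into a decision, so the sketch stops at feature extraction.

```python
from collections import deque

# Neighbourhood offsets: 8-connectivity for ink, 4-connectivity for background
# (an assumed, commonly used pairing; the source does not specify one).
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def label_regions(grid, value, neighbours):
    """Return connected regions of cells equal to `value` as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited seed cell.
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def block_features(grid):
    """Compute hypothetical per-block features for a binary text block."""
    rows, cols = len(grid), len(grid[0])
    components = label_regions(grid, 1, N8)
    background = label_regions(grid, 0, N4)
    # A hole is a background region that never touches the block border.
    holes = [
        region for region in background
        if not any(y in (0, rows - 1) or x in (0, cols - 1) for y, x in region)
    ]
    return {
        "components": len(components),
        "holes": len(holes),
        "hole_area": sum(len(h) for h in holes),
        # Euler number = connected components minus holes.
        "euler_number": len(components) - len(holes),
    }


# A ring-shaped glyph (like the letter "O"): one component enclosing one hole.
ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
features = block_features(ring)
```

A strike-through line drawn across a word tends to merge separate characters and enclose new background pockets, which is presumably why these particular features are informative. Any thresholds or clustering applied on top of them would need to come from the paper itself.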