Article Information

  • Title: Toward Semi-Supervised Graphical Object Detection in Document Images
  • Authors: Goutham Kallempudi; Khurram Azeem Hashmi; Alain Pagani
  • Journal: Future Internet
  • Electronic ISSN: 1999-5903
  • Publication year: 2022
  • Volume: 14
  • Issue: 6
  • Page: 176
  • DOI: 10.3390/fi14060176
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: Graphical page object detection classifies and localizes objects such as tables and figures in a document. As deep learning techniques for object detection become increasingly successful, many supervised deep neural network-based methods have been introduced to recognize graphical objects in documents. However, these models require a substantial amount of labeled data for training. To address this limitation, this paper presents an end-to-end semi-supervised framework for graphical object detection in scanned document images. Our method is based on the recently proposed Soft Teacher mechanism and examines the effects of a small percentage of labeled data on the classification and localization of graphical objects. On both the PubLayNet and the IIIT-AR-13K datasets, the proposed approach outperforms the supervised models by a significant margin at all labeling ratios (1%, 5%, and 10%). Furthermore, the 10% PubLayNet Soft Teacher model improves the average precision of Table, Figure, and List by +5.4, +1.2, and +3.2 points, respectively, with a total mAP similar to that of the Faster-RCNN baseline. Moreover, our model trained on 10% of the IIIT-AR-13K labeled data beats the previous fully supervised method by +4.5 points.