Article information

  • Title: Arabic Document Classification by Deep Learning
  • Authors: Taghreed Alghamdi; Samia Snoussi; Lobna Hsairi
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Publication year: 2021
  • Volume: 12
  • Issue: 10
  • DOI: 10.14569/IJACSA.2021.0121034
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: In this paper, we show how to classify Arabic document images using a convolutional neural network, one of the most common supervised deep learning algorithms. The main motivation for using deep learning is its ability to extract useful features from images automatically, which eliminates the need for manual feature extraction. Convolutional neural networks extract features from images through a convolution process involving various filters. We collected a variety of Arabic document images from various sources and passed them into a convolutional neural network classifier. We adopt a VGG16 network pre-trained on ImageNet to classify a dataset of four classes: handwritten, historical, printed, and signboard. For the document image classification, we used the VGG16 convolutional layers, ran the dataset through them, and then trained a classifier on top of them. We extract features by freezing the pre-trained network's convolutional layers, then add fully connected layers and train them on the dataset. We also add dropout after each max-pooling layer and at the fourteenth and seventeenth layers, which are the fully connected layers. The proposed approach achieved a classification accuracy of 92%.
  • Keywords: Arabic document; document classification; deep learning; convolutional neural network (CNN); pre-trained network
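
The feature-extraction setup described in the abstract (frozen VGG16 convolutional layers with a trainable fully connected classifier on top) can be illustrated with a small parameter-count sketch. It is not the authors' code. The block layout is the standard VGG16 configuration; the 224x224 RGB input size and the 256-unit hidden layer are assumptions, since the abstract does not state them. Dropout layers are left out of the count because they add no parameters.

```python
# Standard VGG16 convolutional base: (number of 3x3 conv layers, output channels)
# per block. Each block ends with a 2x2 max-pooling layer.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def conv_base_summary(input_size=224, in_channels=3):
    """Return (parameter count, feature-map side, channels) of the conv base."""
    params = 0
    side, channels = input_size, in_channels
    for n_convs, out_channels in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel
            params += 3 * 3 * channels * out_channels + out_channels
            channels = out_channels
        side //= 2  # 2x2 max pooling with stride 2 halves height and width
    return params, side, channels


def head_params(n_features, hidden_units, n_classes):
    """Parameters of a two-layer fully connected classifier head."""
    hidden = n_features * hidden_units + hidden_units
    output = hidden_units * n_classes + n_classes
    return hidden + output


CLASSES = ["handwritten", "historical", "printed", "signboard"]  # from the abstract
HIDDEN_UNITS = 256  # assumption: hypothetical head width, not given in the abstract

frozen, side, channels = conv_base_summary()  # assumption: 224x224 RGB input
n_features = side * side * channels           # flattened conv output
trainable = head_params(n_features, HIDDEN_UNITS, len(CLASSES))

print(f"frozen conv-base parameters: {frozen}")
print(f"flattened feature vector:    {n_features} ({side}x{side}x{channels})")
print(f"trainable head parameters:   {trainable}")
```

Under these assumptions, only the head's parameters are updated during training, while the convolutional weights learned on ImageNet stay fixed and serve purely as a feature extractor.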