
Basic Article Information

  • Title: Heuristic Algorithm for Automatic Extraction Relational Data from Spreadsheet Hierarchical Tables
  • Authors: Arwa Awad; Rania Elgohary; Ibrahim Moawad
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2021
  • Volume: 12
  • Issue: 10
  • DOI: 10.14569/IJACSA.2021.0121082
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: Spreadsheets contain critical information on a wide range of topics and are widely used in numerous domains, with a huge number of spreadsheet users around the world. Spreadsheets offer considerable flexibility in organizing data structures, and they give their creators a great deal of freedom to encode data, because they are simple to use and make it easy to store data in a table format. Because of this flexibility, tables with very complex, hierarchical data structures can be generated. Such complexity makes processing these tables and reusing their data a difficult task. Consequently, the growth in the volume and complexity of these tables has created a need to preserve this data and reuse it. This paper therefore implements a novel algorithm, based on a heuristic technique and a cell classification strategy, that automates relational data extraction from hierarchical spreadsheet tables without requiring any programming experience. Finally, the paper reports experiments on two different real public datasets. The average accuracy of the proposed approach on the two datasets is 95% and 94.2%, respectively.
  • Keywords: Spreadsheet table analysis; hierarchical table structure; cell classification; heuristic algorithm; relational data extraction