Article Information

  • Title: Near-Real-Time Parallel ETL+Q for Automatic Scalability in Bigdata
  • Authors: Pedro Martins; Maryam Abbasi; Pedro Furtado
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2016
  • Volume: 6
  • Issue: 1
  • Pages: 201-218
  • DOI: 10.5121/csit.2016.60118
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: In this paper we investigate the problem of providing scalability to the near-real-time ETL+Q (extract, transform, load, and querying) process of data warehouses. In general, data loading, transformation, and integration are heavy tasks that are performed only periodically during small fixed time windows. We propose an approach to enable the automatic scalability and freshness of any data warehouse and ETL+Q process for near-real-time BigData scenarios. A general framework for testing the proposed system was implemented, supporting parallelization solutions for each part of the ETL+Q pipeline. The results show that the proposed system is capable of scaling to provide the desired processing speed.
  • Keywords: Scalability; ETL; freshness; high-rate; performance; parallel processing; distributed systems; database; load-balance; algorithm