Article Information

Title: Elastic Performance For ETL+Q Processing
Authors: Pedro Martins; Maryam Abbasi; Pedro Furtado et al.
Journal: International Journal of Database Management Systems
Print ISSN: 0975-5985
Online ISSN: 0975-5705
Year: 2016
Volume: 8
Issue: 1
Page: 13
DOI: 10.5121/ijdms.2016.8102
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Most data warehouse deployments are not prepared to scale automatically, although some applications have large or increasing requirements concerning data volume, processing times, data rates, freshness, and the need for fast responses. The solution is to use parallel architectures and mechanisms to speed up data integration and to handle fresh data efficiently. Those parallel approaches should scale automatically. In this work, we investigate how to provide scalability and data freshness automatically, and how to manage high-rate data efficiently in very large data warehouses. The framework proposed in this work handles parallelization and scaling of the data warehouse when necessary. It not only scales out to increase processing capacity, but also scales in when resources are underused. In general, data freshness is also not guaranteed in those contexts, because data loading, transformation, and integration are heavy tasks that are done only periodically, instead of row by row. The framework we propose is designed to provide data freshness as well.
Keywords: Scalability; ETL; freshness; high-rate; performance; parallel processing; distributed systems; database; load-balance; algorithm