Journal: International Journal of Database Management Systems
Print ISSN: 0975-5985
Online ISSN: 0975-5705
Publication year: 2014
Volume: 6
Issue: 2
Page: 67
DOI: 10.5121/ijdms.2014.6205
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Data warehouses store integrated and consistent data in a subject-oriented data repository dedicated especially to supporting business intelligence processes. However, keeping these repositories updated usually involves complex and time-consuming processes, commonly known as Extract-Transform-Load (ETL) tasks. These data-intensive tasks normally execute in a limited time window, and their computational requirements tend to grow over time as more data is processed. Therefore, we believe that a grid environment could serve rather well as the backbone of the technical infrastructure, with the clear financial advantage of using desktop computers that the organization has already acquired. This article proposes a different approach to the distribution of ETL processes in a grid environment, taking into account not only the processing performance of its nodes but also the existing bandwidth, in order to estimate grid availability in the near future and thereby optimize workflow distribution.