Journal: International Journal of Database Management Systems
Print ISSN: 0975-5985
Online ISSN: 0975-5705
Year of publication: 2012
Volume: 4
Issue: 5
DOI: 10.5121/ijdms.2012.4509
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: The Data Warehouse Striping (DWS) technique is a data partitioning approach designed specifically for distributed data warehousing environments. In DWS, the fact tables are distributed across an arbitrary number of low-cost computers, and each query is executed in parallel by all of them, guaranteeing nearly optimal speedup and scale-up. Data loading in distributed data warehouses is typically a heavy process and calls for loading algorithms that reconcile a balanced distribution of data among nodes with efficient data allocation. Both aspects are fundamental to achieving low and uniform response times and, consequently, high performance during query execution. This paper proposes a generic approach for evaluating data distribution algorithms and assesses several alternative algorithms in the context of DWS. The experimental results show that effective loading of the nodes must consider complementary effects: minimizing the number of distinct keys of any large dimension in the fact tables in each node, as well as splitting correlated rows among the nodes.
Keywords: Data distribution; Data striping; Data warehousing; Performance
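
Illustrative note: the two quantities the abstract weighs against each other, row balance across nodes and the number of distinct dimension keys per node, can be measured with a few lines of Python. The sketch below is only an illustration under stated assumptions, not the algorithms evaluated in the paper: the function names, the `customer_id` column, the toy fact rows, and the two placement policies (round-robin and key modulo node count) are all hypothetical choices made for this example.

```python
# Toy fact table (hypothetical data): 12 rows, 6 distinct customer_id values,
# two consecutive rows per customer.
fact_rows = [{"customer_id": i // 2, "amount": 10 * i} for i in range(12)]


def round_robin(rows, n_nodes):
    """Place row i on node i % n_nodes (assumed baseline policy)."""
    nodes = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        nodes[i % n_nodes].append(row)
    return nodes


def hash_by_key(rows, n_nodes, key):
    """Place each row on node (key value % n_nodes); assumes integer keys."""
    nodes = [[] for _ in range(n_nodes)]
    for row in rows:
        nodes[row[key] % n_nodes].append(row)
    return nodes


def rows_per_node(nodes):
    """Row count on each node: a simple measure of load balance."""
    return [len(node) for node in nodes]


def distinct_keys_per_node(nodes, key):
    """Number of distinct values of `key` stored on each node."""
    return [len({row[key] for row in node}) for node in nodes]


if __name__ == "__main__":
    for name, nodes in [
        ("round_robin", round_robin(fact_rows, 3)),
        ("hash_by_key", hash_by_key(fact_rows, 3, "customer_id")),
    ]:
        print(name, rows_per_node(nodes),
              distinct_keys_per_node(nodes, "customer_id"))
```

On this toy data both policies leave the nodes equally loaded, but key-based placement stores fewer distinct keys on each node. It also keeps all rows of one customer on a single node, so if rows sharing a key are the correlated ones, it works against splitting correlated rows among the nodes; that tension is one way to read the "complementary effects" the abstract mentions.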