Journal: International Journal of Grid and Distributed Computing
Print ISSN: 2005-4262
Year of publication: 2015
Volume: 8
Issue: 4
Pages: 249-256
DOI: 10.14257/ijgdc.2015.8.4.24
Publisher: SERSC
Abstract: Today, map-reduce programs are commonly generated by higher-level programming systems, which are often implementations of SQL. In this paper, we propose an optimized method for translating SQL queries into map-reduce tasks that distributed computing systems can execute more efficiently. Our method introduces an optimization strategy that is applied while merging the query-plan-tree, an intermediate data structure derived from the SQL abstract-syntax-tree. This considerably reduces the number of generated map-reduce tasks and improves the efficiency of the map-reduce implementation. Experimental evaluations using TPC-H tests show that our method significantly outperforms Hive in terms of reliability and efficiency.
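
Illustrative sketch (not taken from the paper): the abstract does not spell out the merging rule, so the Python below assumes one simple, commonly used rule purely for illustration. An operator that shuffles on the same key as its parent is folded into the parent's map-reduce job. The names PlanNode, count_jobs_naive, and count_jobs_merged, and the example keys, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical query-plan-tree node (not the paper's data structure)."""
    op: str                      # operator name, e.g. "scan", "join", "aggregate"
    key: Optional[str] = None    # shuffle key; None means no shuffle is needed
    children: List["PlanNode"] = field(default_factory=list)


def count_jobs_naive(node: PlanNode) -> int:
    """Baseline: every operator that needs a shuffle becomes its own job."""
    own = 1 if node.key is not None else 0
    return own + sum(count_jobs_naive(c) for c in node.children)


def count_jobs_merged(node: PlanNode, parent_key: Optional[str] = None) -> int:
    """Assumed merge rule: a node that shuffles on its parent's key reuses
    the parent's job instead of starting a new one."""
    own = 1 if node.key is not None and node.key != parent_key else 0
    # Operators without a shuffle pass the enclosing job's key through.
    effective_key = node.key if node.key is not None else parent_key
    return own + sum(count_jobs_merged(c, effective_key) for c in node.children)


# Hypothetical plan: sort(total) <- aggregate(o_orderkey) <- join(o_orderkey)
plan = PlanNode("sort", "total", [
    PlanNode("aggregate", "o_orderkey", [
        PlanNode("join", "o_orderkey", [PlanNode("scan"), PlanNode("scan")]),
    ]),
])

naive_jobs = count_jobs_naive(plan)
merged_jobs = count_jobs_merged(plan)
print(naive_jobs, merged_jobs)
```

In this toy plan the join and the aggregate share a shuffle key, so under the assumed rule they collapse into one job while the sort still needs its own. The paper's actual strategy may use different or additional criteria.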