Journal: International Journal of Grid and Distributed Computing
Print ISSN: 2005-4262
Publication year: 2016
Volume: 9
Issue: 9
Pages: 403-418
DOI: 10.14257/ijgdc.2016.9.9.34
Publisher: SERSC
Abstract: With rapid progress in computational science and computer simulation, many properties can be predicted through parallel computation before actual research and development begins. As high-performance computer architectures have developed, the GPU has become increasingly widely used in high-performance computing as an emerging architecture, and a growing number of computations run on heterogeneous GPU clusters. However, how to partition the workload and map it onto computing resources has always been a central and difficult problem. In current GPU research, application programmers often do not know the computing power each node provides or the hardware architecture of the cluster, so some partitioning strategies lead to serious load imbalance. To address the complexity introduced by the differing computing abilities of the nodes in a GPU cluster, this paper proposes a learning-based GPU data partitioning strategy for heterogeneous clusters. It collects the state of each node while a program runs and then dynamically estimates each node's computing ability in order to guide data partitioning. Test results show that the strategy allocates tasks to nodes according to their computing ability to keep the load balanced among nodes, thereby improving the execution performance of CUDA programs on heterogeneous GPU clusters and laying a foundation for efficient computing on such clusters.
Keywords: GPU clusters; Load balancing; Data partitioning; Learning strategy
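
The abstract describes the strategy only in outline: observe each node while the program runs, estimate its computing ability, and size its data share accordingly. The following is a minimal sketch of that general idea, not the paper's algorithm. The function names, the use of items per second as the ability measure, the exponential moving average with weight `alpha`, and the largest-remainder rounding are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def update_estimates(
    estimates: Dict[str, float],
    observations: Dict[str, Tuple[int, float]],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend newly observed throughput into per-node estimates.

    observations maps node -> (items processed, elapsed seconds).
    The moving average is an assumed stand-in for the paper's
    learning step, which the abstract does not specify.
    """
    updated = dict(estimates)
    for node, (items, seconds) in observations.items():
        if seconds <= 0:
            continue  # skip unusable measurements
        rate = items / seconds
        previous: Optional[float] = updated.get(node)
        if previous is None:
            updated[node] = rate
        else:
            updated[node] = alpha * rate + (1 - alpha) * previous
    return updated


def partition(total_items: int, estimates: Dict[str, float]) -> Dict[str, int]:
    """Split total_items across nodes in proportion to estimated throughput."""
    total_rate = sum(estimates.values())
    if total_rate <= 0:
        raise ValueError("at least one node needs a positive estimate")
    exact = {n: total_items * r / total_rate for n, r in estimates.items()}
    shares = {n: int(x) for n, x in exact.items()}
    # Hand out the items lost to rounding, largest fractional part first.
    leftover = total_items - sum(shares.values())
    by_fraction = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for node in by_fraction[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    est = update_estimates({}, {"gpu0": (300, 1.0), "gpu1": (100, 1.0)})
    print(partition(1000, est))
```

In this sketch a node that processed three times as many items per second in the last round receives three times as much data in the next one, and repeated calls to `update_estimates` let the shares follow changes in node behaviour over time.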