Article Information

Title: A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
Authors: Joseph D. Garvey; Tarek S. Abdelrahman
Journal: Scientific Programming
Print ISSN: 1058-9244
Year of publication: 2018
Volume: 2018
DOI: 10.1155/2018/6093054
Publisher: Hindawi Publishing Corporation
Abstract: We propose and evaluate a novel strategy for tuning the performance of a class of stencil computations on Graphics Processing Units. The strategy uses a machine learning model to predict the optimal way to load data from memory followed by a heuristic that divides other optimizations into groups and exhaustively explores one group at a time. We use a set of 104 synthetic OpenCL stencil benchmarks that are representative of many real stencil computations. We first demonstrate the need for auto-tuning by showing that the optimization space is sufficiently complex that simple approaches to determining a high-performing configuration fail. We then demonstrate the effectiveness of our approach on NVIDIA and AMD GPUs. Relative to a random sampling of the space, we find configurations that are 12%/32% faster on the NVIDIA/AMD platform in 71% and 4% less time, respectively. Relative to an expert search, we achieve 5% and 9% better performance on the two platforms in 89% and 76% less time. We also evaluate our strategy for different stencil computational intensities, varying array sizes and shapes, and in combination with expert search.
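Illustrative sketch: the abstract describes a heuristic that divides optimizations into groups and exhaustively explores one group at a time. The Python below is a minimal sketch of that general idea only, not the authors' implementation. The parameter names (`block_x`, `block_y`, `unroll`, `coarsen`), their values, the grouping, and the `toy_cost` function are all hypothetical. The `load` entry is a fixed placeholder standing in for the memory-loading choice that, per the abstract, a machine learning model would predict.

```python
from itertools import product


def groupwise_search(groups, cost, start):
    """Tune one parameter group at a time, exhaustively within each group.

    groups: list of dicts mapping parameter name -> list of candidate values
    cost:   function taking a configuration dict, returning a number (lower is better)
    start:  initial configuration dict covering every parameter
    """
    best = dict(start)
    best_cost = cost(best)
    evaluations = 1
    for group in groups:
        names = list(group)
        # Try every combination of values inside this group while the
        # parameters of all other groups stay at their current best values.
        for values in product(*(group[n] for n in names)):
            candidate = {**best, **dict(zip(names, values))}
            c = cost(candidate)
            evaluations += 1
            if c < best_cost:
                best, best_cost = candidate, c
    return best, best_cost, evaluations


# Hypothetical optimization groups (assumption, not taken from the paper).
GROUPS = [
    {"block_x": [8, 16, 32], "block_y": [1, 2, 4]},
    {"unroll": [1, 2, 4], "coarsen": [1, 2]},
]

# "load" is a placeholder for the model-predicted loading strategy.
START = {"load": "local_memory", "block_x": 8, "block_y": 1,
         "unroll": 1, "coarsen": 1}


def toy_cost(cfg):
    """Synthetic stand-in for a measured kernel run time."""
    return (abs(cfg["block_x"] - 16) + abs(cfg["block_y"] - 2)
            + abs(cfg["unroll"] - 4) + abs(cfg["coarsen"] - 2))


if __name__ == "__main__":
    best, best_cost, evaluations = groupwise_search(GROUPS, toy_cost, START)
    print(best, best_cost, evaluations)
```

With these hypothetical groups the search evaluates 16 configurations (1 starting point, 9 in the first group, 6 in the second), whereas an exhaustive search over all four parameters would evaluate 54. The toy cost is separable across groups, so the group-wise search reaches the global optimum here; on real kernels, interactions between groups mean this is not guaranteed.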