Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2012
Volume: 42
Issue: 2
Pages: 156-165
Publisher: Journal of Theoretical and Applied
Abstract: In recent years, graphics processing units (GPUs) have attracted the attention of many application developers as powerful massively parallel systems. Compute Unified Device Architecture (CUDA), a general-purpose parallel computing architecture, makes GPUs an appealing choice for solving many complex computational problems more efficiently. Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in scientific computing. In this paper, we propose two parallel SpMV algorithms, CSR-M, based on the CSR format, and ELLPACK-R, based on the ELLPACK format, and implement them as parallel kernels on the GPU with CUDA. We discuss how to implement and optimize SpMV on GPUs using the CUDA programming model. The optimization strategies include thread mapping, coalesced memory access, data reuse, branch avoidance, and thread block optimization. The experimental results show that the proposed optimization strategies improve performance and memory bandwidth and reduce kernel execution time.
Keywords: Sparse Matrix-vector Multiplication; Compute Unified Device Architecture; Graphics Processing Unit
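
The abstract names two storage formats without showing them, so a small sequential sketch may help. It computes y = A·x from a CSR matrix and from an ELLPACK-style layout that also stores each row's length, which is the general idea behind ELLPACK-R. This is a CPU reference in plain Python, not the paper's CUDA kernels. All function and variable names here are illustrative assumptions, as is the column-major padding layout. The CSR-M variant is not sketched because the abstract does not describe it.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """y = A @ x for A in CSR form (row_ptr has n_rows + 1 entries)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_to_ellpack_r(row_ptr, col_idx, vals):
    """Convert CSR to a padded ELLPACK layout plus a row-length array.

    Assumed layout: column-major, so entry j of row i sits at j * n_rows + i.
    On a GPU this lets consecutive threads (one per row) read consecutive
    addresses; here it only affects indexing.
    """
    n_rows = len(row_ptr) - 1
    row_len = [row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)]
    width = max(row_len, default=0)
    ell_vals = [0.0] * (width * n_rows)  # padding slots stay zero
    ell_cols = [0] * (width * n_rows)
    for i in range(n_rows):
        for j in range(row_len[i]):
            k = row_ptr[i] + j
            ell_vals[j * n_rows + i] = vals[k]
            ell_cols[j * n_rows + i] = col_idx[k]
    return ell_vals, ell_cols, row_len


def ellpack_r_spmv(ell_vals, ell_cols, row_len, x):
    """y = A @ x using the padded layout; row_len[i] bounds the inner loop,
    so padding entries are never read."""
    n_rows = len(row_len)
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for j in range(row_len[i]):
            idx = j * n_rows + i
            acc += ell_vals[idx] * x[ell_cols[idx]]
        y[i] = acc
    return y


# Example matrix:
#   [1 0 2]
#   [0 3 0]
#   [4 5 6]
row_ptr = [0, 2, 3, 6]
col_idx = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0]

y_csr = csr_spmv(row_ptr, col_idx, vals, x)
ell_vals, ell_cols, row_len = csr_to_ellpack_r(row_ptr, col_idx, vals)
y_ell = ellpack_r_spmv(ell_vals, ell_cols, row_len, x)
```

In a GPU version, each iteration of the outer loop over rows would typically become one thread. The row-length array then lets each thread stop at its own row's last nonzero instead of testing padded entries.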