Article Information

  • Title: Accelerating Stencil Computation on GPGPU by Novel Mapping Method Between the Global Memory and the Shared Memory
  • Authors: Mo, Tieqiang; Li, Renfa
  • Journal: COMPUTING AND INFORMATICS
  • Print ISSN: 1335-9150
  • Publication year: 2018
  • Volume: 37
  • Issue: 3
  • Pages: 533-552
  • Language: English
  • Publisher: COMPUTING AND INFORMATICS
  • Abstract: Acceleration of stencil computation can be effectively improved by utilizing the memory resource. In this paper, in order to reduce the branch divergence of the traditional mapping method between the global memory and the shared memory, we devise a new mapping mechanism in which the conditional statements loading the boundary stencil computation points in every XY-tile are removed by aligning the ghost zone to reduce the synchronization overhead. In addition, we make full use of a single XY-tile loaded into registers in every stencil computation point, common sub-expression elimination, and software prefetching to reduce overhead. Finally, a detailed performance evaluation demonstrates that our optimized policies are close to optimal in terms of memory bandwidth utilization and achieve higher performance of stencil computation.
  • Keywords: Parallel and distributed computing; memory mapping; GPGPU; stencil computation; ghost zones; 65Y05