Abstract: Current microprocessors incorporate techniques to exploit instruction-level parallelism (ILP). However, previous work has shown that these ILP techniques are less effective in removing memory stall time than CPU time, making the memory system a greater bottleneck in ILP-based systems than in previous-generation systems. These deficiencies arise largely because applications present limited opportunities for an out-of-order issue processor to overlap multiple read misses, the dominant source of memory stalls. This work proposes code transformations to increase parallelism in the memory system by overlapping multiple read misses within the same instruction window, while preserving cache locality. We present an analysis and transformation framework suitable for compiler implementation. Our simulation experiments show execution time reductions averaging 20% in a multiprocessor and 30% in a uniprocessor. A substantial part of these reductions comes from increases in memory parallelism. We see similar benefits on a Convex Exemplar.