
Article Information

  • Title: Optimizing SMT Processors for High Single-Thread Performance
  • Authors: Gautham Thambidorai; Donald Yeung; Seungryul Choi
  • Journal: The Journal of Instruction-Level Parallelism
  • Electronic ISSN: 1942-9525
  • Year of publication: 2003
  • Volume: 5
  • Pages: 1-35
  • Publisher: International Symposium on Microarchitecture
  • Abstract: Simultaneous Multithreading (SMT) processors achieve high processor throughput at the expense of single-thread performance. This paper investigates resource allocation policies for SMT processors that preserve, as much as possible, the single-thread performance of designated "foreground" threads, while still permitting other "background" threads to share resources. Since background threads on such an SMT machine have a near-zero performance impact on foreground threads, we refer to the background threads as transparent threads. Transparent threads are ideal for performing low-priority or non-critical computations, with applications in process scheduling, subordinate multithreading, and on-line performance monitoring. To realize transparent threads, we propose three mechanisms for maintaining the transparency of background threads: slot prioritization, background thread instruction-window partitioning, and background thread flushing. In addition, we propose three mechanisms to boost background thread performance without sacrificing transparency: aggressive fetch partitioning, foreground thread instruction-window partitioning, and foreground thread flushing. We implement our mechanisms on a detailed simulator of an SMT processor, and evaluate them using 8 benchmarks, including 7 from the SPEC CPU2000 suite. Our results show that when cache and branch predictor interference are factored out, background threads introduce less than 1% performance degradation on the foreground thread. Furthermore, maintaining the transparency of background threads reduces their throughput by only 23% relative to an equal priority scheme. To demonstrate the usefulness of transparent threads, we study Transparent Software Prefetching (TSP), an implementation of software data prefetching using transparent threads. Due to its near-zero overhead, TSP enables prefetch instrumentation for all loads in a program, eliminating the need for profiling. TSP, without any profile information, achieves a 9.41% gain across 6 SPEC benchmarks, whereas conventional software prefetching guided by cache-miss profiles increases performance by only 2.47%.