Journal: ELCVIA: electronic letters on computer vision and image analysis
Print ISSN: 1577-5097
Publication year: 2013
Volume: 12
Issue: 1
Pages: 1-16
DOI: 10.5565/rev/elcvia.508
Language: English
Publisher: Centre de Visió per Computador
Abstract: This paper provides an extensive analysis of the runtime, accuracy and noise of High-Performance Computing (HPC) frameworks for Computed Tomography (CT) reconstruction tasks: "conventional" multi-core, multi-threaded CPUs, the Compute Unified Device Architecture (CUDA) on GPUs, and the graphics pipeline of GPUs as exposed by the DirectX or OpenGL programming interfaces, which exploits built-in hardwired features such as rasterization and texture filtering. We compare implementations of the Filtered Back-Projection (FBP) algorithm with fan-beam geometry on all these HPC frameworks. Specifically, an ACR-accredited phantom is reconstructed from the raw attenuation data acquired by a clinical CT scanner. Our analysis shows that a single GPU can run the FBP algorithm for reconstructing a 1024 × 1024 image considerably faster than a 64-core, multi-threaded CPU machine. Moreover, employing the graphics pipeline further increases performance compared to CUDA, albeit with slightly lower accuracy due to "fast math" operations.