Journal: International Journal of Distributed and Parallel Systems
Print ISSN: 2229-3957
Electronic ISSN: 0976-9757
Publication year: 2016
Volume: 7
Issue: 5
Page: 1
DOI: 10.5121/ijdps.2016.7501
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: This paper studies the performance and energy consumption of several multi-core, multi-CPU and many-core hardware platforms and software stacks for parallel programming. It uses the Multimedia Multiscale Parser (MMP), a computationally demanding image encoder application, which was ported to several hardware and software parallel environments as a benchmark. Hardware-wise, the study assesses NVIDIA's Jetson TK1 development board, the Raspberry Pi 2, and a dual Intel Xeon E5-2620/v2 server, as well as NVIDIA's discrete GPUs GTX 680, Titan Black Edition and GTX 750 Ti. The assessed parallel programming paradigms are OpenMP, Pthreads and CUDA, and a single-thread sequential version, all running in a Linux environment. While the CUDA-based implementation delivered the fastest execution, the Jetson TK1 proved to be the most energy efficient platform, regardless of the parallel software stack used. Although it has the lowest power demand, the Raspberry Pi 2's energy efficiency is hindered by its lengthy execution times, effectively consuming more energy than the Jetson TK1. Surprisingly, OpenMP delivered twice the performance of the Pthreads-based implementation, proving the maturity of the tools and libraries supporting OpenMP.
Keywords: CUDA; OpenMP; Pthreads; multi-core; many-core; high performance computing; energy consumption