Abstract: We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation benefits from the high memory bandwidth of the GPU because it is a "memory-intensive" problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45-fold faster than that of a single core of the host central processing unit (CPU). We confirmed that optimized use of the fast shared memory and registers was essential for this performance.
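To illustrate the shared-memory optimization, the following is a minimal CUDA sketch of a fourth-order finite-difference kernel; it is not the code used in this work, and the kernel name, block size, stencil radius, and boundary treatment are illustrative assumptions. Each thread block stages a tile of the wavefield in shared memory so that neighboring threads reuse global loads, while the difference coefficients stay in registers.

#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

#define RADIUS 2   /* stencil half-width (assumed) */
#define BLOCK 256  /* threads per block (assumed) */

/* Hypothetical 1-D fourth-order derivative kernel: each block stages a
   tile of u in shared memory so neighboring threads reuse global loads. */
__global__ void diff_x(const float *u, float *du, int n, float inv_dx)
{
    __shared__ float s[BLOCK + 2 * RADIUS];
    int i  = blockIdx.x * blockDim.x + threadIdx.x;  /* global index   */
    int si = threadIdx.x + RADIUS;                   /* index in tile  */

    /* Stage interior point and halos; out-of-range reads clamp to zero
       (an assumed boundary treatment, for illustration only). */
    s[si] = (i < n) ? u[i] : 0.0f;
    if (threadIdx.x < RADIUS) {
        int l = i - RADIUS, r = i + BLOCK;
        s[si - RADIUS] = (l >= 0 && l < n) ? u[l] : 0.0f;
        s[si + BLOCK]  = (r < n) ? u[r] : 0.0f;
    }
    __syncthreads();

    if (i < n) {
        const float c1 = 8.0f / 12.0f, c2 = 1.0f / 12.0f; /* in registers */
        du[i] = (c1 * (s[si + 1] - s[si - 1])
               - c2 * (s[si + 2] - s[si - 2])) * inv_dx;
    }
}

int main()
{
    const int n = 1 << 20;
    float *u, *du;
    cudaMallocManaged(&u, n * sizeof(float));
    cudaMallocManaged(&du, n * sizeof(float));
    for (int i = 0; i < n; ++i) u[i] = sinf(0.001f * i);
    diff_x<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(u, du, n, 1.0f / 0.001f);
    cudaDeviceSynchronize();
    printf("du[100] = %f (expect ~cos(0.1) = %f)\n", du[100], cosf(0.1f));
    cudaFree(u);
    cudaFree(du);
    return 0;
}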
In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory alignment of the ghost zones was found to make the data transfer between the GPU and the host node quite time-consuming. This problem was solved by packing the ghost zones into contiguous memory buffers.
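A minimal sketch of the contiguous-buffer idea is given below, assuming a 3-D subdomain stored with the z index fastest; the kernel name, grid size, and launch configuration are illustrative, not the implementation used in this work. A small kernel gathers the strided ghost plane into a contiguous device buffer, so the device-to-host transfer becomes one large cudaMemcpy instead of many small strided copies.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

/* Hypothetical pack kernel: gather one strided ghost plane (fixed z) of a
   3-D field stored z-fastest into a contiguous buffer. */
__global__ void pack_z_plane(const float *u, float *buf,
                             int nx, int ny, int nz, int iz)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x; /* index in the plane */
    if (idx < nx * ny)
        buf[idx] = u[(size_t)idx * nz + iz];         /* strided gather */
}

int main()
{
    const int nx = 64, ny = 64, nz = 64;             /* assumed subdomain size */
    const int face = nx * ny;
    size_t vol = (size_t)nx * ny * nz;

    float *d_u, *d_buf, *h_buf = (float *)malloc(face * sizeof(float));
    cudaMalloc(&d_u,   vol  * sizeof(float));
    cudaMalloc(&d_buf, face * sizeof(float));
    cudaMemset(d_u, 0, vol * sizeof(float));

    /* Pack the iz = nz-1 ghost plane, then copy it in one contiguous transfer. */
    pack_z_plane<<<(face + 255) / 256, 256>>>(d_u, d_buf, nx, ny, nz, nz - 1);
    cudaMemcpy(h_buf, d_buf, face * sizeof(float), cudaMemcpyDeviceToHost);

    printf("first ghost value: %f\n", h_buf[0]);
    cudaFree(d_u);
    cudaFree(d_buf);
    free(h_buf);
    return 0;
}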
We achieved a performance of about 2.2 TFlops using 120 GPUs and 330 GB of total memory; nearly (or more than) 2200 host CPU cores would be required to achieve the same performance. The weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, enabling faster simulation with fewer computational resources than CPUs.