Journal: Journal of King Saud University - Computer and Information Sciences
Print ISSN: 1319-1578
Publication year: 2022
Volume: 34
Issue: 6
Pages: 3864-3877
DOI: 10.1016/j.jksuci.2020.05.011
Language: English
Publisher: Elsevier
Abstract: In modern NUMA systems, the increasing number of cores leads to heavy congestion on shared caches and memory controllers. In this paper, we therefore propose a method (RCLB) that reduces memory congestion while reconciling the locality of communication. It prevents load imbalance on the memory controllers by identifying the data traffic passing through clusters from MPI communication patterns. The proposed method works at the run time of an MPI application and requires only a small modification to the profiling technique. We tested it on the kernels of the NAS Parallel Benchmarks and found it to be state-of-the-art. The experimental results show that the proposed method achieves a bandwidth improvement ranging from 12.65% to 23% in memory-based tests (CLB).
Keywords: High-performance computing; MPI process placement; Memory congestion; NUMA architecture; Multi-cores; MPI-Routines