Alternative abstract: In this work, the Valgrind tool suite is used to profile memory usage (Cachegrind) and to graph function calls (Callgrind) when running Code Saturne with OpenMP. The aim of this analysis is to detect possible bottlenecks in the code in order to improve performance in shared-memory environments. Additionally, tests were run using flat Message Passing Interface (MPI) parallelization to compare total memory usage between the two parallelization strategies. Finally, the code is compiled using Profile-Guided Optimization (PGO) with a representative set of workloads to evaluate whether this technique improves the application's performance.