I’m currently working on a thesis on alternative parallelization strategies, which involves evaluating them against existing solutions (such as OpenACC), mainly using matrix multiplication. Since I have no significant experience with OpenACC, I’m concerned that my OpenACC implementation would be poorly tuned and would put OpenACC at an unfair disadvantage in the comparison.
I would therefore like to ask what a good and “fair” implementation of matrix multiplication using OpenACC (and the PGI compilers) would look like.
Thanks very much in advance!
We ship an example OpenACC matrix multiply code with the compilers. See “$PGI/linux86-64/2019/examples/OpenACC/SDK/src/matrixMul”.
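For orientation, here is a minimal sketch of what a straightforward OpenACC matrix multiply might look like. This is not the shipped SDK example, and the function name `matmul_acc`, the square row-major layout, and the choice of clauses are my own assumptions for illustration. Treat it as a baseline starting point rather than a tuned implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the PGI SDK example):
 * computes C = A * B for square n x n matrices stored in row-major order.
 * Without OpenACC enabled, the pragmas are ignored and the loops run serially. */
void matmul_acc(int n, const float *restrict a, const float *restrict b,
                float *restrict c)
{
    assert(n > 0 && a != NULL && b != NULL && c != NULL);

    /* Copy the inputs to the device once and copy the result back once. */
    #pragma acc data copyin(a[0:n*n], b[0:n*n]) copyout(c[0:n*n])
    {
        /* Parallelize over all (i, j) output elements. */
        #pragma acc parallel loop collapse(2)
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) {
                float sum = 0.0f;
                /* Each output element accumulates its dot product sequentially. */
                #pragma acc loop seq
                for (int k = 0; k < n; ++k) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }
}
```

With the PGI compilers, something like `pgcc -acc -ta=tesla -Minfo=accel` should build this for an NVIDIA GPU, and the `-Minfo=accel` messages show how the compiler scheduled the loops. The `restrict` qualifiers matter because they tell the compiler the arrays do not alias. For a fair timing comparison, you would probably also want to decide whether the data transfers are included in the measured region.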
If you’re looking for a more robust set of applications, you might consider requesting a license for the SPEC ACCEL benchmark suite (https://www.spec.org/accel/). I’m part of the SPEC/HPG committee that developed the suite, and we recently made the license free for qualifying non-commercial use, including academic research. There are OpenCL, OpenACC, and OpenMP suites. The OpenACC and OpenMP suites share the same codes, just with different directives. The OpenCL suite consists mostly of different codes, but it does share a few with the other two.