OpenACC Sample Codes

Hello,
I tested the sample codes, such as acc_c3, with both the C and Fortran compilers on two laptop GPUs (920M and 960M) and one Tesla K80 GPU. In all cases, the serial time is faster than the optimized time.
Note that pgaccelinfo reports the correct information for the GPU in use.
I do not know what the problem is.
The optimized times should have been the faster ones.
Has anyone faced a similar issue?

Best Regards

Hi Mohbay,

Those are just very simple examples and are not meant to show a performance speed-up, so what you are seeing is expected.

As you port codes with more parallelism, more compute, and larger workloads, you will start to see a speed-up versus the CPU.
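
As a rough illustration of that point, here is a minimal sketch of a loop with more work per byte transferred. It is not one of the shipped samples: the function name `saxpy_repeat` and its parameters are hypothetical. The idea is to copy the arrays to the GPU once with a data region and then run many kernels on them, so that transfer and launch overhead is spread over more compute.

```c
/* Hypothetical example (not one of the shipped samples):
   applies y = a*x + y to every element, repeated `iters` times. */
void saxpy_repeat(int n, int iters, float a, const float *x, float *y)
{
    /* Data region: x is copied to the device once, y is copied in
       once and back out once, however many kernels run inside. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        for (int it = 0; it < iters; ++it) {
            /* Each pass is one GPU kernel; iterations are independent. */
            #pragma acc parallel loop
            for (int i = 0; i < n; ++i) {
                y[i] = a * x[i] + y[i];
            }
        }
    }
}
```

With a small `n` and a single pass, this would likely still be slower on the GPU than on the CPU, just like the samples. The gap should narrow and then reverse as `n` and `iters` grow. With the PGI compilers, adding `-acc -Minfo=accel` shows which loops were actually offloaded. A compiler without OpenACC support ignores the pragmas and runs the loop serially.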

-Mat

Hi Mat,

Thanks a lot.

Best Regards