Hi,
I tested the NGC Docker version of HPC SDK 20.11 + Ubuntu 20.04.
With the -stdpar=multicore option, both the results and the speed matched the OpenMP version.
However, with -stdpar=gpu, the results diverged.
According to the compiler's output, the DO loop inside the DO CONCURRENT is also parallelized and offloaded to the GPU. Is there any way to prevent this?
Regards,
Con