Hi,
I’m testing nvc++ with various combinations of execution policy and -stdpar …
includes="$NVCOMPILERS/$NVARCH/20.5/compilers/include-stdpar"
compile="nvc++ -Wall -fast -I $includes"
$compile -o main_no_policy main.cpp
$compile -DPOLICY=std::execution::seq -o main_seq main.cpp
$compile -DPOLICY=std::execution::par_unseq -o main_par_unseq main.cpp
$compile -stdpar -DPOLICY=std::execution::par_unseq -o main_stdpar_unseq main.cpp
echo "Running serially (no policy)" && ./main_no_policy
echo "Running sequentially" && ./main_seq
echo "Running in parallel unseq without GPU acceleration" && ./main_par_unseq
echo "Running in parallel unseq with GPU acceleration" && ./main_stdpar_unseq
=======================================
Testing nvc++
=======================================
Running serially (no policy)
Elapsed time in nanoseconds : 4018807479 ns
Elapsed time in microseconds : 4018807 µs
Elapsed time in milliseconds : 4018 ms
Elapsed time in seconds : 4 sec
Running sequentially
Elapsed time in nanoseconds : 4005378936 ns
Elapsed time in microseconds : 4005378 µs
Elapsed time in milliseconds : 4005 ms
Elapsed time in seconds : 4 sec
Running in parallel unseq without GPU acceleration
Elapsed time in nanoseconds : 4005979476 ns
Elapsed time in microseconds : 4005979 µs
Elapsed time in milliseconds : 4005 ms
Elapsed time in seconds : 4 sec
Running in parallel unseq with GPU acceleration
Elapsed time in nanoseconds : 196931098 ns
Elapsed time in microseconds : 196931 µs
Elapsed time in milliseconds : 196 ms
Elapsed time in seconds : 0 sec
So with nvc++, if the execution policy is std::execution::par_unseq, the program still runs serially on the CPU (about 4 s, the same as with no policy or with seq) unless -stdpar is also specified, in which case it correctly runs on the GPU (about 197 ms).
g++ parallelises the same code across the CPU cores without any problem.
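For context, main.cpp is not reproduced in this post. Below is a minimal sketch of how a -DPOLICY=... macro like the one in the build commands could be wired into a standard algorithm. The std::sort workload and the names make_input, sort_with_policy and time_sort_ns are all hypothetical stand-ins, not the actual test code.

```cpp
// Hypothetical stand-in for main.cpp: the workload and all names are assumptions.
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>
#ifdef POLICY
#include <execution>  // only needed when a policy is passed via -DPOLICY=...
#endif

// Fill a heap-allocated vector with deterministic pseudo-random values (simple LCG).
std::vector<std::uint32_t> make_input(std::size_t n) {
    std::vector<std::uint32_t> v(n);
    std::uint32_t x = 12345u;
    for (auto& e : v) {
        x = x * 1664525u + 1013904223u;
        e = x;
    }
    return v;
}

// Sort with the policy chosen at compile time, or with the plain overload if none was given.
void sort_with_policy(std::vector<std::uint32_t>& v) {
#ifdef POLICY
    std::sort(POLICY, v.begin(), v.end());
#else
    std::sort(v.begin(), v.end());
#endif
}

// Time a single sort and return the elapsed wall-clock time in nanoseconds.
long long time_sort_ns(std::vector<std::uint32_t>& v) {
    const auto start = std::chrono::steady_clock::now();
    sort_with_policy(v);
    const auto stop = std::chrono::steady_clock::now();
    assert(std::is_sorted(v.begin(), v.end()));  // sanity check, outside the timed region
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

A main() that calls time_sort_ns on the result of make_input and prints the returned value would give an "Elapsed time in nanoseconds" line like the ones above. Guarding the policy behind #ifdef lets the same source build for all four variants in the script.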
Thanks,
Leigh.