I am trying to accelerate some of the loops in my code. I have inserted the #pragma before the loop and the code compiles fine using:
pgc++ -fast -Minfo=all,ccff -ta=tesla:cc5 *.cpp
I have a GTX 960M, so I believe the compile flags are correct.
I have two accelerators available to me:
00:02.0 VGA compatible controller : Intel Corporation Skylake Integrated Graphics [8086:191b] (rev 06)
01:00.0 3D controller : NVIDIA Corporation GM107M [GeForce GTX 960M] [10de:139b] (rev a2)
When I run the executable I get the following error:
========= Program hit CUDA_ERROR_INVALID_DEVICE (error 101) due to "invalid device ordinal" on CUDA API call to cuDevicePrimaryCtxRetain.
========= Saved host backtrace up to driver entry point at error
I can only assume that it is trying to run the code generated for the GTX 960M on the integrated graphics accelerator. How do I specify which accelerator to use? I assumed it would be selected automatically, but apparently not.
Many thanks for any insights.