3 Versatile OpenACC Interoperability Techniques

Originally published at: https://developer.nvidia.com/blog/3-versatile-openacc-interoperability-techniques/

OpenACC is a high-level programming model for accelerating applications with GPUs and other devices using compiler directives compiler directives to specify loops and regions of code in standard C, C++ and Fortran to offload from a host CPU to an attached accelerator. OpenACC simplifies accelerating applications with GPUs. An often-overlooked feature…

FWIW: "... using compiler directives compiler directives to specify ..."
It happens.

Hey Jeff,

When I try to build the sources that you provided on GitHub, the first few build without any errors. But for "openacc_cuda_device", I get the following error:

pgc++ -o openacc_cuda_device -fast -acc -ta=nvidia:rdc -Minfo=accel saxpy_cuda_device.o openacc_cuda_device.o -Mcuda
nvlink fatal : Input file 'saxpy_cuda_device.o' newer than toolkit
pgacclnk: child process exit status 2: /opt/pgi/linux86-64/16.5/bin/pgnvd

What do you think? What could be wrong?
nvcc version: Cuda compilation tools, release 7.5, V7.5.17
pgi version: pgcc 16.5-0 64-bit target on x86-64 Linux -tp nehalem


I'm seeing the same thing. In my case, I was able to build by changing -Mcuda to -Mcuda=7.5.

Thanks Jeff.
Setting -Mcuda to 7.5 makes the error go away. I didn't know you could give -Mcuda a version like 7.5. The PGI compiler doesn't say anything about this in its HELP output.

Let me add here a method to call CUDA device functions from OpenACC kernels:

Nice technique! Thanks for sharing it, Dmitry.
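
For anyone following along, here is a minimal sketch of the OpenACC side of calling a device function from a compute region. It is not the code posted in this thread, only an illustration of the general pattern. The names `saxpy_elem` and `saxpy_acc` are hypothetical. The sketch assumes that in a real interoperability build the body of `saxpy_elem` would live in a separate CUDA file as an `extern "C" __device__` function, with only the `acc routine seq` declaration left here, and that both sides are built with relocatable device code (the `rdc` option in the link line quoted earlier). A plain C body stands in for the device function so the file also compiles with an ordinary C compiler, which simply ignores the pragmas.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element routine. "acc routine seq" tells the OpenACC
   compiler that this function is callable sequentially from device code.
   Assumption: in an interop build only the declaration would stay here and
   the body would be an extern "C" __device__ function in a .cu file. */
#pragma acc routine seq
float saxpy_elem(float a, float x, float y)
{
    return a * x + y;
}

/* Hypothetical wrapper: computes y[i] = a * x[i] + y[i] for i in [0, n).
   Each loop iteration calls the routine declared above. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* x is only read on the device; y is copied in and back out. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = saxpy_elem(a, x[i], y[i]);
    }
}
```

Calling `saxpy_acc(4, 2.0f, x, y)` with `x = {1, 2, 3, 4}` and `y` filled with ones leaves `y` as `{3, 5, 7, 9}`, whether or not the loop is actually offloaded.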