Hello,
Following the suggestion in Cannot dynamically load a shared library containing both OpenACC and CUDA code - #2 by MatColgrove, I have been trying to use `if target (nv::target::is_device)` statements to guard a few parts of our code that are dynamically unreachable on the device and cause problems when compiled for GPU. In general this has been working fine.
I have now come across a new failure mode, which can easily be reproduced with:
```
$ cat test.cpp
#include <nv/target>
int main() {
    int x{};
    #pragma acc kernels
    {
        if target (nv::target::is_device) {
            x = 1;
        }
    }
    return x;
}
```
The first invocation succeeds; adding the CUDA include path to the search path triggers the error:
```
$ nvc++ -V22.3 -acc -cuda -gpu=cuda11.6 test.cpp
$ nvc++ -V22.3 -acc -cuda -gpu=cuda11.6 -I/path/to/cuda-11.6.1-ngetva/include test.cpp
NVC++-S-1062-Support procedure called within a compute region - __builtin_current_device_sm (test.cpp: 6)
NVC++/x86-64 Linux 22.3-0: compilation completed with severe errors
```
I can’t reproduce this with CUDA 11.7.0 and NVHPC 22.5, but I’m not sure whether that is just luck or a deliberate fix. Internally we are still on 22.3 because of some (presumably unrelated) issues between 22.5 and our code base.
I don’t think we are doing anything wrong here, so this looks like a compiler bug; I hope the reproducer is helpful.
Best, Olli
P.S. By way of background, this issue showed up in a Spack build of our application; Spack implicitly adds extra -I... arguments via its compiler wrappers. We have picked up various fixes and workarounds over the years as we have tried to support successive PGI/NVHPC releases and make our builds more robust. Some notes:
- Historically we have had to pass the explicit `-gpu=cudaX.Y` flag to avoid the default CUDA version differing between nodes with and without GPUs.
- Spack does not know about the versions of CUDA bundled with NVHPC, so depending on CUDA brings an external installation into the picture. This may eventually be improved in Virtual CUDA by wyphan · Pull Request #30748 · spack/spack · GitHub. We can test whether removing the explicit CUDA dependency is a viable workaround with recent NVHPC.
- Outside Spack, we have historically had to explicitly add an external version of CUDA (with exactly the same `version.json` as the bundled one) to avoid issues such as Nvcc only partially respects CUDA_HOME ("Input file newer than toolkit").