If I remember correctly, I had to manually fix some of the broken CUDA symlinks in the /opt/nvidia/hpc_sdk/Linux_<platform>/<version>/cuda directory. This is true if you didn’t install the multi-CUDA version. Can somebody from NVIDIA verify this?
The compiler defaults to the CUDA version that matches the installed CUDA driver. The cuda sub-option is only needed if you want to use a different CUDA version than the default.
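To make the point about the sub-option concrete, here is a minimal Python sketch that builds a compile command and adds `-gpu=cudaX.Y` only when you want to override the default. The helper name `build_compile_cmd`, the source file name `example.f`, and the version string in the usage note are hypothetical and only for illustration; check which CUDA versions your SDK install actually ships before picking one.

```python
def build_compile_cmd(source, cuda_version=None, compiler="nvfortran"):
    """Build a compile command as a list of arguments.

    The -gpu=cudaX.Y sub-option is added only when cuda_version is given;
    otherwise the compiler is left to pick its default CUDA version.
    """
    cmd = [compiler, "-acc"]
    if cuda_version is not None:
        # Explicit override, e.g. cuda_version="11.0" gives -gpu=cuda11.0
        cmd.append(f"-gpu=cuda{cuda_version}")
    cmd.append(source)
    return cmd
```

For example, `build_compile_cmd("example.f")` leaves the CUDA choice to the compiler, while `build_compile_cmd("example.f", "11.0")` pins it. The resulting list can be passed to `subprocess.run` if you want to script the build.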
What CUDA driver do you have installed and what “cuda” option are you using when it compiles successfully?
The error is a code generation issue, so I wouldn’t expect it to matter which CUDA version you’re using, though it’s possible. As you know, we run POT3D in our daily performance testing and we’ve not seen any issues, nor do we use -gpu=cudaX.Y. Is this a different version than what we have?
My system has the CUDA driver 11.2 installed (the most recent one that the “cuda” package in Ubuntu 20.04 installs).
I had thought the compiler would default to the most recent CUDA included in the NV compiler package, but it does make sense to try to sync it with the driver version.
However, since the CUDA libraries packaged with the NV compilers are often (or always) “behind” the most recent CUDA driver release, maybe there could be a catch for this case, so that if the driver’s CUDA version is not included with the NV compiler, it just uses the most recent one it has?
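The fallback suggested here could look roughly like the Python sketch below. This is not how the compiler actually chooses a CUDA version; it only illustrates the proposed rule. The function names and the version lists in the usage note are hypothetical.

```python
def parse_version(version):
    """Turn a string like '11.2' into a tuple (11, 2) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def pick_cuda(driver_version, bundled_versions):
    """Pick a CUDA version under the proposed fallback rule.

    Use the driver's version if it is bundled with the compiler;
    otherwise fall back to the most recent bundled version.
    """
    if not bundled_versions:
        raise ValueError("no bundled CUDA versions available")
    if driver_version in bundled_versions:
        return driver_version
    # Compare numerically so that '10.1' ranks above '9.2'.
    return max(bundled_versions, key=parse_version)
```

With a driver at 11.2 and a hypothetical bundled set of `["10.2", "11.0", "11.1"]`, `pick_cuda` returns `"11.1"` instead of failing on the missing 11.2.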
I’ve tried my best to replicate this on a system with an 11.2 CUDA driver using a fresh install of 20.11, but no luck. POT3D compiles successfully for me, so unfortunately I’m not sure what’s wrong. Is this the same version of POT3D that I have?
It is basically the same version (just in our old fixed format).
I am compiling on a laptop with a GTX 1060 with Optimus after loading the gpu (although I am not sure why this would make a difference).
When I compile with the CUDA version specified, everything works fine, so it’s not a big deal.
If I find another system where this happens, I will let you know.