NV 20.11 compilation fails with default flags (need to specify cuda version)

caplanr · January 21, 2021, 8:10pm

Hi,

Normally when I compile my OpenACC code, I use:

-acc=gpu -gpu=cc##,cuda##.# -Minfo=accel

Today, I tried compiling in a more "default" way:

-acc=gpu

When I try this, I get the following error:

nvvmCompileProgram error 9: NVVM_ERROR_COMPILATION.
Error: /tmp/pgacchGi2bva1W-aVx.gpu (51, 23): parse expected comma after load’s type
ptxas /tmp/pgacc3Gi2bLAfQVpGN.ptx, line 1; fatal : Missing .version directive at start of file ‘/tmp/pgacc3Gi2bLAfQVpGN.ptx’
ptxas fatal : Ptx assembly aborted due to errors
NVFORTRAN-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (pot3d.f: 8401)
NVFORTRAN/x86-64 Linux 20.11-0: compilation aborted
make: *** [Makefile:40: pot3d.o] Error 2

I also tried using “-gpu=cc##” and it still fails.

It seems the compiler requires the CUDA version to be specified.

I had thought this was an optional flag - is it now required?

Ron

wyphan · January 21, 2021, 8:38pm

If I remember correctly, I had to manually fix some of the broken CUDA symlinks in the /opt/nvidia/hpc_sdk/Linux_<platform>/<version>/cuda directory. This is true if you didn’t install the multi-CUDA version. Can somebody from NVIDIA verify this?

MatColgrove · January 21, 2021, 9:11pm

The compiler will default to the CUDA version of the installed CUDA driver. The cuda sub-option is only needed if you want to use a different CUDA version than the default.

What CUDA driver do you have installed and what “cuda” option are you using when it compiles successfully?

The error is a code generation issue, so I wouldn’t expect the CUDA version to matter, though it’s possible. As you know, we run POT3D in our daily performance testing and we’ve not seen any issues, and we don’t use -gpu=cudaX.Y. Is this a different version than what we have?
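
For readers who want to answer the driver question above on their own machine, here is a minimal sketch. It assumes the `nvidia-smi` header contains a "CUDA Version: X.Y" field, as recent drivers print; the function name `driver_cuda_version` is hypothetical and not part of any NVIDIA tool.

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi output on stdin and print the
# CUDA version reported by the driver (for example 11.2).
# Assumes the header contains a field of the form "CUDA Version: X.Y".
driver_cuda_version() {
    sed -n 's/.*CUDA Version: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Typical use would be `nvidia-smi | driver_cuda_version`. The printed value can then be compared with the CUDA versions that ship in the installed HPC SDK.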

MatColgrove · January 21, 2021, 9:12pm

Wyphan, can you give details about what you mean by having to fix broken CUDA symlinks? Is this something you reported?

wyphan · January 21, 2021, 9:16pm

Is this something you reported?

Not yet, should I start a new thread for this?

MatColgrove · January 21, 2021, 9:30pm

Yes, please, since I believe it’s unrelated to Ron’s issue.

caplanr · January 21, 2021, 9:38pm

Hi,

My system has the CUDA driver 11.2 installed (the most recent one that the “cuda” package in Ubuntu 20.04 installs).

I had thought the compiler would default to the most recent CUDA included in the NV compiler package, but it does make sense to try to sync it with the driver version.

However, since the CUDA libraries packaged with NV are often (or always) “behind” the most recent CUDA driver release, maybe there could be a fallback for this case, so that if the driver’s CUDA version is not included with the NV compiler, it just uses the most recent one it has?

Ron

MatColgrove · January 21, 2021, 10:07pm

Correct, it would use the latest CUDA version installed if the CUDA driver is newer. Though, I’m still unclear why this would cause this error.

caplanr · January 21, 2021, 10:24pm

Hi,

I have the multi-CUDA version of the SDK installed if that helps?

I just re-tested after a reboot and I can confirm that this works:

-O3 -acc=gpu -gpu=cc60,cuda11.1 -Minfo=accel

and this causes the error:

-O3 -acc=gpu -Minfo=accel

This also works:

-O3 -acc=gpu -gpu=cuda11.1 -Minfo=accel

Ron
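
The three flag sets in the post above differ only in the -gpu sub-options. As a sketch of how a build script might pin the CUDA version only when asked to, the helper below assembles the same flag strings. The function name `acc_flags` and its two arguments are hypothetical; the flags themselves are the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print nvfortran OpenACC flags.
#   $1 = compute capability digits, e.g. 60 (may be empty)
#   $2 = CUDA version, e.g. 11.1 (may be empty)
# With both arguments empty it prints the "default" flags that failed above.
acc_flags() {
    cc="$1"
    cuda_ver="$2"
    sub=""
    if [ -n "$cc" ]; then
        sub="cc${cc}"
    fi
    if [ -n "$cuda_ver" ]; then
        # Add a comma only when a cc sub-option is already present.
        sub="${sub:+${sub},}cuda${cuda_ver}"
    fi
    if [ -n "$sub" ]; then
        printf '%s\n' "-acc=gpu -gpu=${sub} -Minfo=accel"
    else
        printf '%s\n' "-acc=gpu -Minfo=accel"
    fi
}
```

A compile line could then look like `nvfortran -O3 $(acc_flags 60 11.1) -c pot3d.f`, which expands to the first working combination listed above.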

MatColgrove · January 22, 2021, 4:30pm

Hi Ron,

I’ve tried my best to replicate this on a system with an 11.2 CUDA driver using a fresh install of 20.11, but no luck. POT3D compiles successfully for me, so unfortunately I’m not sure what’s wrong. Is this the same version of POT3D that I have?

-Mat

caplanr · January 22, 2021, 6:04pm

Hi,

It is basically the same version (just in our old fixed format).

I am compiling on a laptop with a GTX 1060 with Optimus after loading the gpu (although I am not sure why this would make a difference).

When I compile with the CUDA version specified, everything works fine, so it’s not a big deal.
If I find another system where this happens, I will let you know.

Ron
