Nvcc lower version than CUDA causes compiled code runtime error 300

The nvcc version does not match the CUDA version reported by the driver on this PC.
The highest arch I can set with this nvcc is sm_75. When I compile with it and run the result on the same PC, I get error 300. I tried other arch values and got the same error.

From what I saw online, I should compile with sm_80 or higher for the GPU and CUDA version on this PC.

I tried a sample code I got here.
When I tried it on a different PC where the nvcc version matches the CUDA version, it ran successfully.

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

From nvcc --help:

Allowed values for this option:  'compute_30','compute_32','compute_35',
        'compute_37','compute_50','compute_52','compute_53','compute_60','compute_61',
        'compute_62','compute_70','compute_72','compute_75','sm_30','sm_32','sm_35',
        'sm_37','sm_50','sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72',
        'sm_75'.
$ nvidia-smi 
Tue Sep 24 16:58:09 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06             Driver Version: 535.183.06   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A4500 Laptop GPU    On  | 00000000:01:00.0 Off |                  Off |
| N/A   54C    P8              15W /  90W |    142MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Compilation:

nvcc -arch=sm_75 -Xptxas="-v" --cubin qkernel.cu 
qkernel.cu(17): warning: variable "x" was declared but never referenced

qkernel.cu(18): warning: variable "y" was declared but never referenced

ptxas info    : 0 bytes gmem, 1 bytes cmem[3]
ptxas info    : Compiling entry function '_Z6kernelPhPKf' for 'sm_75'
ptxas info    : Function properties for _Z6kernelPhPKf
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 14 registers, 368 bytes cmem[0]
$ nvcc -o qexe qmain.cc -lcuda
qmain.cc: In function ‘int main()’:
qmain.cc:28:20: warning: ‘CUresult cuCtxDetach(CUcontext)’ is deprecated [-Wdeprecated-declarations]
     cuCtxDetach(ctx);
                    ^
In file included from qmain.cc:1:
/usr/include/cuda.h:4125:36: note: declared here
 __CUDA_DEPRECATED CUresult CUDAAPI cuCtxDetach(CUcontext ctx);
                                    ^~~~~~~~~~~
qmain.cc:28:20: warning: ‘CUresult cuCtxDetach(CUcontext)’ is deprecated [-Wdeprecated-declarations]
     cuCtxDetach(ctx);
                    ^
In file included from qmain.cc:1:
/usr/include/cuda.h:4125:36: note: declared here
 __CUDA_DEPRECATED CUresult CUDAAPI cuCtxDetach(CUcontext ctx);

Running:

$ ./qexe 
Success: qmain.cc@18
Success: qmain.cc@20
Success: qmain.cc@22
Error: 300 qmain.cc@26

Should I upgrade the nvcc? How do I do that?
Thanks,
Adi

The binary (SASS) compiled for cc7.5 will not run on your cc8.x GPU. I’m not sure why you are using that compilation sequence, but if you intend to keep it (compiling to a binary only, as opposed to also embedding PTX, which is what -arch=sm_75 would normally do with runtime API usage), then you will need to update your CUDA install.
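To make the compatibility rule concrete, here is a minimal sketch in plain C++ (no CUDA headers needed). The names `ComputeCapability` and `cubinRunsOn` are hypothetical, made up for illustration, and cc8.6 is only an assumed value for the "cc8.x" GPU in this thread. It encodes the documented rule that a cubin built for sm_XY runs only on devices with the same major revision X and a minor revision of at least Y.

```cpp
// Hypothetical helper types; not part of any CUDA API.
struct ComputeCapability {
    int major_rev;  // e.g. 7 for sm_75
    int minor_rev;  // e.g. 5 for sm_75
};

// A cubin (SASS) is binary compatible only within one major revision,
// and only with devices whose minor revision is >= the cubin's.
constexpr bool cubinRunsOn(ComputeCapability cubin, ComputeCapability device) {
    return cubin.major_rev == device.major_rev &&
           cubin.minor_rev <= device.minor_rev;
}

// cc8.6 is an assumption standing in for the "cc8.x" GPU discussed here.
static_assert(!cubinRunsOn({7, 5}, {8, 6}), "an sm_75 cubin does not load on cc8.6");
static_assert(cubinRunsOn({8, 0}, {8, 6}), "an sm_80 cubin loads on cc8.6");
static_assert(!cubinRunsOn({8, 6}, {8, 0}), "an sm_86 cubin does not load on cc8.0");
```

Under that rule, no arch value the 10.1 toolchain accepts (sm_30 through sm_75) can produce a cubin for a cc8.x device, which would be consistent with every arch value giving the same error.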

It should certainly be possible to compile e.g. the vectorAdd sample code on your machine using -arch=sm_75, and run it on your cc8.x GPU.
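The reason the vectorAdd case works is that a runtime API build with -arch=sm_75 embeds both sm_75 SASS and compute_75 PTX, and the driver can JIT-compile that PTX for a newer GPU. Below is a rough sketch of that selection in plain C++ (C++14 or later for the constexpr loop). It is a simplified model, not the driver's actual algorithm; all type and function names are hypothetical, and cc8.6 is again an assumed value for the cc8.x GPU.

```cpp
// Hypothetical model types; not part of any CUDA API.
struct ComputeCapability { int major_rev; int minor_rev; };
enum class ImageKind { Sass, Ptx };
struct EmbeddedImage { ImageKind kind; ComputeCapability target; };
enum class LoadPath { NativeSass, JitFromPtx, NoUsableImage };

// SASS: same major revision, device minor >= image minor.
constexpr bool sassRunsOn(ComputeCapability img, ComputeCapability dev) {
    return img.major_rev == dev.major_rev && img.minor_rev <= dev.minor_rev;
}

// PTX: can be JIT-compiled for any device of equal or higher capability.
constexpr bool ptxJitsFor(ComputeCapability img, ComputeCapability dev) {
    return img.major_rev < dev.major_rev ||
           (img.major_rev == dev.major_rev && img.minor_rev <= dev.minor_rev);
}

// Prefer a compatible SASS image; otherwise fall back to JIT from PTX.
constexpr LoadPath pickLoadPath(const EmbeddedImage* images, int count,
                                ComputeCapability dev) {
    bool ptxUsable = false;
    for (int i = 0; i < count; ++i) {
        if (images[i].kind == ImageKind::Sass && sassRunsOn(images[i].target, dev))
            return LoadPath::NativeSass;
        if (images[i].kind == ImageKind::Ptx && ptxJitsFor(images[i].target, dev))
            ptxUsable = true;
    }
    return ptxUsable ? LoadPath::JitFromPtx : LoadPath::NoUsableImage;
}

// What -arch=sm_75 embeds for a runtime API build: SASS plus PTX.
constexpr EmbeddedImage archSm75Build[] = {
    {ImageKind::Sass, {7, 5}}, {ImageKind::Ptx, {7, 5}}};
// What --cubin produces: SASS only, nothing to JIT from.
constexpr EmbeddedImage cubinOnly[] = {{ImageKind::Sass, {7, 5}}};

static_assert(pickLoadPath(archSm75Build, 2, {8, 6}) == LoadPath::JitFromPtx, "");
static_assert(pickLoadPath(cubinOnly, 1, {8, 6}) == LoadPath::NoUsableImage, "");
```

The second assertion corresponds to the --cubin build in the original post: with only sm_75 SASS available, there is nothing the driver can use on a cc8.x device.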

You can update your CUDA install using the assets here, following the instructions in the CUDA Linux install guide.

Thank you for your response.
I’m trying to use the binary method (cubin) rather than PTX to avoid compilation at runtime. Our application takes a long time to open the video, and we suspect this is related to PTX loading.
To be more specific: the issue occurs the first time we run our app after upgrading the NVIDIA driver on a PC (currently upgrading to 560).

I’ll try updating CUDA as you suggested and check the sample code as well. Thanks.

Then you must update your CUDA install to be able to target the GPU directly.
