Error Message "This file was compiled: -ta=tesla:cc35"

Hi Mat, I would appreciate your help with the error message I get: “This file was compiled: -ta=tesla:cc35”. I’m using a Tesla K40c GPU and have downloaded and installed driver 460.106.00 on Ubuntu 18.04. The Makefile has the following options:

FORTRAN = nvfortran
LOADER = nvfortran
LOADOPTS = -O4 -fast -mcmodel=medium -Mlarge_arrays -Minfo=accel -Mcuda=lineinfo -Minline -o exe
LIBRARY = -L /opt/nvidia/hpc_sdk/Linux_x86_64/21.2/cuda/11.2/lib64 -lcudart -lblas

Thank you in advance.

Nikos
P.S. I would also appreciate it if you could DM me; we’re facing an NVIDIA-related issue at the lab (Technical University of Crete) and would like to ask for your help.

Hi Nikos,

This is usually a driver issue. If the runtime can’t load the CUDA driver (libcuda.so), it tries to fall back to running the host code, but since the binary is compiled to target only a CC35 device, execution fails.

What’s the output from the utilities “nvaccelinfo” and “nvidia-smi”? Are you able to run a CUDA C example code?
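If you want to check the driver load step directly, something like the following Python sketch might help. It is not what the OpenACC runtime does internally; it only mirrors the idea by trying to open the driver library and call cuInit. The library name libcuda.so.1 and the helper name probe_cuda_driver are assumptions for illustration.

```python
import ctypes


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Try to load the CUDA driver library and initialize it.

    Returns (status, detail), where status is "missing" if the library
    cannot be loaded, "init_failed" if cuInit returns a non-zero code,
    and "ok" otherwise. lib_name is an assumed default; adjust as needed.
    """
    try:
        lib = ctypes.CDLL(lib_name)
        cu_init = lib.cuInit
    except (OSError, AttributeError) as exc:
        # Library not found, not loadable, or missing the cuInit symbol.
        return "missing", str(exc)

    # CUresult cuInit(unsigned int Flags); Flags must be 0.
    cu_init.argtypes = [ctypes.c_uint]
    cu_init.restype = ctypes.c_int
    rc = cu_init(0)
    if rc != 0:
        return "init_failed", f"cuInit returned {rc}"
    return "ok", "driver library loaded and initialized"


if __name__ == "__main__":
    status, detail = probe_cuda_driver()
    print(status, "-", detail)
```

A "missing" or "init_failed" result would point at the driver installation rather than at the compiler flags.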

The K40s were deprecated in CUDA 11.0. I’ve not seen anything that indicates support was dropped in later 11.x releases, but it’s possible that’s what’s going on. The K40 systems that I have access to all have either CUDA 10.2 or 11.0 drivers, so I can’t check.

P.S. I would also appreciate it if you could DM me; we’re facing an NVIDIA-related issue at the lab (Technical University of Crete) and would like to ask for your help.

Sure, though if it’s a hardware issue, I may not be the best person to advise. I can ask who the SA for Greece is, and they may be better able to assist.

-Mat

Hi again Mat,

I can’t run a sample code (acc_f1.f90) from the hpc_sdk sample folder. I get a
/usr/bin/ld: cannot open output file acc_f1.out: Permission denied
pgacclnk: child process exit status 1: /usr/bin/ld

message. Moreover,

  • nvaccelinfo responds with
    CUDA Driver Version: 11040
    could not initialize CUDA runtime, error code=100
    No accelerators found.
    Check the permissions on your CUDA device

  • and nvidia-smi with
    NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

(lspci -v indicates there’s a GK110bl NVIDIA 3D controller attached, though.)

Ok, it’s definitely a CUDA driver issue. You might try re-installing or using a different CUDA driver version. According to the following doc, Kepler wasn’t dropped until the 495 drivers, so I’d think 460 would be OK, but this is outside my area of expertise, so I don’t know for sure.

https://docs.nvidia.com/deploy/cuda-compatibility/index.html#faq
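For reading the nvaccelinfo output quoted above, here is a small sketch. It assumes the "CUDA Driver Version" number uses the usual CUDA encoding of 1000*major + 10*minor, and that the reported error code is a standard CUDA error code, where 100 means no CUDA-capable device was detected. The helper names and the short lookup table are made up for illustration and are not exhaustive.

```python
# Assumption: a handful of well-known CUDA error codes, not a full table.
CUDA_ERROR_NAMES = {
    0: "CUDA_SUCCESS",
    100: "CUDA_ERROR_NO_DEVICE",
    101: "CUDA_ERROR_INVALID_DEVICE",
    999: "CUDA_ERROR_UNKNOWN",
}


def decode_cuda_version(encoded):
    """Split an encoded CUDA version (1000*major + 10*minor) into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def describe_error(code):
    """Return the symbolic name for a known error code, or a fallback string."""
    return CUDA_ERROR_NAMES.get(code, f"unrecognized code {code}")


if __name__ == "__main__":
    print(decode_cuda_version(11040))
    print(describe_error(100))
```

With the values from the post, 11040 decodes to CUDA 11.4 and code 100 maps to the no-device error, which fits nvidia-smi being unable to reach the driver.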

As for the permission issue on the example files, I’d copy these files to a directory where you have write permission so you can create the binary. Granted, they’ll still fail to run due to the driver issue.
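The copy step could look roughly like this in Python; a plain recursive copy from the shell works just as well. The function name copy_to_writable and the "sdk_example_" prefix are hypothetical, and the source directory is whatever examples folder your install uses.

```python
import os
import shutil
import stat
import tempfile


def copy_to_writable(src_dir, dest_parent=None):
    """Copy src_dir into a fresh directory that the current user can write to.

    dest_parent defaults to the system temp directory (an assumption;
    a directory under your home folder is an equally good choice).
    Returns the path of the copied directory.
    """
    dest_parent = dest_parent or tempfile.gettempdir()
    holder = tempfile.mkdtemp(prefix="sdk_example_", dir=dest_parent)
    name = os.path.basename(os.path.normpath(src_dir))
    target = os.path.join(holder, name)
    shutil.copytree(src_dir, target)
    # copytree preserves the source directory mode, which may be read-only,
    # so make sure the owner can write the linker output there.
    os.chmod(target, os.stat(target).st_mode | stat.S_IWUSR)
    return target
```

Building from the returned directory should avoid the "cannot open output file" error from ld, since the linker can then create acc_f1.out.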

Hi again Mat, and thanks for your help.

I had to roll back to the 410 drivers. I’m not sure if this is the best way to work things out, but at least it’s working.

Thanks again,
Nikos