CentOS7 : OptiX error: A supported NVIDIA GPU could not be found

I’m trying to set up an OptiX environment under CentOS 7. I have a GeForce RTX 2080 Ti and I installed the NVIDIA driver (version 418.56). I also installed Cuda-toolkit-10.1 and OptiX-SDK-6.0.0.

When I run the CUDA samples, everything works (see the deviceQuery output below), but when I run the OptiX samples I get the error: “OptiX error: A supported NVIDIA GPU could not be found”.

I know there is already a thread about this error, but in that case the problem came from the driver version: the poster was on 418.40 and solved it by installing 418.56. I am already on 418.56, so I assume the problem does not come from the driver.

I also set LD_LIBRARY_PATH to the OptiX library directory, and I tried setting the CUDA_VISIBLE_DEVICES variable too.

Here is the nvidia-smi command output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04    Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:65:00.0  On |                  N/A |
| 27%   40C    P8    13W / 250W |    416MiB / 10981MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

Here is the “deviceQuery” CUDA sample output (only the relevant info):

Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2080 Ti"
  CUDA Driver Version / Runtime Version          10.1 / 10.1
  CUDA Capability Major/Minor version number:    7.5
  .
  .
  .

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS

Hi ABardoux,

I’m not sure what’s wrong, but it looks like you tried to do the right things. Maybe the first thing to do is run one of the samples under strace and see which OptiX .so files it’s trying and failing to load. Also check which ones it does load; perhaps it’s picking one up from somewhere you’re not expecting.
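As a rough sketch of that check, the snippet below filters strace output for shared-library opens and reports whether each one succeeded. It assumes the trace was captured with something like `strace -f -e trace=open,openat -o trace.txt ./optixHello` (the sample name is only an example), and the keyword list and the paths in SAMPLE are hypothetical, chosen purely for illustration.

```python
import re

# Matches open()/openat() lines as printed by strace, with or without a "[pid N]" prefix.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:AT_FDCWD, )?"([^"]+)"[^)]*\)\s*=\s*(-?\d+)')


def library_opens(trace_text, keywords=("optix", "nvidia", "cuda")):
    """Return (path, succeeded) for every traced open of a matching .so file.

    The keywords are an assumption about which library names are interesting;
    adjust them to whatever your trace actually shows.
    """
    results = []
    for line in trace_text.splitlines():
        match = OPEN_RE.search(line)
        if not match:
            continue
        path, ret = match.group(1), int(match.group(2))
        name = path.rsplit("/", 1)[-1].lower()
        if ".so" in name and any(k in name for k in keywords):
            # strace prints a negative return value (e.g. -1 ENOENT) on failure.
            results.append((path, ret >= 0))
    return results


# Hypothetical trace lines, not taken from a real run.
SAMPLE = """\
openat(AT_FDCWD, "/opt/optix/lib64/liboptix.so.6.0.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib64/libnvoptix.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
"""

for path, ok in library_opens(SAMPLE):
    print("loaded " if ok else "FAILED ", path)
```

With a real trace, `library_opens(open("trace.txt").read())` gives the same list; a library that fails in every search directory, or loads from an unexpected directory, is the thing to look at.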


David.

Hi ABardoux,

I just noticed that while you have driver version 418.56, you still have NVIDIA-SMI version 418.40, so perhaps you did not completely uninstall the old driver before installing the new driver? I was only able to get things working on Ubuntu after uninstalling all NVIDIA software and starting fresh.
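A quick way to spot this kind of mismatch is to compare the two version strings in the nvidia-smi banner. The sketch below is only an illustration: it assumes the banner layout shown earlier in this thread, and the function names are made up for the example.

```python
import re

# Pulls the tool version and the driver version out of the nvidia-smi banner line.
HEADER_RE = re.compile(r"NVIDIA-SMI\s+([\d.]+)\s+Driver Version:\s+([\d.]+)")


def smi_versions(smi_output):
    """Return (nvidia_smi_version, driver_version) found in nvidia-smi output."""
    match = HEADER_RE.search(smi_output)
    if match is None:
        raise ValueError("no NVIDIA-SMI header line found")
    return match.group(1), match.group(2)


def versions_consistent(smi_output):
    """True when the nvidia-smi tool and the driver report the same version."""
    tool, driver = smi_versions(smi_output)
    return tool == driver


# Banner line copied from the nvidia-smi output posted above.
HEADER = "| NVIDIA-SMI 418.40.04    Driver Version: 418.56       CUDA Version: 10.1     |"

print(smi_versions(HEADER))
print("consistent" if versions_consistent(HEADER) else "mismatch: leftovers from another install?")
```

On a live system the text could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`. The version of the kernel module that is actually loaded can also be read from /proc/driver/nvidia/version.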

Hi,

Thanks for your feedback.

When I turned my computer on, the whole graphical environment was down. I tried to re-install the driver, but I couldn’t because of a Linux header file conflict, so I decided to uninstall the driver in order to re-install it from a clean base. The uninstall worked, but now I can’t re-install the driver, because the nvidia kernel modules are already loaded and I can’t unload them. This seems to be because version 418.40 is installed (I don’t understand where it came from), and everything is working (GNOME, CUDA, OptiX).

So the problem was probably due either to a conflict between driver v418.56 and NVIDIA-SMI v418.40, as you noticed, nljones, or to an error I missed when I installed driver version 418.56.

Anyway, is there a way to upgrade everything to the version 418.56 (or 418.74)?

Thanks,
Arnaud

You might find Ingo Wald’s notes on driver installation helpful: https://ingowald.blog/installing-the-latest-nvidia-driver-cuda-and-optix-on-linux-ubuntu-18-04/

Specifically, he uses run level 3 to install the drivers so that the kernel modules aren’t loaded and can be replaced.
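On CentOS 7, which uses systemd, run level 3 corresponds to multi-user.target, so `systemctl isolate multi-user.target` (as root) should drop out of the graphical session. Before running the installer it can help to confirm that nothing still holds the nvidia modules. The sketch below parses /proc/modules for that; the function name is made up for the example and the sample values are hypothetical.

```python
def loaded_nvidia_modules(proc_modules_text):
    """Return (name, refcount, used_by) for each loaded nvidia* kernel module.

    Expects the text of /proc/modules, whose columns are:
    name, size, reference count, users, state, address.
    """
    modules = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith("nvidia"):
            continue
        name = fields[0]
        # The count is "-" on kernels built without module unloading support.
        refcount = int(fields[2]) if fields[2].isdigit() else None
        # Users are comma-separated with a trailing comma, or "-" when there are none.
        used_by = [u for u in fields[3].split(",") if u and u != "-"]
        modules.append((name, refcount, used_by))
    return modules


# Hypothetical /proc/modules content, for illustration only.
SAMPLE = """\
nvidia_drm 39819 3 - Live 0xffffffffc0000000 (POE)
nvidia_modeset 1109636 1 nvidia_drm, Live 0xffffffffc0000000 (POE)
nvidia 17596134 1054 nvidia_modeset, Live 0xffffffffc0000000 (POE)
e1000e 249372 0 - Live 0xffffffffc0000000
"""

for name, refcount, used_by in loaded_nvidia_modules(SAMPLE):
    print(name, "refcount:", refcount, "used by:", ", ".join(used_by) or "nothing")
```

On the real machine, `loaded_nvidia_modules(open("/proc/modules").read())` gives the current state. A non-zero reference count generally means the module is still in use, which is usually why unloading it fails while the desktop is running.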


David.