Hello, when I try running the Optix SDK tests, all of them give me an error saying: “Optix: Unknown Error”.
This is on a node with three GTX 980s running CentOS.
Does anyone know what could be the issue, or if it’s possible to force Optix to give me a more detailed error message than just “Unknown Error”?
That could be many things: the display driver is too old, CUDA devices were not found, the OpenGL driver was not found or the wrong one was picked, possibly X11 is not running, etc.
CentOS is not listed as supported OS in the OptiX 4.1.1 release notes.
I’m not using Linux myself and can’t say whether it’s generally not going to work on CentOS.
Please provide some more detailed information about the system configuration.
This is my minimum checklist for any OptiX problem reports.
OS version and bitness, installed GPU(s), NVIDIA display driver version, OptiX version (major.minor.micro), CUDA toolkit version used to compile PTX code.
Which OptiX “tests” do you mean exactly?
Assuming this is OptiX 4.1.1, and CUDA toolkit 8.0.
What does the CUDA toolkit device query example report about the CUDA devices on the system?
Do OpenGL programs run on the system and use the NVIDIA display driver?
If not, only the examples without OpenGL interop would work.
Does it work when only using one GPU? Then two?
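Most of the checklist above can be gathered with a few shell commands. The following is only a rough sketch: the function name collect_sysinfo is made up for illustration, and it assumes nvidia-smi and nvcc are on the PATH (it reports when they are not, which is useful information in itself). The OptiX version is not covered here and still has to be taken from the SDK you installed.

```shell
#!/bin/sh
# Sketch: print OS, GPU/driver and CUDA toolkit information for a problem report.
# collect_sysinfo is a hypothetical helper name, not part of any SDK.

collect_sysinfo() {
    echo "== OS =="
    # /etc/os-release is present on CentOS 7 and most current distributions.
    if [ -r /etc/os-release ]; then
        grep '^PRETTY_NAME=' /etc/os-release
    fi
    # Machine architecture, e.g. x86_64 for a 64-bit system.
    uname -m

    echo "== GPUs and driver =="
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU: index, name, driver version.
        nvidia-smi --query-gpu=index,name,driver_version --format=csv,noheader \
            || echo "nvidia-smi failed (is the NVIDIA kernel module loaded?)"
    else
        echo "nvidia-smi not found"
    fi

    echo "== CUDA toolkit =="
    if command -v nvcc >/dev/null 2>&1; then
        # The line containing "release" carries the toolkit version.
        nvcc --version | grep -i release
    else
        echo "nvcc not found"
    fi
}

collect_sysinfo
```

Pasting the output of this into the thread, together with the OptiX version, covers the minimum information requested.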
Thank you for getting back to me.
I know OptiX seems to work on my local Linux installation; however, I am running Arch on that system. I’m going to install CentOS on a separate partition and see if it works there as well. If it does, it may be a system configuration issue on the node.
OS Version: CentOS 7 64 bit
GPUs: 3xGTX 980s 4GB
Display Driver Version: 375.26
Optix Version: Optix 4.1.0
CUDA version: Cuda 8 (Cuda 7 also does not work)
Under optix-4.1.0/SDK-precompiled-samples there are many different tests, such as optixHello, optixDeviceQuery, etc.
When I run optixDeviceQuery I get:
OptiX Error: ‘Unknown error
(/root/sw/wsapps/raytracing/rtsdk/rel4.1/samples_sdk/optixDeviceQuery/optixDeviceQuery.cpp:53)’
That is at least a slightly more specific error, but it still doesn’t give me any actual information.
Other OpenGL programs do run on the system and use the NVIDIA display driver, for example, glxspheres64 runs, as does glxgears.
Thanks for your help!
I asked a Linux expert and he’s running CentOS 7 with the OptiX 4.0.0 and 4.1.1 public releases and doesn’t see an issue. He didn’t have OptiX 4.1.0 installed at this time. I would generally recommend using the newest available OptiX version of any major version as well.
A shot in the dark: It is possible that your display is not connected to the primary CUDA device, but that should not affect optixDeviceQuery.
It’s more likely that the NVIDIA driver is installed but not loaded, so you’re actually using the nouveau driver.
You can check which driver is loaded by running:
lsmod | grep nvidia
lsmod | grep nouveau
If nouveau is running you would need to blacklist it and reboot.
This is normally done by the driver, but it’s unclear if your setup differs.
The blacklist file is:
$ cat /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
# generated by nvidia-installer
options nouveau modeset=0
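The two lsmod checks above can be folded into one small helper that says which driver is in use. This is a sketch only: classify_driver is a hypothetical name, and it simply looks for a module called exactly nvidia or nouveau in the first column of lsmod-style output read from stdin.

```shell
#!/bin/sh
# Sketch: decide from lsmod output whether nouveau or the NVIDIA module is loaded.
# classify_driver is a hypothetical helper; it reads lsmod-style text on stdin.

classify_driver() {
    awk '
        $1 == "nvidia"  { nv = 1 }    # exact match, so nvidia_uvm etc. are ignored
        $1 == "nouveau" { nou = 1 }
        END {
            # nouveau takes priority: if it is loaded it needs to be blacklisted.
            if (nou)     print "nouveau loaded"
            else if (nv) print "nvidia loaded"
            else         print "no GPU kernel module loaded"
        }'
}

# Run against the live system when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | classify_driver
fi
```

If this prints "nouveau loaded", the blacklist file is missing or not taking effect; if it prints "no GPU kernel module loaded", the NVIDIA driver is installed but not loaded.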