Segmentation fault in OptiX 7.1 examples

Ubuntu 18.04
Driver Version: 450.51.06 CUDA Version: 11.0
Two GPUs: 1080 Ti, and a Titan X.

I have downloaded and successfully built the OptiX 7.1 SDK samples, but they all crash with a segmentation fault when I try to run them. A specific example:

./optixHello 
Segmentation fault (core dumped)

Building these examples with debug info and running them in gdb gives me the following result:

Thread 1 "optixHello" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in ?? ()
#1  0x00007fffbaf83725 in ?? () from /usr/lib/x86_64-linux-gnu/libnvoptix.so.1
#2  0x000055555555cb41 in optixDeviceContextCreate (context=0x7fffffffd0f0, options=0x7fffffffd1e0, fromContext=0x0)
at /home/jeff/src/NVIDIA-OptiX-SDK-7.1.0-linux64-x86_64/include/optix_stubs.h:277
#3  main (argc=<optimized out>, argv=0x7fffffffddc8) at /home/jeffr/src/NVIDIA-OptiX-SDK-7.1.0-linux64-x86_64/SDK/optixHello/optixHello.cpp:130
(gdb) frame 2
#2  0x000055555555cb41 in optixDeviceContextCreate (context=0x7fffffffd0f0, options=0x7fffffffd1e0, fromContext=0x0)
at /home/jeff/src/NVIDIA-OptiX-SDK-7.1.0-linux64-x86_64/include/optix_stubs.h:277
277	    return g_optixFunctionTable.optixDeviceContextCreate( fromContext, options, context );

I get similar results (all crashing in the context creation) for the other samples.

I have verified that my CUDA installation is working (I can compile and run the CUDA sample code).

I’m not using Linux, but my first knee-jerk reaction would be to try setting the environment variable CUDA_VISIBLE_DEVICES to either 0 or 1 to make only one of the GPUs visible to CUDA at a time.
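A minimal sketch of that test, assuming the two GPUs enumerate as 0 and 1. The helper name run_per_gpu is made up for illustration; it runs the same command once per device index with only that device exposed to CUDA:

```shell
# Hypothetical helper: run a command once per GPU index, with only that
# GPU visible to CUDA for the duration of the run.
run_per_gpu() {
    devices="$1"; shift
    for dev in $devices; do
        echo "== CUDA_VISIBLE_DEVICES=$dev =="
        # The assignment applies only to this one command invocation.
        CUDA_VISIBLE_DEVICES="$dev" "$@" || echo "exit status: $?"
    done
}

# Usage (indices 0 and 1 are an assumption about this machine):
#   run_per_gpu "0 1" ./optixHello
```

If the sample crashes with one index but not the other, that points at a device-specific problem; if it crashes with both, the cause is more likely in the driver installation.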

Normally that shouldn’t matter with OptiX 7, because the API itself has no notion of multi-GPU at all. That would all be managed by native CUDA host code inside the application, and the examples you cite are not written to use multiple GPUs.

Also the GPUs are both Pascal GP102 so there shouldn’t be other incompatibilities.

Since it finds the libnvoptix.so.1 library, the display driver seems to be installed correctly.
There is another driver component, libnvidia-rtcore.so, which is required as well. Make sure that is found.
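One way to check this, as a rough sketch: filter the dynamic linker cache for the driver libraries named in this thread. The helper name check_libs is hypothetical, and it assumes the libraries are registered in the ldconfig cache; if one is not listed there, looking directly in /usr/lib/x86_64-linux-gnu is a reasonable fallback.

```shell
# Hypothetical helper: read an `ldconfig -p` style listing on stdin and
# report whether each driver library mentioned in this thread appears in it.
check_libs() {
    listing=$(cat)
    for lib in libnvoptix.so.1 libnvidia-rtcore.so libcuda.so.1; do
        # -F: match the name literally, so the dots are not regex wildcards.
        if printf '%s\n' "$listing" | grep -qF "$lib"; then
            echo "found   $lib"
        else
            echo "MISSING $lib"
        fi
    done
}

# Usage:
#   ldconfig -p | check_libs
```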

If that’s not it, I’d leave this for Linux experts to answer.

(Side note: There is an additional pitfall for OptiX applications in a multi-GPU setup when using OpenGL interoperability for the final display of the image: I don’t expect it to be fast, or to work at all in some cases, if the interop resource does not reside on the same device. Check the --no-gl-interop option in the OptiX SDK 7.1.0 examples if there are display issues once the crash is solved.)

Looking at the results of running strace on the samples, I found that my system was missing the symlink from /usr/lib/x86_64-linux-gnu/libcuda.so.1 to /usr/lib/x86_64-linux-gnu/libcuda.so. I’m not sure how this happened, but it must have occurred when I upgraded CUDA.
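For anyone hitting the same crash, here is a rough sketch of how the libcuda links could be inspected. The helper name check_cuda_links is made up, the default directory is the one from this thread, and readlink -f assumes GNU coreutils. The strace line in the comment is only one plausible way to see which library opens fail; the exact syscall filter may need adjusting.

```shell
# Hypothetical helper: report whether libcuda.so and libcuda.so.1 exist in a
# directory, and where each one resolves to if it does.
check_cuda_links() {
    dir="${1:-/usr/lib/x86_64-linux-gnu}"
    for name in libcuda.so libcuda.so.1; do
        path="$dir/$name"
        if [ -e "$path" ]; then
            # -e follows symlinks, so the final target exists.
            echo "ok      $path -> $(readlink -f "$path")"
        elif [ -L "$path" ]; then
            # A symlink is present but its target is gone.
            echo "broken  $path"
        else
            echo "missing $path"
        fi
    done
}

# Usage:
#   check_cuda_links
# To see which library opens fail while the sample starts up:
#   strace -f -e trace=openat ./optixHello 2>&1 | grep -E 'libcuda|libnv'
```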
