SDK example output to file

Hi,

I’m new to OptiX and learning from the SDK examples.
I ssh into a remote machine to access an NVIDIA GPU. I can compile everything smoothly.

But running ./sample1 gives:

Xlib: extension “NV-GLX” missing on display “localhost:10.0”.
X Error of failed request: BadAlloc (insufficient resources for operation)
Major opcode of failed request: 154 (GLX)
Minor opcode of failed request: 24 (X_GLXCreateNewContext)
Serial number of failed request: 29
Current serial number in output stream: 30

I searched Google and found some old posts about this, but none of them helped.

In addition, I hoped I could at least get the output image via the --file option, but that doesn’t work either:

OptiX Error: Invalid context (Details: Function “RTresult _rtContextCompile(RTcontext)” caught exception: Unable to set the CUDA device., [3735714])
(/home/jlmiao/NVIDIA-OptiX-SDK-3.6.0-linux64/SDK/sample1/sample1.c:108)

Is the --file option intended for image output?

Thanks!

Sounds like your remote connection does not have any access to OpenGL, which would make most OptiX SDK examples fail because they display their OptiX output via GLUT.

In sample1 the --file option should disable the GLUT code path, which you can find inside sample1’s sources. Since that also fails because no CUDA device could be set, I would say your remoting solution has no access to the GPU at all.
(Something which also happens with Windows Remote Desktop, BTW.)

To confirm that, check what happens when running OptiX SDK sample3, which only queries the CUDA devices.
You could also run the device query example from the CUDA Toolkit samples.
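
For a quick standalone check, something along these lines should work too. This is just a sketch, assuming the CUDA runtime headers are on your include path and you link against libcudart (it is not the actual SDK sample):

#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        /* If this fails, the process cannot see any CUDA device at all. */
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", count);
    for (int i = 0; i < count; ++i) {
        struct cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            printf("Device %d: %s (compute %d.%d)\n", i, prop.name, prop.major, prop.minor);
        }
    }
    return 0;
}

If that already fails over ssh, the GPU is simply not reachable from your remote session.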

If it works when you’re logged in locally on that system, the remoting software would be the issue.
You could try VNC instead.

It could also be that the display driver is too old, but so far everything points to the remoting as the culprit.

Generally please provide the following information when reporting issues:
OS version, OS bitness, installed GPU(s), display driver version, OptiX version, CUDA version.

The remote machine does have GPUs, which I usually use for scientific computation.

— General Information for device 0 —
Name: GeForce GTX 480
Compute capability: 2.0
Clock rate: 1401000
Device copy overlap: Enabled
Kernel execution timeout : Enabled
— Memory Information for device 0 —
Total global mem: 1609760768
Total constant Mem: 65536
Max mem pitch: 2147483647
Texture Alignment: 512
— MP Information for device 0 —
Multiprocessor count: 15
Shared mem per mp: 49152
Registers per mp: 32768
Threads in warp: 32
Max threads per block: 1024
Max thread dimensions: (1024, 1024, 64)
Max grid dimensions: (65535, 65535, 65535)

— General Information for device 1 —
Name: Tesla C2070
Compute capability: 2.0
Clock rate: 1147000
Device copy overlap: Enabled
Kernel execution timeout : Disabled
— Memory Information for device 1 —
Total global mem: 6442123264
Total constant Mem: 65536
Max mem pitch: 2147483647
Texture Alignment: 512
— MP Information for device 1 —
Multiprocessor count: 14
Shared mem per mp: 49152
Registers per mp: 32768
Threads in warp: 32
Max threads per block: 1024
Max thread dimensions: (1024, 1024, 64)
Max grid dimensions: (65535, 65535, 65535)

Running sample3 gives similar information:

OptiX 3.6.0
Number of Devices = 2

Device 0: GeForce GTX 480
Compute Support: 2 0
Total Memory: 1609760768 bytes
Clock Rate: 1401000 kilohertz
Max. Threads per Block: 1024
SM Count: 15
Execution Timeout Enabled: 1
Max. HW Texture Count: 128
TCC driver enabled: 0
CUDA Device Ordinal: 0

Device 1: Tesla C2070
Compute Support: 2 0
Total Memory: 6442123264 bytes
Clock Rate: 1147000 kilohertz
Max. Threads per Block: 1024
SM Count: 14
Execution Timeout Enabled: 0
Max. HW Texture Count: 128
TCC driver enabled: 0
CUDA Device Ordinal: 1

Constructing a context…
Created with 2 device(s)
Supports 2147483647 simultaneous textures
Free memory:
Device 0: 1464328192 bytes
Device 1: 6373621760 bytes

$ uname -a
Linux smithwick 3.5.0-51-generic #76-Ubuntu SMP Thu May 15 21:19:10 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Thu_May__9_18:45:33_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0

Thanks for your advice, I’ll try VNC.

My final goal is not rendering but using OptiX to assist with particle transport. I tried disabling the image display:

/* Display image */
    /*
    if( strlen( outfile ) == 0 ) {
      RT_CHECK_ERROR( sutilDisplayBufferInGlutWindow( argv[0], buffer ) );
    } else {
      RT_CHECK_ERROR( sutilDisplayFilePPM( outfile, buffer ) );
    }
    */

But I still get errors. What should I do further to disable rendering?

Sorry, what errors exactly?

If the examples don’t work on your system setup because they use OpenGL, I would start with an empty “int main(int argc, char *argv[])” program and add OptiX calls one by one.

E.g. start with the calls used in sample3 to determine the CUDA devices. Once you have that working, the first step is to create the OptiX context, then manually select one of the CUDA devices, then delete the context again. And so forth.
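
A minimal sketch of those first steps could look like this (assuming the OptiX 3.x C API from optix.h, with error handling kept deliberately simple):

#include <optix.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    RTcontext context = 0;
    int deviceId = 0; /* arbitrary choice: first CUDA device ordinal */

    /* Step 1: create the OptiX context. */
    if (rtContextCreate(&context) != RT_SUCCESS) {
        fprintf(stderr, "rtContextCreate failed\n");
        return 1;
    }
    /* Step 2: manually select a single CUDA device. */
    if (rtContextSetDevices(context, 1, &deviceId) != RT_SUCCESS) {
        fprintf(stderr, "rtContextSetDevices failed for device %d\n", deviceId);
    }
    /* Step 3: delete the context again. */
    rtContextDestroy(context);
    printf("Done.\n");
    return 0;
}

If even this fails, nothing GLUT or sutil related is involved anymore.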

If that first step works, you should be able to build an OptiX program from the ground up that works on your system setup by following the simple examples and the tutorial. Just skip everything GLUT and Scene class related and roll your own.
I recommend using the OptiX C++ wrappers, which make error detection automatic. sample5 and sample5pp introduce that; both are the same example, first using the OptiX native C API and then the C++ convenience wrappers in optixu/optixpp_namespace.h.
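
With the C++ wrappers the same test shrinks to something like this. Again just a sketch: the wrappers throw optix::Exception on any failed call, so no RT_CHECK_ERROR macros are needed:

#include <optixu/optixpp_namespace.h>
#include <iostream>

int main()
{
    try {
        optix::Context context = optix::Context::create();
        int deviceId = 0; // arbitrary choice: first CUDA device ordinal
        context->setDevices(&deviceId, &deviceId + 1); // iterator-style range
        context->destroy();
        std::cout << "Done." << std::endl;
    } catch (const optix::Exception& e) {
        std::cerr << "OptiX error: " << e.getErrorString() << std::endl;
        return 1;
    }
    return 0;
}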

Error:

$ ./bin/sample1
freeglut (./bin/sample1): ERROR: Internal error in function fgOpenWindow
X Error of failed request: BadWindow (invalid Window parameter)
Major opcode of failed request: 4 (X_DestroyWindow)
Resource id in failed request: 0x0
Serial number of failed request: 20
Current serial number in output stream: 23

Does that mean the error comes from OpenGL?

A general question about C++ and C: will the C++ wrapper hurt performance? Performance matters for my project.

As posted before, sample3 works normally.

In sample1, disabling only the image display still gives the error shown above.

Then if I turn off use_glut,

int use_glut = 0;//1;

a new error appears:

$ ./bin/sample1
OptiX Error: Invalid context (Details: Function “RTresult _rtContextCompile(RTcontext)” caught exception: Unable to set the CUDA device., [3735714])
(/home/jlmiao/NVIDIA-OptiX-SDK-3.6.0-linux64/SDK/sample1/sample1.c:116)

Why can I get the devices in sample3 but not in sample1?

You shouldn’t be concerned about that. Use what you like; the C++ wrappers are just more convenient.
The bottleneck with GPU ray tracing is normally on the GPU or the data transfers to and from the GPU, not how you call the API.

  • sample3 basically uses rtDevice* functions, which can be called before creating an OptiX context. That’s similar to what the CUDA device-query example does (see the sketch after this list).
    It also creates an OptiX context at the end and does some queries (the free memory output), so that works.
  • sample1 is really doing some ray tracing, and the error code just indicates that cudaSetDevice() failed. I’m neither using Linux nor remoting, so I can’t really help further.
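
Stripped down, the rtDevice* query pattern from sample3 looks roughly like this (a sketch against the OptiX 3.x C API; no context is required for these calls):

#include <optix.h>
#include <stdio.h>

int main(void)
{
    unsigned int count = 0;
    if (rtDeviceGetDeviceCount(&count) != RT_SUCCESS) {
        fprintf(stderr, "rtDeviceGetDeviceCount failed\n");
        return 1;
    }
    printf("Number of Devices = %u\n", count);
    for (int i = 0; i < (int)count; ++i) {
        char name[256];
        int computeCaps[2];
        /* Device attribute queries work before any OptiX context exists. */
        rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_NAME, sizeof(name), name);
        rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY,
                             sizeof(computeCaps), computeCaps);
        printf("Device %d: %s (compute %d.%d)\n", i, name, computeCaps[0], computeCaps[1]);
    }
    return 0;
}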

I would still recommend doing the described experiments with your own standalone application, adding tiny things one at a time, e.g. recreating sample1 but not linking against GLUT, OpenGL, or the sutil libraries.

Also, the part about selecting only one(!) of the two GPUs might interest you. OptiX will pick all boards of the highest compatible streaming multiprocessor version if you do not select the devices manually with rtContextSetDevices(). Mixing different boards with compatible streaming multiprocessor versions, like in your setup, will limit OptiX to the lower of the two GPU memory configurations.

If the problem persists, a minimal reproducer with all necessary steps to reproduce the issue would be required to file a bug report.

The usual information should always contain these details:
OS version, OS bitness, installed GPUs, display driver version, OptiX version, CUDA version.

Now I’m not remoting. However, sample4 (to rule out changes I made to sample1) and sample1 give the same error.

$ ./bin/sample1
OptiX Error: Invalid context (Details: Function “RTresult _rtContextCompile(RTcontext)” caught exception: Unable to set the CUDA device., [3735714])
(/home/jlmiao/NVIDIA-OptiX-SDK-3.6.0-linux64/SDK/sample1/sample1.c:116)

$ ./bin/sample4
OptiX Error: Invalid context (Details: Function “RTresult _rtContextCompile(RTcontext)” caught exception: Unable to set the CUDA device., [3735714])
(/home/jlmiao/NVIDIA-OptiX-SDK-3.6.0-linux64/SDK/sample4/sample4.c:137)

OS version and bitness
$ uname -a
Linux smithwick 3.5.0-51-generic #76-Ubuntu SMP Thu May 15 21:19:10 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

CUDA version
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Thu_May__9_18:45:33_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0

OptiX version and installed GPUs

OptiX 3.6.0
Number of Devices = 2

Device 0: GeForce GTX 480
Compute Support: 2 0
Total Memory: 1609760768 bytes
Clock Rate: 1401000 kilohertz
Max. Threads per Block: 1024
SM Count: 15
Execution Timeout Enabled: 1
Max. HW Texture Count: 128
TCC driver enabled: 0
CUDA Device Ordinal: 0

Device 1: Tesla C2070
Compute Support: 2 0
Total Memory: 6442123264 bytes
Clock Rate: 1147000 kilohertz
Max. Threads per Block: 1024
SM Count: 14
Execution Timeout Enabled: 0
Max. HW Texture Count: 128
TCC driver enabled: 0
CUDA Device Ordinal: 1

I’m not sure I’m using rtContextSetDevices(context, count, context_devices) correctly.
What I can learn from sample3 is:

context_devices = (int*)malloc(sizeof(int)*context_device_count);
RT_CHECK_ERROR(rtContextGetDevices(context, context_devices));

So ‘context_devices’ is the list containing all the devices. How can I select one?
I think the problem is that

RT_CHECK_ERROR(rtContextCreate(&context));

creates the context with all the devices. How can I select at this stage?

I tried

cudaSetDevice(0)

before creating the context.

But I got an “undefined reference to cudaSetDevice()” error. What’s wrong?

You don’t even need to query context_devices. You should in real code, but for a simple test you can just assign the device ordinal directly:

int deviceId = 1; // Use device 1 (Your Tesla C2070)
context->setDevices( &deviceId, &deviceId+1 );

As for your linker error: cudaSetDevice() is a CUDA API call, not OptiX, so you would need to link against the CUDA runtime library to use it.

Must I use the C++ wrapper to use

“->setDevices()”?

What alternative can I use in C?

I tried the C++ version. Yes, it is more convenient to use.

But I still get the error

OptiX Error: Invalid context (Details: Function “RTresult _rtContextCompile(RTcontext)” caught exception: Unable to set the CUDA device., [3735714])

Could you help me?

Oh, no, you don’t need to use the C++ wrapper. I just prefer it…

int deviceId = 1; // or zero, depending on which device you wanted to use
rtContextSetDevices( context, 1, &deviceId );

EDIT: oops, didn’t see there was a second page.

Thanks a lot. Your example code helped me understand what the “devices” input of rtContextSetDevices() really means.

By the way, I tried sample5pp with the C++ wrapper and got the “unable to set the CUDA device” error.
Now I’m remoting into the GPU machine again. The C version gives an error like:

$ ./bin/sample1
freeglut (./bin/sample1): ERROR: Internal error in function fgOpenWindow
X Error of failed request: BadWindow (invalid Window parameter)
Major opcode of failed request: 4 (X_DestroyWindow)
Resource id in failed request: 0x0
Serial number of failed request: 20
Current serial number in output stream: 23

I’ll check again when I log in to the GPU machine directly.

The C version of “->setDevices()” is “rtContextSetDevices()”. You can find details in the documentation.

In general, the C++ wrapper is just that: a wrapper. Anything you can do with it, you can also do without it.

The error you get when remoting is an OpenGL error (from freeglut - see the “gl” in the name?). It’s probable that you don’t have access to OpenGL functionality when you remote.

Thanks for pointing out the remoting error. Any idea, then, about the “unable to set the CUDA device” error?