Assertion failed: "instr->modifiers.CC == 0"

I just installed OptiX 3.5.1 on several machines. On the Quadro and Tesla (both still using CUDA 5.0) it works fine out of the box. However, I also installed it on a machine that already had CUDA 5.5 installed, and I get the following error:

Starting OptiX 3051 on 1 devices:
Device 0: GeForce GTX 260 with 24 multiprocessors, 512 threads per block, 1363000 kHz, 939524096 bytes global memory, 128 hardware textures, compute capability 1.3.

OptiX Error: Unknown error (Details: Function "_rtContextCompile" caught exception: Assertion failed: "instr->modifiers.CC == 0", [1312613])

Any ideas how to solve this issue? I’ve already tried installing the latest graphics drivers.

I’m getting the same error with a GTX 670. OptiX 3.5.1 seems to work fine with CUDA 5.0, but with 5.5 the error occurs…

Where does it happen? A custom application or on one of the SDK samples?

A custom application. Was something maybe deprecated?

Using --use_fast_math as an argument to nvcc usually fixes this issue.

Please try that and report if it doesn’t fix your issue.

Yes, I recently added a new project to my VS solution and it didn’t have --use_fast_math. But adding it didn’t fix the issue…

CUDA 5.5 and OptiX 3.5.1 both support the vs110 toolkit, right?

Yes, vs110 (VS2012) is supported by OptiX 3.5.1 and also by CUDA 5.5. It must be something else. So you have recompiled all your files with --use_fast_math and that didn’t work?

Yes, I rebuilt the entire solution and it didn’t help. Also, the project that didn’t have fast math doesn’t even have any CUDA files to compile.

I’ll try to explain my situation thoroughly:

I have CUDA 5.5, CUDA 5.0, OptiX 3.0.1, OptiX 3.5.1, vs100 and vs110 currently installed.

CUDA 5.0 + OptiX 3.5.1 + vs100 works great
CUDA 5.5 + OptiX 3.5.1 + vs110 throws an exception (Assertion failed: "instr->modifiers.CC == 0") on context launch

Also, all OptiX 3.5.1 samples work great.

I don’t follow. Do you have PTX files to be used with OptiX?

What I am building is a library on top of OptiX, and the library project handles and compiles all the CUDA code.
What I was referring to was a tester project for the library; it didn’t have fast math.
But that’s irrelevant, now everything has the --use_fast_math flag :)

Time for some debugging :)

Can you please figure out which instruction or part of the code in your application is causing that exception? If that doesn’t help, contact optix-help@nvidia.com and we’ll set up a trace recording to do the debugging.

accel->setProperty("index_buffer_name", "index_buffer");
accel->setProperty("vertex_buffer_name", "vertex_buffer");
accel->markDirty();
geometrygroup->setAcceleration(accel);
ctx->launch(0, 0, 0);

The call ctx->launch(0, 0, 0); throws the exception.
This is from the part of the code that compiles the scene graph.
The snippet is for accel caching (if I turn off accel caching, the exception is thrown from ctx->compile() instead).

I am using 32-bit OptiX and 64-bit CUDA 5.5. Could this be a problem? If so, why don’t the OptiX samples mind, and why does the project compile (I’m dealing with a runtime exception)?
Also, I tried installing 32-bit CUDA, but the installer won’t proceed because my machine is 64-bit, which I find very strange…

We’ll see what optix-help has to say :D

I should have been clearer: launching or compiling a context will obviously trigger the exception, since that is what starts the compilation process. What I meant was: apart from the context launch and compile calls, can you figure out which part of your code is causing the problem?

I believe the OptiX samples are 64-bit by default. Can you try compiling your project in 64-bit mode? That would be another way to narrow down the possible causes.

I couldn’t find the exact instruction, but I tried to compile a bare-bones context (the minimal context that can compile) and it gave the same error.

Also, if I try to install 32-bit CUDA 5.5, the installer says the following:

32 bit cuda toolkit cant be installed on a non-32-bit computer.
This nvidia graphics driver is not compatible with this version of windows

Problem solved. I’m writing this down for others to read: check not only that the “use fast math” option is set in the VS options, but also that --use_fast_math is actually being passed on the nvcc command line.

Adding --use_fast_math fixed the compilation error for me. I would like to know how I can use IEEE division and square roots with OptiX in CUDA 5.5, though.

Hi,

I have a similar problem.
How can I check whether --use_fast_math is actually issued in a VS project after setting the “use fast math” option?

The command line generated by VS is as follows:

(Approximate command-line. Settings inherited from host are not visible below.)

(Please see the output window after a build for the full command-line)

Driver API (NVCC Compilation Type is .cubin, .gpu, or .ptx)

set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\bin\nvcc.exe" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\x86_amd64" --keep-dir x64\Release -maxrregcount=0 --machine 64 -ptx -cudart static -o path_tracer.dir\Release\path_tracer.cu.ptx "D:\swl\Project\code\Optix\path_tracer\path_tracer.cu"

Runtime API (NVCC Compilation Type is hybrid object or .c file)

set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\bin\nvcc.exe" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\x86_amd64" --keep-dir x64\Release -maxrregcount=0 --machine 64 -ptx -cudart static -use_fast_math -Xcompiler "/EHsc /nologo /Zi " -o path_tracer.dir\Release\path_tracer.cu.ptx "D:\swl\Project\code\Optix\path_tracer\path_tracer.cu"

It seems that the --use_fast_math flag does not apply to generating ptx files even if I set it in VS options.
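For anyone who wants to automate this check, here is a rough sketch in Python. It assumes you have copied the nvcc lines from the VS build output window into a string. The function name `ptx_commands_missing_fast_math` and the shortened sample log are made up for illustration and are not part of any NVIDIA tool. It only looks for the two spellings of the flag seen in this thread (`-use_fast_math` and `--use_fast_math`) on lines that invoke nvcc with `-ptx`.

```python
import re

# Match the flag only as a standalone argument, not as part of a path.
FAST_MATH = re.compile(r"(?<!\S)--?use_fast_math(?!\S)")
PTX = re.compile(r"(?<!\S)--?ptx(?!\S)")


def ptx_commands_missing_fast_math(build_log):
    """Return nvcc command lines that emit PTX but lack the fast-math flag."""
    missing = []
    for line in build_log.splitlines():
        if "nvcc" not in line.lower():
            continue  # not an nvcc invocation
        if PTX.search(line) and not FAST_MATH.search(line):
            missing.append(line.strip())
    return missing


# Hypothetical, shortened versions of the two command lines quoted above.
sample_log = "\n".join([
    r"nvcc.exe --machine 64 -ptx -cudart static -o path_tracer.cu.ptx path_tracer.cu",
    r"nvcc.exe --machine 64 -ptx -cudart static -use_fast_math -o path_tracer.cu.ptx path_tracer.cu",
])

flagged = ptx_commands_missing_fast_math(sample_log)
for command in flagged:
    print("missing fast math:", command)
```

With the sample above, only the first command should be reported, which matches what I see in the "Driver API" command line. The real command line is the one in the output window after a build, so that is the text to paste in.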

I use a CMake file (adapted from one of the files in the SDK) and added this section to make sure the flag is always passed to nvcc:

if(NOT PASSED_FIRST_CONFIGURE)
  set(flag "--use_fast_math")
  list(FIND CUDA_NVCC_FLAGS ${flag} index)
  if(index EQUAL -1)
    list(APPEND CUDA_NVCC_FLAGS ${flag})
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} CACHE LIST "Semi-colon delimit multiple arguments." FORCE)
  endif()
endif(NOT PASSED_FIRST_CONFIGURE)