Acceleration + Exceptions Issues

Hello all,
First-time poster and occasional reader of this forum. I have been using OptiX for a couple of months.

I am writing because I am having some issues with acceleration structures that I don’t understand and would like some help.
Namely, I see different behavior depending on the acceleration structure used, the refit property, and whether exceptions are enabled.
By way of summary, see this image

The scene set-up:

//list of points Npt each [x,y,z]
  vert_ = context_->createBuffer(RT_BUFFER_INPUT, RT_FORMAT_FLOAT3, mesh->numVert());

  //connectivity Ntri each [ind0, ind1, ind2, ID]
  ind_ = context_->createBuffer(RT_BUFFER_INPUT, RT_FORMAT_INT4, mesh->numTri());

  //the mesh is already on device 
  //with correct types, list of 3 floats (packed), list of 4 int (packed)
  vert_->setDevicePointer(0, mesh->get_pts().ptr_);
  ind_->setDevicePointer(0, mesh->get_tri().ptr_);

  //create geometry (all primitives in one)
  geo_ = context_->createGeometry();

  /*these are the bounding box and intersection programs that come with SUTIL SDK
   (with slight modification to allow for indices being int4 instead of int3 
    and keep intersecting primitive index) */

  geoInst_ = context_->createGeometryInstance();
  /* material doesn't have anything special, it associates ray type and uses the primitive index 
    variable to look-up the triangle color in the closest_hit program */
  geoInst_->setMaterial(0, mat_);

  geoGroup_ = context_->createGeometryGroup();
  accel_ = context_->createAcceleration("Trbvh");

  //index buffer stride to account for the <ID>
  accel_->setProperty("index_buffer_stride", "4");
  accel_->setProperty("refit", "1");
  //accel_ = context_->createAcceleration("NoAccel");

All entry points are associated with an exception program in the form of:

RT_PROGRAM void exception()
{
  //report the exception code and mark the offending pixel
  const unsigned int code = rtGetExceptionCode();
  rtPrintf("Caught exception 0x%X at launch index (%d,%d)\n", code, launch_index.x, launch_index.y);
  output_buffer[launch_index] = make_color(OPTIX_BAD_COLOR); //BAD_COLOR = RED
}

The test:
I have been using this code for a couple of months now, with TRBVH acceleration, refit on, and exceptions turned off (mostly ‘release’ builds). The code seems to be stable, with no undefined behavior or crashes [top left image in summary image]. I recently switched to a ‘debug’ build (which enables ALL exceptions and printing) and noticed that I had no output from the raytracer [bottom left image in the summary]. There were no crashes, and the exception program produced no visible BAD_COLOR pixels or printouts.

I tried a couple of other configurations to figure out what was causing the issue, but I have not been able to narrow down the problem.

The fact that NoAccel produces the correct results makes me think that the issue is somehow with the acceleration builds. I do have a stride in the index buffer, but I think it’s being handled correctly.

I find it odd that enabling or disabling exceptions changes the behavior. While using TRBVH with refit on, I tried turning on each exception type individually; all but RT_EXCEPTION_BUFFER_INDEX_OUT_OF_BOUNDS produced the expected renders. I have not been able to figure out if (or where) I am indexing out of bounds, but it’s odd that there is no output (pixels or printouts) from the exception program.

Any ideas what could be causing this problem? Or how to go about troubleshooting it?

The configuration: Windows 7 x64, VS2015, CUDA 8, OptiX 4.1.1, GTX 1080 Ti

You’re on the right track, but the last assumption is wrong.

With this format of the index_buffer
ind_ = context_->createBuffer(RT_BUFFER_INPUT, RT_FORMAT_INT4, mesh->numTri());
this is the incorrect stride
accel_->setProperty("index_buffer_stride", "4");
It should be
accel_->setProperty("index_buffer_stride", "16");

I think the description inside the OptiX Programming Guide Table 4 Acceleration Structure Properties is misleading there.

Mind that strides are in bytes. The index_buffer format is assumed to be int3, which means a stride of 12 bytes between consecutive int3 triangle indices. The default value of 0, which means tightly packed int3 values, effectively uses 12 bytes as the stride.
Values smaller than 12 are not possible in these acceleration properties, so with your value of 4 OptiX was using 12 as the stride internally, and only every 4th triangle in your structure came out correctly. Depending on the contents of the ID field in the .w component, the others were more or less undefined, including possible out-of-bounds accesses in the vertex buffer, which explains the exceptions.

Since you’re using int4 (as do my own renderers for exactly the same purpose) the stride in bytes between two triangle indices should be 16 in that acceleration property.

That behaves similarly to interleaved vertex attributes in an OpenGL vertex array.
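
To make the stride arithmetic concrete, here is a small CPU-side sketch in plain C++ (no OptiX involved). It only mimics how a builder that assumes int3 data would step through an int4 buffer; the names Int3, Int4, readTriangle and makeIndices are made up for this illustration and are not OptiX API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout matching the thread: three vertex indices plus an ID in .w.
struct Int4 { std::int32_t x, y, z, w; };
struct Int3 { std::int32_t x, y, z; };

// Read the three indices of "triangle" i, advancing strideBytes per triangle.
// This imitates a reader that expects int3 triangles at a fixed byte stride.
Int3 readTriangle(const std::vector<Int4>& buf, std::size_t i, std::size_t strideBytes)
{
    const std::size_t offset = i * strideBytes;
    // Guard against reading past the end of the buffer.
    assert(offset + sizeof(Int3) <= buf.size() * sizeof(Int4));
    Int3 tri;
    std::memcpy(&tri,
                reinterpret_cast<const unsigned char*>(buf.data()) + offset,
                sizeof(Int3));
    return tri;
}

// Two triangles over a 4-vertex mesh, with IDs 100 and 101 stored in .w.
std::vector<Int4> makeIndices()
{
    return { {0, 1, 2, 100}, {2, 1, 3, 101} };
}
```

With a 16-byte stride, readTriangle(makeIndices(), 1, 16) yields the second triangle (2, 1, 3). With a 12-byte stride the same call starts one int too early and yields (100, 2, 1), so the ID 100 is treated as a vertex index, which is out of range for a 4-vertex mesh.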

In hindsight these default buffer names shouldn’t exist, because using the default buffer names vertex_buffer or index_buffer for non-triangle primitives is asking for trouble.
I normally name my attribute and index buffers differently to be absolutely sure that no special cases of the faster acceleration structure builders kick in, unless I want them to, and only for those I explicitly set the names and strides in the acceleration structure properties.

Optimization tip: If your root traversal object is the same for all rtTrace() calls, there is no need for separate object variables on the device side. That means with top_object == top_shadower the latter is redundant.

(Note that there is no Titan 1080 Ti. I assume you meant GeForce GTX 1080 Ti.)

Thank you, Detlef.

Your explanation makes sense. I misinterpreted the 0-byte stride in the programming guide when using int3, and assumed that for int4 the stride would just be the “extra” 4-byte offset. I am still wondering why it “worked” in the first case to begin with.

Using a 16 byte stride does indeed fix the middle column of the summary image.
However, I am still getting no OptiX output when refit is on and exceptions are enabled.

I tried a couple of things (all with correct stride, and exceptions enabled):

  • Using Sbvh or NoAccel produces the expected output.
  • Using Trbvh with refit off produces the expected output.
  • Using Trbvh with refit on produces no output.
  • Initializing Trbvh with refit off (correct output) and then toggling refit to on stops producing output (subsequent toggles, e.g. back to off/on, continue to produce no output).

Could this be an indicator that something else is misspecified?

I figured I would try to reproduce this issue with the SDK examples and see if it happened there. I went to the optixMeshViewer example and added the following after line 196 of optixMeshViewer.cpp

context->setExceptionEnabled(RT_EXCEPTION_ALL, true);
geometry_group->getAcceleration()->setProperty("refit", "1");

The result is the same as with my code.
Individually the settings work as expected, but combined they produce no output.

[Thanks for catching the incorrect card and the optimization tip, one less variable!]


Ok, thanks a lot for testing that in an SDK example.
That’s easier to work with than an OptiX API Capture (OAC).
I’ll file a bug report with that information for the OptiX team to investigate.

Thanks for all the help. :)


“Smaller values than 12 are not possible in these acceleration properties”

This sounds very sad. I am using OptiX for a game engine which uses short3 as its index buffer format. It seems that, because of this limitation, I will have to copy all indices instead of just sharing them.

It would be great if short3 could be supported. Any chance it could happen?


BTW, I tried setting index_buffer_stride to 6 for TRBVH. At run time it failed with:
caught exception: Invalid index buffer size. Possible reasons: incorrect primitive count or stride too large.

You can use whatever indexing and format you like in OptiX!
You provide the bounding box and intersection programs which decode index and attribute data yourself.

It’s just that you cannot use the specialized builders for triangles if your data formats are different. That’s all.

That means: simply name your vertex and index buffers differently than the special names expected by the acceleration properties. Don’t name them “vertex_buffer” and “index_buffer”, and don’t call the rtAccelerationSetProperty functions for any of the names and strides of these. OptiX will then build the BVHs by calling into your provided bounding box program. Simple as that.
This is required for any primitive types which are not triangles as well.
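
As a rough illustration of what such a bounding box program has to compute for 16-bit indices, here is a CPU-side sketch in plain C++. It is not OptiX device code, and the names Float3, Aabb and triangleBounds are hypothetical; an actual bounding box program would read from your own buffers and write the result in the form OptiX expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the device-side vector and box types.
struct Float3 { float x, y, z; };
struct Aabb   { Float3 lo, hi; };

// Compute the axis-aligned bounds of triangle primIdx, decoding
// tightly packed unsigned 16-bit indices (three per triangle, 6 bytes).
Aabb triangleBounds(const std::vector<Float3>& verts,
                    const std::vector<std::uint16_t>& indices,
                    std::size_t primIdx)
{
    assert(3 * primIdx + 2 < indices.size());
    const Float3& a = verts[indices[3 * primIdx + 0]];
    const Float3& b = verts[indices[3 * primIdx + 1]];
    const Float3& c = verts[indices[3 * primIdx + 2]];

    Aabb box;
    box.lo = { std::min({a.x, b.x, c.x}),
               std::min({a.y, b.y, c.y}),
               std::min({a.z, b.z, c.z}) };
    box.hi = { std::max({a.x, b.x, c.x}),
               std::max({a.y, b.y, c.y}),
               std::max({a.z, b.z, c.z}) };
    return box;
}
```

Because the decoding lives entirely in your own code, the index format is your choice, and the engine's index data can be shared rather than copied into an int3 layout.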

(And I hope that game engine uses at least unsigned short indices, because signed shorts would be a waste of range.)

Thank you very much for your help, Detlef.

Yes, it uses unsigned short3, not short3. I was just referring to their 6-byte size. Sorry for the confusion, and thank you for the correction :)