The very first thing I recommend is to run with the environment variable CUDA_LAUNCH_BLOCKING set to 1. This will implicate or rule out async issues. If the error goes away, then everything is probably set up correctly and you just need to figure out where to synchronize.
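If setting the variable in your shell or IDE is awkward, one option is to set it from inside the program before the first CUDA call. This is only a sketch: the helper names are made up for illustration, and it assumes the CUDA runtime reads the variable when it initializes, so the call has to happen before any CUDA or OptiX initialization.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: sets CUDA_LAUNCH_BLOCKING=1 for the current process.
// Assumption: this runs before anything initializes CUDA; setting the
// variable after CUDA has started is not expected to have any effect.
inline bool enable_blocking_launches() {
#ifdef _WIN32
    return _putenv_s("CUDA_LAUNCH_BLOCKING", "1") == 0;
#else
    return setenv("CUDA_LAUNCH_BLOCKING", "1", /*overwrite=*/1) == 0;
#endif
}

// Hypothetical helper: reports whether the variable is currently set to "1".
inline bool blocking_launches_enabled() {
    const char* value = std::getenv("CUDA_LAUNCH_BLOCKING");
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Intended usage, at the very top of main():
//     enable_blocking_launches();
//     ... then create the CUDA/OptiX context as usual ...
```

Setting it in the environment that launches the app is still the simplest route; this just avoids depending on how the app gets started.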
Next I’d suggest putting a cudaDeviceSynchronize() call and error check before the optixLaunch to make sure the problem actually occurs during the launch and wasn’t carried over from before the launch.
The next thing to try is turning on OptiX validation mode to see if it reports anything. https://raytracing-docs.nvidia.com/optix7/guide/index.html#context#validation-mode
If you’re on Windows, you could try compiling everything in debug with optimizations disabled, and launch the Nsight debugger from Visual Studio (Nsight Visual Studio Edition). We are working on the debugging features, and they’re not expected to work seamlessly right now, but you might be able to catch the approximate location of the illegal memory access, so it’s worth a shot.
I don’t know if gdt::vector and glm::vector are the same size in memory, but if so, then the vector type might be a red herring. Are they both equivalent to float3?
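One quick way to answer the layout question is a compile-time check. The sketch below uses two stand-in structs (Float3Like and PaddedVec3 are hypothetical, not your real types) and a made-up helper has_float3_layout; the idea is to swap in your actual vector types. It relies on float3 being three packed floats with float alignment.

```cpp
#include <type_traits>

// Stand-in types for illustration only; replace with your real vector types.
struct Float3Like { float x, y, z; };        // 12 bytes, same layout as float3
struct PaddedVec3 { float x, y, z, pad; };   // 16 bytes, NOT float3-compatible

// Hypothetical helper: true if T has the same size and alignment as three
// packed floats and can be copied bytewise into a device buffer.
template <typename T>
constexpr bool has_float3_layout() {
    return sizeof(T) == 3 * sizeof(float)
        && alignof(T) == alignof(float)
        && std::is_trivially_copyable<T>::value;
}

// The build fails here if a type you expected to match float3 does not.
static_assert(has_float3_layout<Float3Like>(), "expected a float3-like layout");
static_assert(!has_float3_layout<PaddedVec3>(), "padded type should not match");
```

If both of your types pass the check, the vector type is probably not the culprit and you can move on to the other suspects.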
If the debugger doesn’t yield any useful info, the next thing to do is isolate the cause. Common causes for your setup might include:
- a misconfigured Shader Binding Table
- a shader reading/writing memory out of bounds
- an OpenGL interop problem
- a misconfigured Acceleration Structure
- a stack overflow
Isolate the cause by disabling OptiX features in your renderer systematically until it runs without the error. (Or if it makes more sense, disable all features and re-enable them systematically until you hit the error.) For example, if you have multiple hitgroups, see if disabling one or more fixes the error. In that case the issue may be in the SBT or a shader program. To test shaders, comment out the trace call(s) in your raygen program and see if the error still occurs (if it goes away, the issue is likely in miss, closest-hit, or any-hit; if it persists, look at raygen or the setup before the launch). Unplug things until you figure out which one is the problem. You can rule out a shader's code by putting a return statement at the top, and bisect it by moving the return statement around.
See if you can trigger the error with smaller launch dimensions. It is ideal if you can reproduce when using a 1x1 pixel launch. It may be effective to use printf() once your launch size is small enough that the amount of printf output is manageable.
See how far you get isolating, and if you get stuck we can toss out a few more ideas about how to dig further. I hope that helps!