Thanks! I’ve just sent you the project link.
It looks like whatever I try, even when it fixes the visual errors on 10/20/30/40-series, still causes the “illegal instruction” crash on 50-series, no matter the input data. The only exception is the loop unrolling. Kinda out of ideas. I tried replacing optixTrace with optixTraverse plus OPTIX_RAY_FLAG_DISABLE_ANYHIT | OPTIX_RAY_FLAG_DISABLE_CLOSESTHIT | OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT, but the 5080 still hits the illegal instruction. I’ll try OptiX-IR next.
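Roughly what the optixTraverse replacement looked like, as a minimal sketch; Params/params and the SBT indices are placeholders for whatever the real pipeline uses:

```cpp
#include <optix.h>

// Placeholder launch params; the real ones carry much more.
struct Params { OptixTraversableHandle handle; };
extern "C" { __constant__ Params params; }

// Occlusion-style query: optixTraverse walks the BVH, but with closest-hit
// disabled and terminate-on-first-hit set, no program is invoked; the result
// is read back from the hit object instead.
static __forceinline__ __device__ bool hitAnything(float3 o, float3 d, float tmax)
{
    optixTraverse(params.handle, o, d,
                  0.0f, tmax, 0.0f,             // tmin, tmax, rayTime
                  OptixVisibilityMask(255),
                  OPTIX_RAY_FLAG_DISABLE_ANYHIT |
                  OPTIX_RAY_FLAG_DISABLE_CLOSESTHIT |
                  OPTIX_RAY_FLAG_TERMINATE_ON_FIRST_HIT,
                  0, 1, 0);                     // SBT offset, SBT stride, miss index
    return optixHitObjectIsHit();
}
```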
Validation mode on a 5080 throws this (and nothing on older GPUs):
[2][ERROR]: Error syncing stream (CUDA error string: an illegal instruction was encountered, CUDA error code: 715)
Error recording resource event on user stream (CUDA error string: an illegal instruction was encountered, CUDA error code: 715)
Error recording resource event on user stream (CUDA error string: an illegal instruction was encountered, CUDA error code: 715)
Error launching work to RTX
UPDATE: Compiled to OptiX-IR, and it works fine on my 4070Ti. However, on a 5080 it now fails when trying to create the module:
[2][COMPILER]: COMPILE ERROR: Module compilation failed
Info: Module Statistics
payload values : 1
attribute values : 0
Info: Properties for entry function "__raygen__oxMain"
semantic type : RAYGEN
trace call(s) : 3
continuation callable call(s): 0
basic block(s) : 151
instruction(s) : 1766
Info: Compiled Module Summary
non-entry function(s): 0
basic block(s) : 0
instruction(s) : 0
I guess it’s the same “illegal instruction”, just caught earlier.
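For context, the module creation path that fails looks roughly like this on my side; error checking is stripped and the option structs are reduced to placeholders (numPayloadValues matches the stats above):

```cpp
#include <optix.h>
#include <optix_stubs.h>
#include <vector>

OptixModule createModule(OptixDeviceContext context, const std::vector<char>& irData)
{
    OptixModuleCompileOptions   mco = {};   // defaults; real code sets opt/debug levels
    OptixPipelineCompileOptions pco = {};
    pco.numPayloadValues = 1;               // matches "payload values : 1" above

    char   log[2048];
    size_t logSize = sizeof(log);
    OptixModule module = nullptr;
    // Same entry point accepts PTX or OptiX-IR bytes since OptiX 7.7.
    optixModuleCreate(context, &mco, &pco,
                      irData.data(), irData.size(),
                      log, &logSize, &module);
    return module;
}
```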
Compiling to OptiX-IR with --use_fast_math -Wno-deprecated-gpu-targets again causes “an illegal instruction was encountered”.
Update 2: OK, I think I found the culprit. In OptiX 6 there was rtTextureSamplerGetId, which returned an int. That int, when printed out, also looked rather small, like a context-specific texture number (0, 1, 2, 3…). Because the numbers were usually in such a small range, I did an evil thing: I cast the id to a float in my light structure, just so I could reuse the same field for light types where it actually needs to be a float, and cast it back to int when it should be a texture ID… I should have at least reinterpreted the bits, but I didn’t, because I was sure it couldn’t be a problem in that value range.
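Roughly, the old hack looked like this (names made up for illustration):

```cpp
#include <optix.h>   // OptiX 6.x host API

struct Light {
    float texIdOrParam;   // texture id for textured lights, a plain float otherwise
};

void storeTextureId(RTtexturesampler sampler, Light& light)
{
    int id = 0;
    rtTextureSamplerGetId(sampler, &id);          // small per-context number: 0, 1, 2, ...
    light.texIdOrParam = static_cast<float>(id);  // value cast, not a bit reinterpretation
    // Device code later undid it: int tex = (int)light.texIdOrParam;
}
```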
After moving to OptiX 9 I replaced most of these ints with cudaTextureObject_t, which is 64-bit. Except in this structure, because it was so nice and tight, and funnily enough the texture values returned from cudaCreateTextureObject were also the same tiny numbers! I figured GetId had been returning the same thing, and that there was maybe, probably, possibly no good reason for it to really be 64-bit…
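For the record, the trap in isolation: a float represents integers exactly only up to 2^24, so the float round trip is lossless for tiny handles but silently mangles anything pointer-like. The big value below is made up:

```cpp
#include <cstdio>
#include <cuda_runtime.h>   // cudaTextureObject_t (an unsigned long long)

int main()
{
    cudaTextureObject_t tinyHandle = 3;                  // old-style small id
    cudaTextureObject_t bigHandle  = 0x7f4300010003ULL;  // hypothetical pointer-like handle

    // Round trip through float, the way my light struct effectively did it:
    std::printf("%llu -> %llu\n", tinyHandle,
                (cudaTextureObject_t)(float)tinyHandle); // 3 -> 3, fine
    std::printf("%llu -> %llu\n", bigHandle,
                (cudaTextureObject_t)(float)bigHandle);  // low bits silently lost
    return 0;
}
```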
And apparently my “maybe” somehow held up on everything until the 50-series. Is the driver silently replacing CPU-side cudaTextureObject_t values with real pointers now? Or has the float-to-u64 conversion changed? I don’t know, and in fact just replacing the float with an int stopped the 5080 crashes. But I’m adding a proper cudaTextureObject_t to the structure now, not taking chances anymore.
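The fixed structure, more or less (field names illustrative):

```cpp
#include <cuda_runtime.h>

// No more dual-purpose field: the 64-bit handle is stored as-is,
// and the float parameter lives in its own slot.
struct Light {
    cudaTextureObject_t tex;    // full 64-bit texture object
    float               param;  // whatever the float-typed lights need
};
```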
It still puzzles me that it worked on a 5080 whenever I either removed all the optixTrace calls or unrolled the loop.