Undefined behavior in device code

I cannot access those pastebin links; they fail with “This site cannot be reached.” errors. They might be blocked by our IT.
Please attach the files (e.g. as *.zip) directly to the forum itself with the “Upload” button inside the editor toolbar (the icon with the up arrow).

I cannot upgrade the CUDA version.

Although OptiX 7.3.0 is built with CUDA 11.1, the programming interface is PTX source, so any PTX that OptiX can parse will work. It’s just that OptiX sometimes cannot parse unexpected PTX code constructs emitted by newer compilers.
Since the OptiX 7 core implementation lives inside the display driver, newer driver versions might have solved that. (Not sure, I’m still using CUDA 11.1.) CUDA 11.4 is already available as well, and the CUDA driver in 471.41 already supports that version.

You could give it a try, but it can also be that the input PTX code doesn’t have much to do with that. The input PTX code gets translated and rewritten into the actual kernel, and the CUDA driver then translates and optimizes that kernel into the final microcode. That means there are multiple compilation and code generation steps involved, and any of them could be responsible for what you’ve hit.

You can have arbitrarily many CUDA Toolkits installed on your development machine and select one with the CUDA_PATH environment variable. Do not install the display drivers from the CUDA toolkits. Recently explained here:
https://forums.developer.nvidia.com/t/invalidaddressspace-when-using-pointer-from-continuation-callable-parameters/184951/4
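
To illustrate the CUDA_PATH selection, here is a minimal sketch of how a build helper could pick the nvcc of the currently selected toolkit. The function names cudaRootFromEnv and nvccPath are made up for this example, and the bin/nvcc layout below the toolkit root is an assumption about a standard toolkit installation (on Windows the executable is nvcc.exe).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the toolkit root stored in CUDA_PATH,
// or an empty string if the variable is not set.
std::string cudaRootFromEnv()
{
    const char* root = std::getenv("CUDA_PATH");
    return root ? std::string(root) : std::string();
}

// Hypothetical helper: builds the nvcc path below a toolkit root.
// Assumes the usual <root>/bin/nvcc layout; falls back to plain "nvcc"
// (resolved through PATH) when no root is given.
std::string nvccPath(std::string root)
{
    // Strip trailing path separators so the result has no doubled slash.
    while (!root.empty() && (root.back() == '/' || root.back() == '\\'))
        root.pop_back();

    if (root.empty())
        return "nvcc";

    return root + "/bin/nvcc";
}
```

Calling nvccPath(cudaRootFromEnv()) in a build script would then compile the device code to PTX with whichever toolkit CUDA_PATH points to, so switching between for example CUDA 11.1 and 11.4 only means changing that one variable.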

There is no optixReportIntersection in the original intersection program. I just directly perform calculations in the intersection program. That pipeline has no any- or closest-hit programs.

Ok, I didn’t realize that. I was expecting more standard behavior.