Compile error in debug

Hi,

I’m working with OptiX 7.1 and ran into an issue in the OptiX JIT compiler when it is given PTX built with nvcc debug info. The problem occurred when trying to use Nsight Visual Studio Edition to debug my OptiX code. I used the advised nvcc flag “-G” to generate my PTX (see the post Debugging Optix 7.1 with Nsight), and then set the module compile options optLevel and debugLevel to OPTIX_COMPILE_OPTIMIZATION_LEVEL_0 and OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO respectively. When doing so, though, the function optixModuleCreateFromPTX fails with the following error message:

COMPILE ERROR: Malformed PTX input. See compile details for more information. Error: Found an indirect function call in function : path/to/my/cuda/file.cu:line#:char#

This error points to a variable declaration of type MyStruct structs_buffer[10], where MyStruct is defined as follows:

struct MyStruct {
    __device__ MyStruct() {}
    float field0, field1;
};

If, on the other hand, I declare a single variable of type MyStruct, the JIT compiler doesn’t complain. As mentioned, this only happens with the -G flag, no matter what the OptiX compile options are. After digging into the PTX, I noticed that the code generated by nvcc with the -G flag does indeed call the default constructor of MyStruct indirectly for the array, whereas for a single variable a simple direct call is made (which is the obvious choice I would have expected). Here is the relevant PTX code for the indirect call case:

// ... code before
mov.u64 	%rd10, _ZN9Imath_2_46Color3IfEC1Ev;
mov.u64 	%rd11, %rd10;
st.u64 	[%SP+0], %rd11;
// ... later
ld.u64 	%rd16, [%SP+0];
.param .b64 param0;
st.param.b64	[param0+0], %rd18;
prototype_210 : .callprototype ()_ (.param .b64 _) ;
call 
%rd16, 
(
param0
)
, prototype_210;

So my first question is, why on earth does nvcc generate such crazy code when compiling with debug info? I understand that it doesn’t do any optimization when using the -G flag, but it seems quite crazy to generate such code in the first place. It might not be an issue when debugging plain CUDA, but since OptiX doesn’t support indirect calls, it is quite problematic.

And my other question would be how to fix this on my side. I can’t step through my program without the -G flag, but maybe there is another nvcc flag that would prevent indirect calls from being generated?

My current workaround is to prevent the default constructor from being called on the array by declaring raw storage instead:

char plain_buffer[10 * sizeof(MyStruct)];
MyStruct* structs_buffer = (MyStruct*)plain_buffer;

But you’ll admit that this is quite ugly and error-prone…
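
For reference, here is a minimal sketch of two somewhat tidier variants of that workaround. It rests on assumptions I have not verified against the PTX that nvcc -G actually emits: that the indirect call only appears because the user-provided constructor has to be run for each array element, and that a defaulted (trivial) constructor therefore leaves nothing to call. The names MyStructTrivial, kCount and sum_fields are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Let the sketch build with a plain host compiler as well as with nvcc.
#ifndef __CUDACC__
#define __device__
#endif

// Variant A (assumption: acceptable to change the struct).
// A defaulted constructor makes the type trivially default-constructible,
// so declaring an array of it requires no constructor code at all.
struct MyStructTrivial {
    MyStructTrivial() = default;
    float field0, field1;
};
static_assert(std::is_trivially_default_constructible<MyStructTrivial>::value,
              "array declarations should need no constructor calls");

// Variant B: the struct from the post, left unchanged.
struct MyStruct {
    __device__ MyStruct() {}
    float field0, field1;
};

constexpr int kCount = 10;  // hypothetical array size, matching the post

__device__ inline float sum_fields() {
    // Variant A: a plain array; elements are uninitialized until written.
    MyStructTrivial a[kCount];

    // Variant B: raw storage as in the post, but with the alignment of
    // MyStruct requested explicitly so the cast pointer is correctly aligned.
    alignas(MyStruct) unsigned char raw[kCount * sizeof(MyStruct)];
    MyStruct* b = reinterpret_cast<MyStruct*>(raw);

    float total = 0.0f;
    for (int i = 0; i < kCount; ++i) {
        // Every field is written before it is read.
        a[i].field0 = 1.0f;
        a[i].field1 = 2.0f;
        b[i].field0 = 3.0f;
        b[i].field1 = 4.0f;
        total += a[i].field0 + a[i].field1 + b[i].field0 + b[i].field1;
    }
    return total;
}
```

Variant A is the less fragile of the two if the constructor body is empty anyway. Variant B keeps the struct as it is but still skips construction, so every field has to be written before it is read.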

I hope this can be fixed and that inspecting locals will soon be available in OptiX, but it is already great to be able to step through the program. :)

Cheers,
Benoit R.

OptiX still doesn’t handle PTX files compiled with the NVCC device code debug options --device-debug or -G.
That is being worked on. Please compile all OptiX PTX targets without these NVCC debug options.

For Nsight Compute profiling, use OPTIX_COMPILE_OPTIMIZATION_LEVEL_3 and OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO.

For Nsight Visual Studio Edition or cuda-gdb debugging use OPTIX_COMPILE_OPTIMIZATION_LEVEL_0 and OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO.

The OptiX input PTX source is not really representative of what the final SASS microcode is actually doing.
OptiX contains a PTX parser and compiler and effectively rewrites the PTX input. After that there is the CUDA PTX assembler and microcode generator inside the CUDA driver.
The line info works well enough to see which SASS code comes from which CUDA line inside the Nsight Compute profiler.
As said, better debugging support for OptiX 7 is being worked on.

BTW, which CUDA Toolkit version are you using?

Generally, the newer the OptiX SDK version and display driver, the better. Maybe upgrade to OptiX 7.2.0, which supports CUDA 11.1 and requires display driver 456.71 under Windows or 455.28 under Linux.

I’m not able to step through the program when just using the OptiX compile flags. I’m currently using CUDA 10.1. I will see if upgrading to OptiX 7.2 helps. Thanks for the quick reply.

No, it won’t. I only mentioned it because your complaint is about the compiled PTX code, which is purely CUDA toolkit related. With OptiX 7.2.0 you’d be on the safe side with the newest available CUDA 11.1 version, which might generate different code.
As said, full CUDA debugging functionality has never been supported in OptiX and is still being worked on.