Hi,
I’m working with OptiX 7.1 and ran into an issue in the OptiX JIT compiler when it is fed PTX that nvcc generated with debug info. The problem occurred while trying to use Nsight Visual Studio Edition to debug my OptiX code. I used the advised nvcc flag “-G” to generate my PTX (see post Debugging Optix 7.1 with Nsight), and then set the module compile options optLevel and debugLevel to OPTIX_COMPILE_OPTIMIZATION_LEVEL_0 and OPTIX_COMPILE_DEBUG_LEVEL_LINEINFO respectively. When I do so, though, the function optixModuleCreateFromPTX fails with the following error message:
COMPILE ERROR: Malformed PTX input. See compile details for more information. Error: Found an indirect function call in function : path/to/my/cuda/file.cu:line#:char#
This error points to the array declaration MyStruct structs_buffer[10], where MyStruct is defined as follows:
struct MyStruct {
    __device__ MyStruct() {}
    float field0, field1;
};
If, on the other hand, I declare a single variable of type MyStruct, the JIT compiler doesn’t complain. As mentioned, this only happens with the -G flag, no matter what the OptiX compile options are. After digging into the PTX, I noticed that the code nvcc generates with the -G flag does indeed call the default constructor of MyStruct indirectly for the array, whereas for a single variable it emits a simple direct call (which is the obvious choice I would have expected). Here is the relevant PTX for the indirect-call case:
// ... code before
mov.u64 %rd10, _ZN9Imath_2_46Color3IfEC1Ev;
mov.u64 %rd11, %rd10;
st.u64 [%SP+0], %rd11;
// ... later
ld.u64 %rd16, [%SP+0];
.param .b64 param0;
st.param.b64 [param0+0], %rd18;
prototype_210 : .callprototype ()_ (.param .b64 _) ;
call
%rd16,
(
param0
)
, prototype_210;
So my first question is: why on earth does nvcc generate such code when compiling with debug info? I understand that it doesn’t do any optimization when the -G flag is used, but it seems quite crazy to generate such code in the first place. It might not be an issue when debugging plain CUDA, but since OptiX doesn’t support indirect calls, it is quite problematic.
My other question is how to fix this on my side. I can’t step through my program without the -G flag, but maybe there is another nvcc flag that would prevent indirect calls from being generated?
My current workaround is to keep the default constructor from being called on the array at all:
// alignas keeps the raw storage suitably aligned for MyStruct
alignas(MyStruct) char plain_buffer[10 * sizeof(MyStruct)];
MyStruct* structs_buffer = reinterpret_cast<MyStruct*>(plain_buffer);
But you’ll admit that this is quite ugly and error-prone…
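A possibly cleaner variant of this workaround, sketched here only as an assumption and not verified against nvcc -G or OptiX 7.1: if the constructor really has nothing to do, defaulting it makes the type trivially default-constructible, so the C++ rules require no constructor call for the array elements. MyStructTrivial and sum_first_fields are hypothetical names introduced for illustration, and the __device__ shim is there only so the sketch also builds with a plain host compiler.

```cpp
#include <type_traits>

// Lets the sketch build with a plain host compiler as well as with nvcc.
#ifndef __CUDACC__
#define __device__
#endif

// Original shape from the post: the empty constructor is user-provided,
// so the type is not trivially default-constructible.
struct MyStruct {
    __device__ MyStruct() {}
    float field0, field1;
};

// Hypothetical variant: the constructor is defaulted on its first
// declaration, which makes it trivial.
struct MyStructTrivial {
    MyStructTrivial() = default;
    float field0, field1;
};

static_assert(!std::is_trivially_default_constructible<MyStruct>::value,
              "a user-provided constructor is never trivial");
static_assert(std::is_trivially_default_constructible<MyStructTrivial>::value,
              "a defaulted constructor keeps the type trivial");

// Declares the array the same way as in the post and fills it by hand.
// The fields start out uninitialized, exactly as with the empty constructor.
__device__ inline float sum_first_fields() {
    MyStructTrivial structs_buffer[10];  // trivial type: nothing to construct
    for (int i = 0; i < 10; ++i) {
        structs_buffer[i].field0 = static_cast<float>(i);
        structs_buffer[i].field1 = 0.0f;
    }
    float total = 0.0f;
    for (int i = 0; i < 10; ++i) {
        total += structs_buffer[i].field0;
    }
    return total;
}
```

Whether the PTX emitted with -G then really contains no indirect call would still have to be checked in the generated file. This also only helps for types whose constructor can be defaulted. It would not apply to a type whose constructor does real work.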
I hope this can be fixed and that inspecting locals will soon be available in OptiX, but it is already great to be able to step through the program. :)
Cheers,
Benoit R.