[resolved] rtTrace from bindless callable programs

nljones · February 18, 2019, 5:54am

I’m trying to implement my material shaders in bindless callable programs per Detlef’s suggestion. According to the OptiX 6.0.0 release notes, it now supports “rtTrace from bindless callable programs”, which you couldn’t do in the past. However, I haven’t found any documentation or examples of how to do this.

Currently, as long as I don’t call rtTrace in my bindless program, everything works fine. I can make ray and ray payload data structures, pass attributes by value, etc. However, as soon as I put rtTrace into the program, I get strange behavior:

On GTX 1080 Ti, it seems that the entire bindless program is not called (and I can't even force an exception by putting rtThrow into the callable program). The rest of the code runs normally, but the value returned by the callable program is garbage.

On RTX 8000, I get the following error:

```
Unknown error (Details: Function "_rtContextLaunch2D" caught exception: Encountered a CUDA error: cudaDriver().CuEventSynchronize( m_event ) returned (700): Illegal address)
```

Here’s the closest hit code:

RT_PROGRAM void closest_hit_generic() // This is my new closest hit program
{
	if (mat_id >= material_data.size()) return;
	MaterialData mat_data = material_data[mat_id];

	prd.result = ((rtCallableProgramId<float3()>)mat_data.radiance_program_id)();
}

RT_CALLABLE_PROGRAM float3 closest_hit_radiance() // This used to be my closest hit program, now it is callable and stripped down to a minimal example
{
	PerRayData_radiance new_prd;
	new_prd.result = make_float3(0);
	Ray refl_ray = make_Ray(make_float3(1, 0, 10), make_float3(0, 0, 1), RADIANCE_RAY, 1e-4, RT_DEFAULT_MAX);
	rtTrace(top_object, refl_ray, new_prd); // If you comment this line, it "works"
	return new_prd.result;
}

The environment is RTX 8000, Windows driver 418.81, nvcc version 10.0.130

I’ll send a stack trace to the help email.

droettger · February 18, 2019, 9:01am

Yes, this is not going to work like that without additional changes, and there is no example inside the OptiX SDK yet.

OptiX cannot always detect automatically whether bindless callable programs along a hierarchy of calls contain an rtTrace call, and such programs need additional internal instrumentation to be able to call rtTrace.

For that, OptiX added call site instrumentation, which lets you tell OptiX which bindless callable program IDs are potentially calling which others.

This works automatically when holding a bindless callable program ID directly in an rtDeclareVariable.
It also works automatically when using buffers of bindless callable program IDs.
All other cases need additional call site instrumentation on device side and some host side configuration.

First, you need to use rtMarkedCallableProgramId instead of rtCallableProgramId when calling a bindless callable program ID with an rtTrace inside.

Please look into the optix_device.h headers for more information on rtMarkedCallableProgramId.

rtMarkedCallableProgramId lets you name a call site with a constant string. On the host side, you pass that same string to the newly added function rtProgramCallsiteSetPotentialCallees to specify which bindless callable program IDs can potentially be called from that rtMarkedCallableProgramId location in the device code.
This allows OptiX to instrument the hierarchy of calls with the information needed to call rtTrace.

So your code should look something like this:

RT_PROGRAM void closest_hit_generic()
{
	if (mat_id >= material_data.size()) return;
	MaterialData mat_data = material_data[mat_id];

	prd.result = ((rtMarkedCallableProgramId<float3()>)mat_data.radiance_program_id, "my_call_site")();
}

RT_CALLABLE_PROGRAM float3 closest_hit_radiance()
{
	PerRayData_radiance new_prd;
	new_prd.result = make_float3(0);
	Ray refl_ray = make_Ray(make_float3(1, 0, 10), make_float3(0, 0, 1), RADIANCE_RAY, 1e-4, RT_DEFAULT_MAX);
	rtTrace(top_object, refl_ray, new_prd);
	return new_prd.result;
}

// On the host:
Program ch_generic  = context->createProgramFromPTXString(ptx, "closest_hit_generic");
Program cp_radiance = context->createProgramFromPTXString(ptx, "closest_hit_radiance");

// Gather all bindless callable program IDs which can be called from "my_call_site":
std::vector<int> callees;
callees.push_back(cp_radiance->getId());

// Let OptiX know that these bindless callable program IDs can potentially be called from "my_call_site" inside the closest_hit_generic program object:
ch_generic->setCallsitePotentialCallees("my_call_site", callees);
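
When there are many materials, the callee list can be gathered from the material table instead of by hand. The following is only a host-side sketch in plain C++ with no OptiX dependency: `MaterialData` here is a cut-down stand-in for the struct in this thread, and `gather_callees` is a hypothetical helper. Removing duplicate IDs is a tidiness choice, not something this thread says OptiX requires.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Cut-down stand-in for the MaterialData struct used in this thread (assumption).
struct MaterialData { int radiance_program_id; };

// Hypothetical helper: collect every program ID referenced by the material
// table, sorted and without duplicates.
std::vector<int> gather_callees(std::vector<MaterialData> const& materials)
{
    std::vector<int> callees;
    callees.reserve(materials.size());
    for (MaterialData const& m : materials)
        callees.push_back(m.radiance_program_id);

    std::sort(callees.begin(), callees.end());
    callees.erase(std::unique(callees.begin(), callees.end()), callees.end());

    assert(std::is_sorted(callees.begin(), callees.end()));
    return callees;
}
```

The returned vector has the same type as the `callees` vector passed to setCallsitePotentialCallees above.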

That said, I would not use that mechanism when I can avoid it.
If you can make the bindless callable programs only calculate information that the closest hit program then uses, after the call returns, to do the necessary rtTrace itself, that would speed up the bindless callable programs. My OptiX introduction examples do it this way.
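
A rough illustration of that alternative, written as host-only C++ so it can be compiled without OptiX. Everything here is a stand-in and an assumption: `Float3`, `Continuation`, `material_table`, `trace_stub` and `shade` are hypothetical names, a function-pointer table plays the role of bindless callable program IDs, and `trace_stub` plays the role of rtTrace. The point is only the shape: the material returns a description of the next ray, and the caller does the tracing.

```cpp
#include <cassert>
#include <cstddef>

// Host-only stand-ins for the OptiX types in this thread (all hypothetical).
struct Float3 { float x, y, z; };
struct MaterialData { float reflectivity; int radiance_program_id; };

// What a material returns: a description of the next ray, not a traced result.
struct Continuation {
    bool   trace;      // true if the caller should trace a secondary ray
    Float3 origin;
    Float3 direction;
    float  weight;     // factor applied to whatever the traced ray returns
};

using MaterialFn = Continuation (*)(MaterialData const&, Float3 const&,
                                    Float3 const&, Float3 const&);

Float3 reflect(Float3 const& d, Float3 const& n)
{
    float const k = 2.0f * (d.x * n.x + d.y * n.y + d.z * n.z);
    return Float3{d.x - k * n.x, d.y - k * n.y, d.z - k * n.z};
}

// "Callable" 0: a mirror asks for one reflection ray. It does not trace.
Continuation mirror_material(MaterialData const& mat, Float3 const& hit,
                             Float3 const& dir, Float3 const& normal)
{
    return Continuation{true, hit, reflect(dir, normal), mat.reflectivity};
}

// "Callable" 1: an absorber asks for nothing.
Continuation absorbing_material(MaterialData const&, Float3 const& hit,
                                Float3 const& dir, Float3 const&)
{
    return Continuation{false, hit, dir, 0.0f};
}

MaterialFn const material_table[] = {mirror_material, absorbing_material};

// Stand-in for rtTrace: rays heading up see radiance 1, all others see 0.
float trace_stub(Float3 const& /*origin*/, Float3 const& direction)
{
    return direction.z > 0.0f ? 1.0f : 0.0f;
}

// Stand-in for the closest hit program: call the material, then trace here.
float shade(MaterialData const& mat, Float3 const& hit,
            Float3 const& dir, Float3 const& normal)
{
    std::size_t const count = sizeof(material_table) / sizeof(material_table[0]);
    assert(mat.radiance_program_id >= 0 &&
           static_cast<std::size_t>(mat.radiance_program_id) < count);

    Continuation const next =
        material_table[mat.radiance_program_id](mat, hit, dir, normal);
    return next.trace ? next.weight * trace_stub(next.origin, next.direction)
                      : 0.0f;
}
```

With this split, the material functions never trace, so in OptiX terms a plain rtCallableProgramId call would be enough and no call site marking would be needed.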

nljones · February 18, 2019, 5:23pm

Thank you for the quick reply, Detlef.

I now get a compiler error:

error : no suitable constructor exists to convert from "int" to "optix::markedCallableProgramId<float3 ()>"

This occurs on the line:

prd.result = ((rtMarkedCallableProgramId<float3()>)mat_data.radiance_program_id, "my_call_site")();

droettger · February 18, 2019, 5:38pm

Sorry, wrong brackets. The program ID and the call site name are both constructor arguments:

rtMarkedCallableProgramId<float3()>(mat_data.radiance_program_id, "my_call_site")();

nljones · February 25, 2019, 6:37am

I’ve marked this resolved, but I’ll add a note as I found this rather tricky. It’s important that the method signature be an exact match, including all const qualifiers. Otherwise, the cudaDriver().CuEventSynchronize( m_event ) error may occur at any place where the code branches (usually at if statements or rtTrace calls).

In my working solution, I have the following:

RT_PROGRAM void closest_hit_generic()
{
	if (mat_id >= material_data.size()) return;
	MaterialData mat_data = material_data[mat_id];

	prd = rtMarkedCallableProgramId<PerRayData_radiance(MaterialData const&, PerRayData_radiance)>(mat_data.radiance_program_id, "my_call_site")(mat_data, prd);
}

RT_CALLABLE_PROGRAM PerRayData_radiance closest_hit_radiance(MaterialData const&mat_data, PerRayData_radiance prd)
{
	// The material intersection routines, including rtTrace calls go here ...
	return prd;
}


// On the host:
Program ch_generic  = context->createProgramFromPTXString(ptx, "closest_hit_generic");
Program cp_radiance = context->createProgramFromPTXString(ptx, "closest_hit_radiance");

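// Gather all bindless callable program IDs which can be called from "my_call_site":
std::vector<int> callees;
callees.push_back(cp_radiance->getId());

// Let OptiX know that these bindless callable program IDs can potentially be called from "my_call_site" inside the closest_hit_generic program object:
ch_generic->setCallsitePotentialCallees("my_call_site", callees);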
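
One way to keep the call site and the callee from drifting apart is to write the signature once as a type alias and check the definition against it at compile time. This is only a sketch in plain host C++: the structs are simplified stand-ins, and `RadianceSignature` and `call_radiance` are hypothetical names. Whether the same check works unchanged on a function declared with RT_CALLABLE_PROGRAM under nvcc has not been verified here.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the structs in this thread (assumptions).
struct MaterialData { float albedo; int radiance_program_id; };
struct PerRayData_radiance { float result; int depth; };

// Written once; in OptiX this alias would also be the template argument
// of rtMarkedCallableProgramId at the call site.
using RadianceSignature =
    PerRayData_radiance(MaterialData const&, PerRayData_radiance);

PerRayData_radiance closest_hit_radiance(MaterialData const& mat_data,
                                         PerRayData_radiance prd)
{
    prd.result *= mat_data.albedo;
    prd.depth += 1;
    return prd;
}

// Fails to compile if the definition stops matching the alias,
// for example if the const on the reference is dropped.
static_assert(
    std::is_same<decltype(closest_hit_radiance), RadianceSignature>::value,
    "closest_hit_radiance does not match RadianceSignature");

// A signature that differs only by a missing const is a different type.
using MismatchedSignature =
    PerRayData_radiance(MaterialData&, PerRayData_radiance);
static_assert(
    !std::is_same<RadianceSignature, MismatchedSignature>::value,
    "const on the reference is part of the function type");

// Stand-in for the marked call site: calls through a pointer typed by the alias.
PerRayData_radiance call_radiance(RadianceSignature* fn,
                                  MaterialData const& mat_data,
                                  PerRayData_radiance prd)
{
    assert(fn != nullptr);
    return fn(mat_data, prd);
}
```

Note that a top-level const on a by-value parameter is not part of the function type in C++, so this check would not flag that particular difference.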

droettger · February 25, 2019, 10:01am

An exact match is always required, for all bindless callable program signatures.

Really, I still recommend avoiding this functionality if you can. It’s meant for special material system implementations and should only be used if no other implementation is possible.
This functionality has a cost, both in the compilation step and at runtime.
