The splitPointer and mergePointer operations are only there to get a 64-bit pointer value into and out of two 32-bit payload slots, nothing more. There's no reason to customize those operations and duplicate the code for different structures, unless you really want to. The analogous functions in the optixPathtracer sample are slightly more generic, agnostic to the payload structure type, but that means you have to do a little type-casting in your code. Since Detlef's version is a union of a specific type, he's avoiding the type casting, but hard-coding the type. Below is the pointer packing code from the optixPathtracer sample.
But, to emphasize what I said earlier, what you might want to do is pack your entire structure into the payload slots, rather than packing a pointer to your structure. Think about this as being conceptually similar to passing your struct by value instead of passing by reference. The reason to do this is that it can be faster than using a pointer (just like accessing a single int passed directly into a function can be faster than passing and then de-referencing a pointer to an int). Which way is faster for you needs to be tested and measured. Just be aware that using the payload slots in this "pass by value" fashion is easier for us to optimize, so potentially faster for you. In that case, you would indeed need to use a different packing function for each payload struct type (if you want to write packing functions). In your closest hit, you can unpack the struct locally, which might get optimized away if you're lucky, or alternatively you could set the individual items directly using calls to optixSetPayload_0, optixSetPayload_1, optixSetPayload_2, ... Setting the individual items directly has the potential to be faster than unpacking your struct into memory, but it's a little more work to manage a shader that is written that way.
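To make the "pass by value" idea concrete, here is a rough sketch of a per-struct packing pair. SmallPRD, its fields, and the helper names (floatBits, bitsToFloat, packSmallPRD, unpackSmallPRD, checkRoundTrip, PAYLOAD_HD) are all made up for illustration and are not from any OptiX sample. It assumes a payload small enough to fit in three 32-bit slots, and it is written so it also compiles on the host for a quick sanity check.

```cpp
#include <cassert>
#include <string.h>

// Compile as device code under nvcc, or as plain inline functions on the host.
#if defined( __CUDACC__ )
#define PAYLOAD_HD __forceinline__ __host__ __device__
#else
#define PAYLOAD_HD inline
#endif

// Hypothetical payload struct: three 32-bit values, so it fits in three payload slots.
struct SmallPRD
{
    float        distance;
    unsigned int seed;
    float        attenuation;
};

// Reinterpret the bits of a float as an unsigned int (no numeric conversion).
// In device-only code the CUDA intrinsic __float_as_uint does the same job.
PAYLOAD_HD unsigned int floatBits( float f )
{
    unsigned int u;
    memcpy( &u, &f, sizeof( u ) );
    return u;
}

// Inverse of floatBits; the device-only equivalent is __uint_as_float.
PAYLOAD_HD float bitsToFloat( unsigned int u )
{
    float f;
    memcpy( &f, &u, sizeof( f ) );
    return f;
}

// Copy each field into its own 32-bit payload slot.
PAYLOAD_HD void packSmallPRD( const SmallPRD& prd, unsigned int& p0, unsigned int& p1, unsigned int& p2 )
{
    p0 = floatBits( prd.distance );
    p1 = prd.seed;
    p2 = floatBits( prd.attenuation );
}

// Rebuild the struct from the three payload slots.
PAYLOAD_HD SmallPRD unpackSmallPRD( unsigned int p0, unsigned int p1, unsigned int p2 )
{
    SmallPRD prd;
    prd.distance    = bitsToFloat( p0 );
    prd.seed        = p1;
    prd.attenuation = bitsToFloat( p2 );
    return prd;
}

// Host-side sanity check: pack followed by unpack must reproduce every field bit for bit.
inline void checkRoundTrip( const SmallPRD& in )
{
    unsigned int p0, p1, p2;
    packSmallPRD( in, p0, p1, p2 );
    const SmallPRD out = unpackSmallPRD( p0, p1, p2 );
    assert( floatBits( out.distance ) == floatBits( in.distance ) );
    assert( out.seed == in.seed );
    assert( floatBits( out.attenuation ) == floatBits( in.attenuation ) );
}
```

On the device side the raygen program would call packSmallPRD, hand p0, p1, p2 to optixTrace as the payload arguments, and call unpackSmallPRD afterwards. In closest hit you could either unpack from optixGetPayload_0 through optixGetPayload_2, or skip the struct and write single values, for example optixSetPayload_1( newSeed ) with newSeed being whatever your shader computed. As above, measure both variants before committing to one.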
// Rebuild a 64-bit pointer from two 32-bit payload values (i0 = high half, i1 = low half).
static __forceinline__ __device__ void* unpackPointer( unsigned int i0, unsigned int i1 )
{
    const unsigned long long uptr = static_cast<unsigned long long>( i0 ) << 32 | i1;
    void*                    ptr  = reinterpret_cast<void*>( uptr );
    return ptr;
}

// Split a 64-bit pointer into two 32-bit payload values (i0 = high half, i1 = low half).
static __forceinline__ __device__ void packPointer( void* ptr, unsigned int& i0, unsigned int& i1 )
{
    const unsigned long long uptr = reinterpret_cast<unsigned long long>( ptr );
    i0 = uptr >> 32;
    i1 = uptr & 0x00000000ffffffff;
}

// Inside a hit or miss program: read payload slots 0 and 1 and cast back to the PRD type.
static __forceinline__ __device__ RadiancePRD* getPRD()
{
    const unsigned int u0 = optixGetPayload_0();
    const unsigned int u1 = optixGetPayload_1();
    return reinterpret_cast<RadiancePRD*>( unpackPointer( u0, u1 ) );
}

// Caller side: pack the PRD pointer into two payload slots and pass them to optixTrace.
static __forceinline__ __device__ void traceRadiance( ..., RadiancePRD* prd )
{
    unsigned int u0, u1;
    packPointer( prd, u0, u1 );
    optixTrace( ..., u0, u1 );
}
--
David