Hello,
I observe a systematic performance degradation when optixTrace() is simply replaced with an optixTraverse() call followed by optixInvoke() (optixReorder() is not used). The difference grows as more shaders, and more complex ones, are involved. Is that expected, or should I pay attention to something else when implementing the new approach?
The only change I am making in the code is:
optixTrace(
    handle,
    ray_origin,
    ray_direction,
    tmin,
    tmax,
    0.0f,                    // rayTime
    OptixVisibilityMask(1),
    OPTIX_RAY_FLAG_NONE,
    RAY_TYPE_RADIANCE,       // SBT offset
    RAY_TYPE_COUNT,          // SBT stride
    RAY_TYPE_RADIANCE,       // miss SBT index
    u0, u1, g0, g1);         // payload registers
replaced with:
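optixTraverse(
    handle,
    ray_origin,
    ray_direction,
    tmin,
    tmax,
    0.0f,                    // rayTime
    OptixVisibilityMask(1),
    OPTIX_RAY_FLAG_NONE,
    RAY_TYPE_RADIANCE,       // SBT offset
    RAY_TYPE_COUNT,          // SBT stride
    RAY_TYPE_RADIANCE,       // miss SBT index
    u0, u1, g0, g1);         // payload registers
// Run the closest-hit or miss program for the hit object produced by optixTraverse().
optixInvoke(u0, u1, g0, g1);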
The code is running on an RTX 4090 with driver 552.12.