If you have a small, fixed amount of data to process, there shouldn’t be an issue with defining a fixed-size local array for it and doing your calculations on that.
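A minimal sketch of what that looks like inside an OptiX device program (the hit program name, payload layout, and the math are just placeholders, not from this thread):

```cpp
#include <optix.h>

extern "C" __global__ void __closesthit__example()
{
    // Small, compile-time-sized scratch array. The final assembler decides whether
    // this lives in registers or in local memory (i.e. OptiX stack space).
    const float coeffs[4] = { 1.0f, 0.5f, 0.25f, 0.125f };

    const float t = optixGetRayTmax(); // hit distance along the ray

    // Evaluate a small polynomial in t as a stand-in for "your calculations".
    float result = 0.0f;
    float power  = 1.0f;
    for (int i = 0; i < 4; ++i)
    {
        result += coeffs[i] * power;
        power  *= t;
    }

    optixSetPayload_0(__float_as_uint(result));
}
```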
It’s a matter of OptiX stack space. If that happens inside a recursive algorithm, it could result in quite a stack size increase. Always calculate the OptiX stack size explicitly in your application. (Search the forum for that.)
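For reference, a sketch of the explicit stack-size calculation using the helpers from optix_stack_size.h, assuming the program groups and pipeline already exist and the depth values match what your renderer actually uses. Note that the exact helper signatures differ slightly between SDK versions (newer releases also take the pipeline as an argument), so check the headers of your SDK:

```cpp
#include <optix.h>
#include <optix_stubs.h>
#include <optix_stack_size.h>
#include <vector>

void configureStackSize(OptixPipeline pipeline,
                        const std::vector<OptixProgramGroup>& programGroups,
                        unsigned int maxTraceDepth,            // optixTrace recursion depth
                        unsigned int maxCCDepth,               // continuation-callable depth
                        unsigned int maxDCDepth,               // direct-callable depth
                        unsigned int maxTraversableGraphDepth)
{
    // Accumulate the stack requirements of all program groups in the pipeline.
    OptixStackSizes stackSizes = {};
    for (OptixProgramGroup pg : programGroups)
        optixUtilAccumulateStackSizes(pg, &stackSizes);

    // Compute the pipeline stack sizes from those requirements and the maximum depths.
    unsigned int dcStackSizeFromTraversal = 0;
    unsigned int dcStackSizeFromState     = 0;
    unsigned int continuationStackSize    = 0;
    optixUtilComputeStackSizes(&stackSizes, maxTraceDepth, maxCCDepth, maxDCDepth,
                               &dcStackSizeFromTraversal,
                               &dcStackSizeFromState,
                               &continuationStackSize);

    optixPipelineSetStackSize(pipeline,
                              dcStackSizeFromTraversal,
                              dcStackSizeFromState,
                              continuationStackSize,
                              maxTraversableGraphDepth);
}
```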
I don’t think Tensor Core instructions are supported in OptiX device code today.
If you need the full CUDA functionality (shared memory, tensor operations, etc.) for the MLP operations, it might make sense to implement a wavefront renderer where ray generation and shading are handled inside native CUDA kernels (see the sketch after the links below).
Described here:
https://forums.developer.nvidia.com/t/deep-learning-and-optix/112722/2
https://forums.developer.nvidia.com/t/lack-of-support-for-threadfence-in-optix-ir/269353/4
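To make the structure concrete, here is a rough sketch of such a wavefront loop. The struct and kernel names (WavefrontRay, WavefrontHit, Params, shadeWithMLP) are illustrative only; the real API used is optixLaunch plus the standard CUDA kernel launch syntax, and the pipeline, SBT, and device-side copy of the launch parameters are assumed to be set up elsewhere:

```cpp
#include <optix.h>
#include <optix_stubs.h>   // plus optix_function_table_definition.h in exactly one translation unit
#include <cuda_runtime.h>

struct WavefrontRay { float3 origin, direction; };
struct WavefrontHit { float3 position, normal; int valid; };

struct Params          // mirrored by the OptiX device programs
{
    WavefrontRay* rays;  // input:  current wave of rays
    WavefrontHit* hits;  // output: hit data written by the hit programs
    unsigned int  count;
};

// Native CUDA kernel: free to use shared memory, warp intrinsics, Tensor Cores, etc.
__global__ void shadeWithMLP(const WavefrontHit* hits, WavefrontRay* nextRays, unsigned int count)
{
    const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count || !hits[i].valid)
        return;
    // ... run the MLP on hits[i] here, then emit the continuation ray ...
    nextRays[i].origin    = hits[i].position;
    nextRays[i].direction = hits[i].normal;   // placeholder bounce direction
}

void renderWave(OptixPipeline pipeline, const OptixShaderBindingTable& sbt, CUstream stream,
                CUdeviceptr d_params, const Params& params, int maxBounces)
{
    const unsigned int block = 256;
    const unsigned int grid  = (params.count + block - 1) / block;

    for (int bounce = 0; bounce < maxBounces; ++bounce)
    {
        // Trace the current wave: the raygen program only reads params.rays,
        // calls optixTrace, and the hit programs write params.hits.
        optixLaunch(pipeline, stream, d_params, sizeof(Params), &sbt,
                    params.count, 1, 1);

        // Shade in a plain CUDA kernel and write the next wave of rays.
        shadeWithMLP<<<grid, block, 0, stream>>>(params.hits, params.rays, params.count);
    }
    cudaStreamSynchronize(stream);
}
```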
You cannot control which variables are stored in registers. That’s decided by the final assembler generating the microcode.
You can change the number of registers your pipeline should use with the OptixModuleCompileOptions field maxRegisterCount.
You normally set that to OPTIX_COMPILE_DEFAULT_MAX_REGISTER_COUNT (== 0), which lets OptiX decide; that currently uses 128 registers, but you can try changing it. The maximum is 255 (one register is reserved).
Using more registers can reduce spilling, but it also lowers occupancy, i.e. how much work can run in parallel. Always benchmark that. YMMV.