Greetings! As the title suggests, I am using OptiX 6.0 on a GeForce RTX 2060 with the 436.30 driver.
I’ve been tasked with optimizing our OptiX kernels. Nsight reports that our GPU utilization is fairly low, so I want to approach the problem from a few angles. In particular, I’ve noticed that our kernels declare a large number of local variables, so I suspect the low utilization may be partly due to register pressure. Disclaimer: I wasn’t able to verify this with Nsight, because I couldn’t find our OptiX kernel in the list of kernels. I remember that with OptiX 5.0 it was labeled “MegaKernelN”, but I am not sure what it is called in OptiX 6.0.
p.16 of http://on-demand.gputechconf.com/gtc/2013/presentations/S3475-Ray-Tracking-With-OptiX.pdf states that “when working set of registers is too large, registers are stored to local memory”. Does that mean that OptiX spills registers to local memory automatically when it compiles our kernels? Or could moving the local variables mentioned above to local memory myself relieve the register shortage? I assume not, if OptiX already does this automatically and I have understood the slide correctly.
I would greatly appreciate pointers in the right direction. Thank you for your time!