I’ve just started using a system with multiple Tesla cards. While most of my OptiX kernels seem to take advantage of both cards, I have one 1D kernel that only ever runs on one card, and it accounts for the majority of the computation time.
Am I right in assuming that the kernels get split between cards based on launchIndex.y, and so all of the threads in the 1D launch get forced to the first card because there is no y-dimension? Is there any way around this? For this particular kernel, there’s no logical way to partition the input into two dimensions.