I am currently working on a voxel engine, and I have run into a specific DXR behavior (I suppose) that I couldn’t find enough material on to figure out.
Background: I am ray tracing voxels using AABB procedural primitives, with each AABB representing an 8x8x8 voxel brick. For ease of development I am currently working in Unity 6, but I guess the behavior is rather general.
The problem I have is about register counts. In particular, I observe a significant bump in live registers at the `ReportHit()` call in my intersection shader:
As you can see, the live reg count jumps from 39 to a whopping 84 (39 is still quite high). This results in low occupancy, and I suspect it can be optimized further, perhaps significantly. I have no clue what is happening here, nor do I know how to reduce it. I have tried to find documentation online but could not find much that is relevant, so any input is appreciated.
To give more context, my intersection shader is quite complex, with some DDA-ish voxel ray marching computations. The closest hit shader is rather simple, and there is no any-hit shader involved. I have a 20-byte payload and an attribute struct containing a single uint. My ray generation shader handles bounces etc., and is quite complex too. Also, the `ReportHit` call happens on the last line of my intersection shader, where the only variables that need to stay alive are that one uint in the attributes (and T); everything else could be discarded. So I am even more confused about why the live reg count grows so much here.
I did an experiment by disabling the entire voxel DDA part and letting the intersection shader call `ReportHit` immediately no matter what. This reduces the live reg count to 33:
However, the occupancy is still very low, perhaps because my other shaders are still complex. In the “Shader Pipelines” window, my raygen, closesthit and intersection shaders all get #Reg = 91 and #Warp = 20. (I cannot attach a third image…)
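As a rough sanity check on those two numbers, here is a minimal Python sketch of a register-limited occupancy estimate. The hardware figures in it (64K 32-bit registers per SM split across 4 partitions, 32 threads per warp, registers allocated in multiples of 8 per thread, and a cap of 48 warps per SM) are my assumptions about the GPU, not values reported by the tool, and `register_limited_warps` is just a hypothetical helper name.

```python
def register_limited_warps(regs_per_thread,
                           regs_per_sm=65536,      # assumed: 64K 32-bit registers per SM
                           partitions=4,           # assumed: 4 scheduler partitions per SM
                           threads_per_warp=32,
                           alloc_granularity=8,    # assumed: per-thread regs rounded up to 8
                           max_warps_per_sm=48):   # assumed: varies by architecture
    """Estimate how many warps fit on one SM when registers are the only limit."""
    # Round the per-thread register count up to the allocation granularity.
    rounded = -(-regs_per_thread // alloc_granularity) * alloc_granularity
    regs_per_warp = rounded * threads_per_warp
    # Each partition owns an equal share of the register file.
    regs_per_partition = regs_per_sm // partitions
    warps_per_partition = regs_per_partition // regs_per_warp
    return min(warps_per_partition * partitions, max_warps_per_sm)


if __name__ == "__main__":
    for regs in (91, 88, 80, 64):
        print(regs, register_limited_warps(regs))
```

Under those assumptions, 91 registers gives 20 warps, which matches what the window shows, and the per-thread allocation would have to drop to 80 or below before the estimate rises to 24 warps.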
I wanted to know:
- By reducing live reg counts in `TraceBrickPrimitive` (main workload of intersection shader, typical live reg count is around 60), can I lower the spike at `ReportHit`? (I have optimized it from ~100 to 84, but I don’t really know why; I’d suppose those variables are all dead before I ever call `ReportHit`)
- Is that related to the complexity of other shaders (esp. raygen)? If so, maybe I should optimize raygen also to improve occupancy.
- Are there any documents related to this? I can barely find any except tips like “reduce your payload size”.
Thank you for reading this far, and I wish you a good day.

