Best practice... Voxels, use triangles or custom geometry? (For calculations, not rendering)

I have to build what is essentially a voxel engine, just smaller.

I have to fill a volume a few hundred units long on each axis with cubes, then shoot lots of rays from points around it and see which cubes they hit and what they hit on the other side.

My plan is to have an any_hit program on the cubes that records each ray hit (there is also a calculated value attached to each ray that I then need to record per cube) and an any_hit program for the surrounding areas (upon exiting the cube structure, there are only two options for what a ray might hit).

Now my question is whether I'd be better off using triangles for those cubes, or whether I can use a custom geometry, like in the sphere example. Or rather: which would give me better performance? All in all this will be a few billion rays.

This is OptiX 6, btw.

We can’t say what will give the best performance without a lot more detail – or maybe only after trying it – but I’m happy to discuss it and try to brainstorm some guidelines and suggestions.

Do you need to intersect geometry inside the voxels, or tally the voxel list, or do some non-geometric compute for each voxel… what kind of processing needs to happen per voxel exactly? Will you elaborate a little on what you mean by “what they hit on the other side” – do you mean after exiting your voxel structure completely?

With RTX hardware we generally recommend using a closest hit shader in combination with the ray flags RT_RAY_FLAG_TERMINATE_ON_FIRST_HIT and RT_RAY_FLAG_DISABLE_ANYHIT. This is normally a faster alternative to using an any hit shader, but you have to re-launch your ray every time it hits something (ideally from your raygen shader, not from your closest hit shader). Another consideration with any hit is that traversal order is not guaranteed, and will in practice be out of order for nearby primitives. If you need to accumulate one or more values along the ray that depend on the previous values – for example if you’re doing some kind of transparency through your voxel grid – then any hit shaders will be complicated. If you don’t care about traversal order, then using any hit with a single ray going through the voxel grid (i.e., when expecting hundreds of hits in a row) might be as efficient as closest hit with multiple re-launched rays.
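A minimal sketch of that re-launch pattern in OptiX 6 device code (payload layout, variable names, and the scene_epsilon value are illustrative – the load-bearing part is the rtTrace overload with the visibility mask and ray flags, which was added in OptiX 6.0):

```cpp
#include <optix.h>
#include <optixu/optixu_math_namespace.h>

rtDeclareVariable(rtObject, top_object, , );

struct PerRayData
{
    float3 hit_point;   // written by the closest hit program
    int    done;        // set to 1 by the miss program
};

RT_PROGRAM void raygen()
{
    PerRayData prd;
    prd.done = 0;

    float3 origin = make_float3(0.0f, 0.0f, 0.0f);     // your point A (illustrative)
    float3 dir    = make_float3(0.0f, 0.0f, 1.0f);
    const float scene_epsilon = 1.0e-4f;               // small t_min to avoid self-intersection

    while (!prd.done)
    {
        optix::Ray ray = optix::make_Ray(origin, dir, /*ray type*/ 0,
                                         scene_epsilon, RT_DEFAULT_MAX);
        // These flags skip all any hit programs and end traversal at the first hit.
        rtTrace(top_object, ray, prd, RT_VISIBILITY_ALL,
                RTrayflags(RT_RAY_FLAG_DISABLE_ANYHIT |
                           RT_RAY_FLAG_TERMINATE_ON_FIRST_HIT));
        origin = prd.hit_point;                        // re-launch from the last hit
    }
}
```

The closest hit program would record its voxel's contribution and write prd.hit_point; the miss program sets prd.done to end the loop.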

If you are only accessing or using voxel data inside the grid, and not surface geometry, or if you only need to collect your list of voxels a ray touches but you don’t need to process the voxels, then I might advise exploring a CUDA based ray marching renderer rather than trying to make voxel processing work with OptiX. While it might work, OptiX wasn’t designed for voxel processing, so it may be tricky and/or result in less than ideal performance.

If you really do want to use OptiX, then there are a few ways you might proceed. A straightforward approach would be to make a custom voxel primitive with an intersection program that tests the ray against the voxel’s AABB. This would require making an explicit list of the AABBs for all voxels, which means a lot of memory usage for a dense grid. (Did you mean you want to use a dense grid with several hundred boxes along each axis? Or are your boxes larger than 1 unit?) If you’re doing compute on the voxels and not rendering, then you can put your compute directly in the intersection program after the ray hits the AABB, or in a closest hit shader, or an any hit shader.
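To make that concrete, here is the core of such a ray-vs-AABB slab test in plain C++ (no OptiX types; an actual intersection program would wrap this in rtPotentialIntersection / rtReportIntersection calls):

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Slab test: returns true if the ray origin + t*dir hits the box, writing the
// entry and exit distances. Relies on IEEE semantics (division by zero yields
// +/- infinity) for axis-aligned ray components.
bool intersectAABB(const Vec3& orig, const Vec3& dir,
                   const Vec3& boxMin, const Vec3& boxMax,
                   float& tEnter, float& tExit)
{
    float tx1 = (boxMin.x - orig.x) / dir.x, tx2 = (boxMax.x - orig.x) / dir.x;
    float ty1 = (boxMin.y - orig.y) / dir.y, ty2 = (boxMax.y - orig.y) / dir.y;
    float tz1 = (boxMin.z - orig.z) / dir.z, tz2 = (boxMax.z - orig.z) / dir.z;
    tEnter = std::max({std::min(tx1, tx2), std::min(ty1, ty2), std::min(tz1, tz2)});
    tExit  = std::min({std::max(tx1, tx2), std::max(ty1, ty2), std::max(tz1, tz2)});
    return tExit >= tEnter && tExit >= 0.0f;
}
```

In the voxel case, tEnter and tExit also hand you the segment length inside the voxel for free, should your per-voxel value need it.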

If memory is going to be a large concern, then a different approach might be to use large quads to separate each slice of voxels. You could use the triangle api for the geometry, and then in your hit shader you would need to do a little bit of computation to figure out which voxel the ray is entering. Using geometry for the slices means you’d only need 3*(n+1) quads, where n is the number of voxels along your axis, rather than needing n^3 bounding boxes if you make a custom voxel primitive… in other words you could choose to trade a little bit of extra computation for a lot of memory savings.
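The trade-off is easy to quantify – a back-of-the-envelope helper, assuming 6 floats (24 bytes) per AABB:

```cpp
#include <cstdint>

// Counts for an n^3 dense voxel grid: one AABB per voxel for the
// custom-primitive approach vs. 3*(n+1) slice quads for the wall approach.
uint64_t aabbCount(uint64_t n) { return n * n * n; }
uint64_t aabbBytes(uint64_t n) { return aabbCount(n) * 24; }  // 6 floats each
uint64_t quadCount(uint64_t n) { return 3 * (n + 1); }
```

At n = 256 that is roughly 16.8 million AABBs (~400 MB) versus 771 quads.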

Does that help you see the path forward any clearer?


Hi David,

thanks a lot. Let me clarify then :)

I need to shoot from A to B, with both points outside of the voxel structure; B is unknown at the start (so it’s probably just a hit or miss), and A carries some float value. I then need to find all voxels between A and B and store that information per voxel (not per ray, though). Probably just adding up the value for hit/miss plus a hit/miss counter (to average it out). The end result should be one float value per cube for hit and one for miss.

But there needs to be some backtracking – or I recast from B back to A, but then I’d need some sort of sky dome.

Where the voxels get hit, or where the structure ultimately gets hit, doesn’t matter.

The surrounding is regular triangle geometry and shouldn’t intersect with the voxel structure.

About any hit vs. recasting: I know there is a lot of performance to pick up with recasting, but how do I handle cubes that are side by side? They could be touching, so either I recast from the center of the current cube and hit its own backside first, or from just outside the exit point – but then I might miss the neighbour if the epsilon is too small?

As for AABBs etc.: what I start with is probably just two float3s holding the min/max of the structure as a whole, from which I calculate the voxels.
While everything outside is (large and complex) geometry, the voxel structure itself doesn’t have to exist, apart from the float values per position in the end… I hope this makes sense.

And yes, this might be a few hundred on each axis. The size may vary depending on the resolution needed.

I could change this around, for example, and actually always align the cubes along some axis.

So far I haven’t seen memory as an issue and usually go with calculating as little as possible. But I hear you about just using one big triangle structure. I’ll have a think on that as well

Yes, that is true. You will need to use a small t_min value to prevent self-intersections, just like you would with shadow rays, for example. That could lead to a small percentage of your rays missing voxels with closest hit that would have been hit with an any hit shader.

If you go with the voxel-wall approach with 3*(n+1) quads (meaning, if it wasn’t already clear, you’d make a set of yz-axis planes, a set of xz planes, and a set for xy, each extending to the outer walls of your voxel grid), then since there is no explicit voxel primitive, you would need to calculate which voxel you’re in every time a hit shader is called. To make it fast and easy, I’d suggest creating an axis-aligned grid where the quads land on integer values starting from 0. You can then use an instance transform to position it however you like for ray tracing, but in your closest hit shader you would get the object-space coordinates of the hit and then simply truncate the float coordinates to int in order to compute your voxel id.
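That truncation step could look like this (plain C++ with illustrative names; in the actual closest hit shader the input would be the object-space hit position):

```cpp
struct Int3 { int x, y, z; };

// Voxel index for a hit point in a grid whose voxels sit on integer
// coordinates starting at 0. 'n' is the grid resolution per axis; the clamp
// guards hits landing exactly on the outer wall.
Int3 voxelIdFromHit(float x, float y, float z, int n)
{
    auto clampIdx = [n](float v) {
        int i = static_cast<int>(v);   // truncate toward zero (v >= 0 here)
        if (i < 0) i = 0;
        if (i > n - 1) i = n - 1;
        return i;
    };
    return { clampIdx(x), clampIdx(y), clampIdx(z) };
}
```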

This leaves me wondering whether it might be possible & better to avoid ray tracing separate voxels. One option for you might be to have a single ‘voxel grid’ primitive - a single cube - and in your hit shader (or CUDA kernel), you calculate all the voxels that a ray intersects. This is relatively simple to do in code, it’s a line drawing algorithm in 3d. If your needs would be met this way, it will ultimately be much faster than ray tracing, and also allow you to completely avoid the precision issues that recasting might have, if you have some strict tolerance requirements.
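A sketch of that 3D line-drawing idea – an Amanatides & Woo style grid traversal in plain C++. It assumes the start point is already inside the n^3 unit grid; a real version would first clip the ray to the grid’s bounding box:

```cpp
#include <cmath>
#include <vector>

struct Int3 { int x, y, z; };

// Walk a ray through an axis-aligned unit-voxel grid, collecting every voxel
// it touches, up to parameter tMax or until the ray leaves the grid.
std::vector<Int3> voxelsAlongRay(float ox, float oy, float oz,
                                 float dx, float dy, float dz,
                                 float tMax, int n)
{
    int ix = (int)ox, iy = (int)oy, iz = (int)oz;
    int sx = dx > 0 ? 1 : -1, sy = dy > 0 ? 1 : -1, sz = dz > 0 ? 1 : -1;

    // Ray parameter t at which the ray crosses the next grid plane on an axis.
    auto next = [](float o, float d, int i, int s) {
        float plane = (s > 0) ? (float)(i + 1) : (float)i;
        return (d != 0.0f) ? (plane - o) / d : INFINITY;
    };
    float tx = next(ox, dx, ix, sx), ty = next(oy, dy, iy, sy), tz = next(oz, dz, iz, sz);
    float stepTx = dx != 0.0f ? std::fabs(1.0f / dx) : INFINITY;
    float stepTy = dy != 0.0f ? std::fabs(1.0f / dy) : INFINITY;
    float stepTz = dz != 0.0f ? std::fabs(1.0f / dz) : INFINITY;

    std::vector<Int3> out;
    float t = 0.0f;
    while (t < tMax && ix >= 0 && ix < n && iy >= 0 && iy < n && iz >= 0 && iz < n) {
        out.push_back({ix, iy, iz});
        // Step to whichever grid plane the ray reaches first.
        if (tx <= ty && tx <= tz) { t = tx; ix += sx; tx += stepTx; }
        else if (ty <= tz)        { t = ty; iy += sy; ty += stepTy; }
        else                      { t = tz; iz += sz; tz += stepTz; }
    }
    return out;
}
```

Each ray then touches on the order of 3n voxels with no acceleration structure, no epsilon, and no re-launching.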

I guess my first recommendation is to figure out whether you really need OptiX here, because it sounds like you might be able to use a more voxel-tailored approach. If that’s true, I suspect it will be both easier to develop and higher performing to do at least the voxel part manually.


Thanks David,

yes, the planes bit makes perfect sense. I think I’ll explore this as well. This is a great idea!

I am working from Unity, but I have the whole pipeline set up and running for a few other, similar problems (just without the voxel bit). At least to figure out that point B I’ll need proper ray tracing. But I’ll keep CUDA in mind – this might actually be something once we get to optimization.

Again, thanks a lot!