We can’t say what will perform best without a lot more detail, or perhaps without trying it, but I’m happy to discuss it and brainstorm some guidelines and suggestions.
Do you need to intersect geometry inside the voxels, or tally the voxel list, or do some non-geometric compute for each voxel… what kind of processing needs to happen per voxel exactly? Will you elaborate a little on what you mean by “what they hit on the other side” – do you mean after exiting your voxel structure completely?
With RTX hardware we generally recommend using a closest hit shader in combination with the ray flags RT_RAY_FLAG_TERMINATE_ON_FIRST_HIT and RT_RAY_FLAG_DISABLE_ANYHIT. This is normally a faster alternative to using an any hit shader, but you have to re-launch your ray every time it hits something (ideally from your raygen shader, not from your closest hit shader). Another consideration with any hit is that traversal order is not guaranteed, and in practice hits will arrive out of order for nearby primitives. If you need to accumulate one or more values along the ray that depend on the previous values, for example if you’re doing some kind of transparency through your voxel grid, then any hit shaders will be complicated. If you don’t care about traversal order, then any hit with a single ray going through the voxel grid (i.e. when expecting hundreds of hits in a row) might be as efficient as closest hit with multiple re-launched rays.
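To make the re-launch idea concrete, here is a minimal host-side C++ sketch of the loop that would live in the raygen program. It is not OptiX code: `Hit`, `traceClosest` and `accumulateAlongRay` are hypothetical names, and `traceClosest` is a mock that stands in for "trace one ray and let the closest hit shader report the nearest hit beyond tmin". The point it illustrates is that re-launching yields hits in front-to-back order, which is what an order-dependent accumulation such as transparency needs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one surface the ray crosses.
struct Hit {
    float t;        // distance along the ray
    float opacity;  // in [0, 1]
    float value;    // e.g. intensity contributed at this hit
};

// Mock trace: finds the nearest hit with t > tmin. In a real program this
// would be one trace call whose closest hit shader fills a payload.
bool traceClosest(const std::vector<Hit>& scene, float tmin, Hit& out) {
    bool found = false;
    for (const Hit& h : scene) {
        if (h.t > tmin && (!found || h.t < out.t)) {
            out = h;
            found = true;
        }
    }
    return found;
}

// Re-launch loop: each iteration starts a new ray just past the last hit,
// so hits are consumed front to back regardless of storage order.
float accumulateAlongRay(const std::vector<Hit>& scene, int maxHits) {
    assert(maxHits >= 0);
    float result = 0.0f;
    float transmittance = 1.0f;  // fraction of light not yet absorbed
    float tmin = 0.0f;
    for (int i = 0; i < maxHits; ++i) {
        Hit h;
        if (!traceClosest(scene, tmin, h)) break;  // ray left the scene
        result += transmittance * h.opacity * h.value;
        transmittance *= 1.0f - h.opacity;
        tmin = h.t;  // hits at exactly this t are skipped on the next launch
    }
    return result;
}
```

With any hit, the same accumulation would first have to collect the hits and sort them by t, because the formula gives a different answer when the hits are processed in a different order.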
If you are only accessing or using voxel data inside the grid, and not surface geometry, or if you only need to collect the list of voxels a ray touches without processing them, then I might advise exploring a CUDA-based ray marching renderer rather than trying to make voxel processing work with OptiX. While it might work, OptiX wasn’t designed for voxel processing, so it may be tricky and/or result in less than ideal performance.
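As a rough idea of what the per-ray loop of such a ray marcher could look like, here is a sketch of a standard 3D DDA grid walk, written as plain C++ so it can be tried on the host; the same loop could be moved into a CUDA kernel. The names `Voxel` and `marchGrid` are made up for this example, and it assumes unit-sized voxels, a grid spanning [0, n] on each axis, and a ray origin that already lies inside the grid.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Voxel { int x, y, z; };

// Returns, in visiting order, the voxels of an n*n*n grid of unit cells
// that the ray crosses. Assumes the origin is inside the grid.
std::vector<Voxel> marchGrid(const float origin[3], const float dir[3], int n) {
    assert(n > 0);
    const float inf = std::numeric_limits<float>::infinity();
    int cell[3], step[3];
    float tMax[3];    // ray distance at which the next boundary is crossed
    float tDelta[3];  // ray distance needed to cross one whole cell
    for (int a = 0; a < 3; ++a) {
        cell[a] = static_cast<int>(std::floor(origin[a]));
        assert(cell[a] >= 0 && cell[a] < n);
        if (dir[a] > 0.0f) {
            step[a] = 1;
            tMax[a] = (cell[a] + 1 - origin[a]) / dir[a];
            tDelta[a] = 1.0f / dir[a];
        } else if (dir[a] < 0.0f) {
            step[a] = -1;
            tMax[a] = (cell[a] - origin[a]) / dir[a];
            tDelta[a] = -1.0f / dir[a];
        } else {
            step[a] = 0;  // ray never crosses a boundary on this axis
            tMax[a] = inf;
            tDelta[a] = inf;
        }
    }
    std::vector<Voxel> visited;
    while (cell[0] >= 0 && cell[0] < n &&
           cell[1] >= 0 && cell[1] < n &&
           cell[2] >= 0 && cell[2] < n) {
        visited.push_back({cell[0], cell[1], cell[2]});
        // Step along the axis whose next boundary is nearest.
        int a = 0;
        if (tMax[1] < tMax[a]) a = 1;
        if (tMax[2] < tMax[a]) a = 2;
        if (step[a] == 0) break;  // zero-length direction, nothing to walk
        cell[a] += step[a];
        tMax[a] += tDelta[a];
    }
    return visited;
}
```

Any per-voxel work would go where the voxel is pushed onto the list. No acceleration structure or per-voxel bounding box is needed, which is the main reason this can be a better fit for a dense grid.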
If you really do want to use OptiX, then there are a few ways you might proceed. A straightforward approach would be to make a custom voxel primitive with an intersection program that tests the ray against the voxel’s AABB. This would require making an explicit list of the AABBs for all voxels, which means a lot of memory usage for a dense grid. (Did you mean you want to use a dense grid with several hundred boxes along each axis? Or are your boxes larger than 1 unit?) If you’re doing compute on the voxels and not rendering, then you can put your compute directly in the intersection program after the ray hits the AABB, or in a closest hit shader, or anyhit shader.
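For reference, the ray/AABB test such an intersection program would perform is usually the slab method. Below is a minimal host-side C++ sketch of it; `Aabb` and `intersectAabb` are names invented for this example rather than OptiX API, and in a real intersection program the ray origin, direction and interval would come from the OptiX ray state instead of function arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

struct Aabb {
    float lo[3];  // minimum corner
    float hi[3];  // maximum corner
};

// Slab test: clips the ray interval [tmin, tmax] against the three pairs of
// axis-aligned planes. On a hit, reports where the ray enters and exits.
bool intersectAabb(const Aabb& box, const float origin[3], const float dir[3],
                   float tmin, float tmax, float& tEnter, float& tExit) {
    assert(tmin <= tmax);
    float t0 = tmin;
    float t1 = tmax;
    for (int a = 0; a < 3; ++a) {
        if (dir[a] == 0.0f) {
            // Ray is parallel to this slab: it must start between the planes.
            if (origin[a] < box.lo[a] || origin[a] > box.hi[a]) return false;
            continue;
        }
        float tNear = (box.lo[a] - origin[a]) / dir[a];
        float tFar = (box.hi[a] - origin[a]) / dir[a];
        if (tNear > tFar) std::swap(tNear, tFar);
        t0 = std::max(t0, tNear);
        t1 = std::min(t1, tFar);
        if (t0 > t1) return false;  // the clipped interval became empty
    }
    tEnter = t0;
    tExit = t1;
    return true;
}
```

If the per-voxel compute goes directly into the intersection program, the entry and exit distances give the length of the ray segment inside the voxel, which is often what the computation needs.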
If memory is going to be a large concern, then a different approach might be to use large quads to separate each slice of voxels. You could use the triangle API for the geometry, and then in your hit shader you would need to do a little bit of computation to figure out which voxel the ray is entering. Using geometry for the slices means you’d only need 3*(n+1) quads, where n is the number of voxels along each axis, rather than needing n^3 bounding boxes if you make a custom voxel primitive. In other words, you could choose to trade a little bit of extra computation for a lot of memory savings.
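Here is a small C++ sketch of both halves of that trade-off: the primitive counts, and the "little bit of computation" in the hit shader. All names are hypothetical. It assumes unit voxels, a grid spanning [0, n] on each axis, and that the hit shader can tell which axis the quad is perpendicular to and which slice (0..n) it is, for example from the primitive index; how that mapping is laid out is up to you.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Primitive counts for a grid with n voxels along each axis.
std::uint64_t sliceQuadCount(std::uint64_t n) { return 3 * (n + 1); }
std::uint64_t voxelBoxCount(std::uint64_t n) { return n * n * n; }

struct Voxel { int x, y, z; };

// Given a hit on the slice quad `slice` perpendicular to `axis`, finds the
// voxel the ray is entering. Returns false if the ray is leaving the grid
// through this quad, or runs parallel to it.
bool voxelEntered(const float hit[3], const float dir[3],
                  int axis, int slice, int n, Voxel& out) {
    assert(n > 0 && axis >= 0 && axis < 3 && slice >= 0 && slice <= n);
    if (dir[axis] == 0.0f) return false;
    int idx[3];
    for (int a = 0; a < 3; ++a) {
        if (a == axis) {
            // Moving in +axis enters the cell after the plane, else before.
            idx[a] = (dir[a] > 0.0f) ? slice : slice - 1;
        } else {
            idx[a] = static_cast<int>(std::floor(hit[a]));
        }
        if (idx[a] < 0 || idx[a] >= n) return false;
    }
    out = {idx[0], idx[1], idx[2]};
    return true;
}
```

For n = 512 this is 1539 quads versus 134,217,728 boxes. Note that a ray crossing a voxel corner or edge will hit quads from more than one axis at nearly the same distance, so the hit shader would need to tolerate seeing the same voxel more than once.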
Does that help you see the path forward more clearly?