I’m working on a simple ray tracer using CUDA and I’ve come across this paper detailing an algorithm to construct an uniform grid on the GPU. I’ve done half of it, but now I’m stuck, and there’s no reference code to inspect.

Basically, what the algorithm does is this :

- For each primitive, count the number of cells it overlaps.
- Accumulate the values in Step 1 to compute bueffer start indices. These are used in Step 3 to write several bueffers in parallel.
- Write all cells each primitive overlaps using pairs (cell ID, primitive ID).
- Sort the pairs in Step 3 according to their cell IDs.
- For each cell ID, perform a binary search in the sorted grid data to nd bueer start indices and their sizes.

I’m stuck on step 3, especially the cell ID, which is explained like this :

The cell ID value, on the other hand, can be computed by using a local oeffset: the difference between the memory position in the resulting grid and the start position of the primitive list (the lower bound value). This local offeset indicates the nth overlapped cell ID must be written in the nth position in the primitive list. We establish a convention of ordering grid cells in ascending linear IDs in X, Y and then Z order. This allows for converting linear cell IDs to 3D cell IDs.

I’m a bit clueless on what this means. From what I understand, it’s impossible to just calculate the cell ID. Maybe I need to hittest again with my primitive?

I’ve contacted the authors, but this paper is relatively old (2009) and there’s a chance they do not respond.

Can anyone help me understand this step?