I’ve an unstructured array of vertices (3D vertices as float4), that I want to sort in a uniform grid. As I’m new to the topic I was searching the web for viable approaches, but found many conflicting solutions with big variance in code complexity. I’m hoping to find someone to give me some guidance for what approaches are ill advised, have apparent bottlenecks or where to find documentation on how to implement them (maybe within the CUDA samples or links/papers/books). Feel free to add any additional approaches you may have.
Here are the approaches I’m currently considering:
- Using index structure (basically a lookup table) to access 1D array of vertices
(current implementation)
- map coordinates to 3D position of the correct cell within the grid
- flatten 3D grid position to 1D
- create an array of structs with reference data {lookup}: each struct contains a start position and a number of vertices
- copy the two arrays to the GPU
- on GPU: access vertices by checking current Cell, then looking up the range, then accessing the 1D array of vertices for the range
This is the first and in my opinion easiest solution, with some hiccups, mainly random/unpredictable accesses, bad readability.
-
Using an array of structs
I was considering to substitute the lookup table by creating an array of structs for the grid.
Each struct would have an integer storing the number of vertices and a dynamic array with the actual vertices. Instead of searching the lookup table one could (hopefully) use the gridcell as position within that array. From what I gathered online an array of structs with dynamic memory should be possible, but I doubt it as I haven’t found any similar solutions.
Plus: Easier to read, no need for a separate lookup table, close vertices in sequential memory
Minus: Behaviour with empty cells, memory allocation more complicated -
Using cuda Arrays
This is where I found a multitude of different answers.
- Would a (flattened) 1D cuda array be any different than my 1st approach?
- Would a 3D cuda array be more suitable considering the 3D nature of my data?
- Do I have to use 1D/3D textures in combination with cuda arrays or would they only help with performance?
- Is there a difference in maximum allowed memory between a cuda array and a texture?
- how can I access parts (a gridcell) within a 3D cuda array?
- (sorting for 3D cuda array in approach 4)
- using space fitting curves
This was mentioned in another CUDA forum with barely any more information. The idea would be to use an algorithm to map the vertices’ positions to a 3D cuda array by following this curve, for example using a hilbert curve.
Assuming that I don’t care about the order of vertices within a given gridcell:
- Would this approach still offer a potential speedup?
- Would I have to set conditions to the grid dimensions for the curve to work?
- What level of complexity (approx.) would I have to anticipate?
I hope this topic isn’t completely out of place here; otherwise feel free to close it and point me to the right forum / platform.