Thanks for your reply!
Well, the access pattern will look something like this:
the first two std::vectors are like the x/y-coordinates of a point
the std::list is a list of neighbors to the current point
and in the struct there are several properties of each neighbor
At the moment, the calculation is done serially, one point after another. I want to do that in parallel with CUDA.
So… each thread should access a point and do some calculations for all the neighbors of that particular point.
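To make the access pattern a bit more concrete, here is a rough host-side sketch of what I mean. The names Neighbor, process_point and process_all are made up for this post, the weight field only stands in for the real properties, and the distance calculation is just a placeholder for the actual work:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical neighbor record; the real struct has several more properties.
struct Neighbor {
    unsigned int index;  // index of the neighboring point
    float weight;        // placeholder property (assumption)
};

// Work for one point: visit every neighbor of point i.
// In a CUDA version, this body is roughly what a single thread would run.
float process_point(std::size_t i,
                    const std::vector<float>& x,
                    const std::vector<float>& y,
                    const std::vector<std::list<Neighbor>>& neighbors) {
    float sum = 0.0f;
    for (const Neighbor& n : neighbors[i]) {
        const float dx = x[n.index] - x[i];
        const float dy = y[n.index] - y[i];
        sum += n.weight * (dx * dx + dy * dy);  // placeholder calculation
    }
    return sum;
}

// Current serial version: one point after another.
std::vector<float> process_all(const std::vector<float>& x,
                               const std::vector<float>& y,
                               const std::vector<std::list<Neighbor>>& neighbors) {
    assert(x.size() == y.size() && x.size() == neighbors.size());
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = process_point(i, x, y, neighbors);
    }
    return out;
}
```

The outer loop in process_all is the part I want to replace with one thread per point.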
I hope this helps.
But how to represent the data in CUDA?
Should I have an array of structs, each containing an array of structs, and so on?
I can’t see another option at this point. Good to hear that the “golden rule” isn’t that important on Fermi, because that’s the platform I’m on.
Something like this, for example:
unsigned int index;
Doesn’t look very satisfying…
The other thing is that I have to figure out how much memory to allocate for each struct, because not every point has the same number of neighbors… well, I think I’ll have to pad here…
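The only alternative to padding I can think of would be to store all neighbor lists back to back in one flat array and keep a second array of start offsets (similar to the CSR layout used for sparse matrices). A rough host-side sketch, assuming the neighbor struct is a plain struct without pointers; FlatLists and flatten are names I made up, not anything from CUDA:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical container: all per-point lists stored back to back.
template <typename T>
struct FlatLists {
    std::vector<T> items;               // all neighbors, point after point
    std::vector<unsigned int> offsets;  // list i is items[offsets[i]] .. items[offsets[i + 1] - 1]
};

// Copy a vector of lists into one contiguous array plus an offset table.
template <typename T>
FlatLists<T> flatten(const std::vector<std::list<T>>& lists) {
    FlatLists<T> flat;
    flat.offsets.reserve(lists.size() + 1);
    flat.offsets.push_back(0u);
    for (const std::list<T>& l : lists) {
        flat.items.insert(flat.items.end(), l.begin(), l.end());
        flat.offsets.push_back(static_cast<unsigned int>(flat.items.size()));
    }
    // One offset per point, plus a final entry marking the end of the last list.
    assert(flat.offsets.size() == lists.size() + 1);
    return flat;
}
```

Both vectors are contiguous, so items.data() and offsets.data() could each be copied to the device with a single cudaMemcpy, and thread i would loop from offsets[i] up to (but not including) offsets[i + 1]. No padding would be needed, but I don’t know how well neighboring threads would coalesce with this layout.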
Any (better) ideas?