Coalesced memory access


I’ve recently started writing my first CUDA code, so apologies if this is a bit of a noob question.

I have an N-body simulation of charged particles, each with a mass, which I originally wrote for normal CPUs. It used arrays of 3-float structs for the position and velocity data, plus an additional lookup array that held references to the particle properties, which were stored as follows:

struct ParticleDefinition {
    float mass, charge, radius;
    uint32 colourARGB32;
};

My initial idea was to use the built-in float3 type for the position and velocity data, but having read the documentation for the nBody example, it’s clear that float3 would be suboptimal and that float4 would be a lot better due to coalesced memory access.

However, my simulation has no real use for an additional float w component. It would, however, benefit from having a place to store the reference to a ParticleDefinition, since you need it in the calculation for each interaction.

My question is, does the GPU care if this extra component actually contains an integer rather than a floating-point value? In my case it would be an index into an array of ParticleDefinition.

Also, would the compiler complain if this float were reinterpreted as that integer by *(uint32*)(&value.w) ?

Sorry if this question isn’t particularly clear, it’s 5 mins to hometime :)

So, to clarify the question, in essence, what I’m asking is: can you treat a float4 as a struct { float x, y, z; uint32 w; } by casting the address of the last element (w) and then dereferencing it?

Your question is very clear.

No, the GPU won’t care that the value is not a floating-point value. You can also reinterpret the bits as an integer more conveniently (and more safely) with __float_as_int() and __int_as_float() (see the programming guide). Typecasting the way you describe is dangerous because some compilers (like gcc), with optimization on, assume that pointers of different types cannot point to the same data. I know of at least one program in which this causes a severe crash. Unions are the preferred method (with __int_as_float() and __float_as_int() used internally).
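
For what it’s worth, here is a minimal sketch of how the packing could look. It is only an illustration under some assumptions: the names lookupCharge, packIndex and chargesViaPackedIndex are made up for this example, uint32 is spelled as unsigned int, and the particle values are arbitrary test data. The kernel reads each float4 in one 16-byte load and uses __float_as_int() to recover the index stored in w; the host side uses memcpy to copy the integer's bits into the float without a pointer cast.

```cuda
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Mirrors the ParticleDefinition from the question (uint32 written as unsigned int).
struct ParticleDefinition {
    float mass, charge, radius;
    unsigned int colourARGB32;
};

// Each thread loads one float4, reinterprets the bits of .w as an integer
// index, and looks up that particle's charge.
__global__ void lookupCharge(const float4* posAndIndex,
                             const ParticleDefinition* defs,
                             float* chargeOut, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float4 p = posAndIndex[i];
    int defIndex = __float_as_int(p.w);  // bit reinterpretation, not a numeric conversion
    chargeOut[i] = defs[defIndex].charge;
}

// Host-side packing (hypothetical helper): copy the integer's bits into a float.
static float packIndex(int index)
{
    float f;
    std::memcpy(&f, &index, sizeof f);
    return f;
}

// Hypothetical driver: packs indices 0,1,0,1 into w, runs the kernel, and
// returns the looked-up charges (an empty vector if any CUDA call fails).
std::vector<float> chargesViaPackedIndex()
{
    const int n = 4;
    // Arbitrary test values, not physical constants.
    const ParticleDefinition hDefs[2] = { {1.0f, -1.0f, 0.1f, 0xFF0000FFu},
                                          {2.0f,  1.0f, 0.2f, 0xFFFF0000u} };
    float4 hPos[n];
    for (int i = 0; i < n; ++i)
        hPos[i] = make_float4(float(i), 0.0f, 0.0f, packIndex(i % 2));

    float4* dPos = nullptr;
    ParticleDefinition* dDefs = nullptr;
    float* dOut = nullptr;
    std::vector<float> result;
    if (cudaMalloc(&dPos, sizeof hPos) == cudaSuccess &&
        cudaMalloc(&dDefs, sizeof hDefs) == cudaSuccess &&
        cudaMalloc(&dOut, n * sizeof(float)) == cudaSuccess) {
        cudaMemcpy(dPos, hPos, sizeof hPos, cudaMemcpyHostToDevice);
        cudaMemcpy(dDefs, hDefs, sizeof hDefs, cudaMemcpyHostToDevice);
        lookupCharge<<<1, n>>>(dPos, dDefs, dOut, n);
        float hOut[n];
        if (cudaMemcpy(hOut, dOut, sizeof hOut, cudaMemcpyDeviceToHost) == cudaSuccess)
            result.assign(hOut, hOut + n);
    }
    cudaFree(dPos);
    cudaFree(dDefs);
    cudaFree(dOut);
    return result;
}
```

Calling chargesViaPackedIndex() from your own main() should give back the charges of definitions 0, 1, 0, 1 in that order. One thing to watch: small integers stored this way look like denormal floats, so it is probably safest never to do arithmetic on w as a float, only to pass it through the intrinsics.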

Thanks for that. Actually, I’m aware of the casting issue, especially in gcc 4.x, where if memory serves it’s illegal and it refuses to compile it. I stated it that way just to make it clear how I wanted to treat the data. I normally use anonymous unions for this sort of thing :)