Help with: Fundamentals of Accelerated Computing with CUDA C/C++


I just wrapped up the meat of the CUDA class “Fundamentals of Accelerated Computing with CUDA C/C++”, enjoyed it, and pretty much followed it. But then I got to the final, and there’s some syntax I’m really struggling with. Full disclosure: I’m a neophyte C/C++ programmer.

int bytes = nBodies * sizeof(Body);
float *buf;
buf = (float *)malloc(bytes);
Body *p = (Body *)buf;

What in the world is happening in the last two lines of code? “Body” is a defined type: a struct with six floating-point members.

I’ve been staring at this for hours. Thanks for any guidance!

From the snippet it’s not clear why the code was written this way. We want to allocate an array of nBodies elements of type Body. The allocation function malloc takes the number of bytes we want and returns a generic pointer (void *) to the allocated memory on success (or a null pointer on failure). In order to access this memory as an array of Body elements, we need a pointer of that type, that is, a Body *.

int bytes = nBodies * sizeof(Body); // number of bytes we want to allocate
Body *p = (Body *)malloc(bytes);    // convert generic pointer to pointer of desired type

Now we can refer to the Body elements of our array as p[0], p[1], and so on. In the original code, buf apparently serves to re-interpret the array of Body elements as an array of float elements (for some reason that isn’t clear from the snippet). So we could add this:

float *buf = (float *)p;

Now buf[0] through buf[5] give us access to the six floats of p[0], buf[6] through buf[11] give us access to the six floats comprising p[1], and so on.

Thank you very much. Not sure I’d have ever figured that out.