Transfer heap (dynamically allocated memory) back to host

galiciandev · January 24, 2011, 10:23pm

Now we can dynamically (de)allocate memory in the device code using malloc/free. However, I cannot find how to copy that heap BACK to the host. Any hints on which API call does the trick?

DrAnderson42 · January 25, 2011, 1:28pm

cudaMemcpy? Just copy the pointer back, then use it with cudaMemcpy. I’ve never tried it, but it should work.

galiciandev · January 25, 2011, 5:20pm

I understand, but what I’m really asking is how to copy the WHOLE heap back to the host. To use cudaMemcpy, I need a start address and the size of the memory block. I already know the size of the heap (it is configurable), but I’m missing the start address.

Doing a cudaMemcpy for every pointer returned by malloc would work, but it would also be too inefficient.
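
For reference, here is a minimal sketch of how the heap size mentioned above can be set and read back. It assumes a current toolkit, where the call is cudaDeviceSetLimit (older toolkits from the time of this thread used cudaThreadSetLimit). The helper name configure_device_heap is made up for illustration. Note that the runtime reports only the size, not a base address.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: request a device malloc heap of `requested_bytes`
// and return the size the runtime reports afterwards (0 on any CUDA error).
size_t configure_device_heap(size_t requested_bytes)
{
    // The limit has to be set before launching any kernel that uses
    // device-side malloc/free.
    if (cudaDeviceSetLimit(cudaLimitMallocHeapSize, requested_bytes) != cudaSuccess)
        return 0;

    size_t actual = 0;
    if (cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize) != cudaSuccess)
        return 0;

    return actual;  // heap size in bytes; no start address is exposed
}
```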

SPWorley · January 25, 2011, 9:30pm

You’re assuming that the device heap occupies one contiguous address range. That is almost certainly true, but it is still a dangerous assumption: a hack that depends on it can suddenly stop working when the device or driver changes.

If you really need to transfer dynamically allocated device memory to the host, a safe way is to do the mallocs yourself: allocate one big static block, then use atomic increments to manually suballocate from that block as needed. This doesn’t give you an (easy) way to free memory, but it’s quite versatile, portable and fast. And then the single-block memcpy assumption WILL be true.
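
A rough sketch of that suballocation idea, under a few assumptions: the Pool struct, pool_alloc, fill and run_pool_demo are all made-up names, allocations are rounded up to 8 bytes, and the kernel simply has each thread allocate one int. The point is that everything lives in one cudaMalloc'ed block, so one cudaMemcpy brings it all back.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical pool descriptor: one big block plus a counter of bytes handed out.
struct Pool {
    char*               base;      // start of the block obtained from cudaMalloc
    unsigned long long  capacity;  // size of the block in bytes
    unsigned long long* offset;    // device-side counter: bytes allocated so far
};

// Bump allocation: reserve `bytes` (rounded up to a multiple of 8) with one atomicAdd.
// Returns nullptr once the pool is exhausted. There is no matching free.
__device__ void* pool_alloc(Pool p, unsigned long long bytes)
{
    bytes = (bytes + 7ULL) & ~7ULL;
    unsigned long long old = atomicAdd(p.offset, bytes);
    if (old + bytes > p.capacity) return nullptr;
    return p.base + old;
}

// Each thread allocates one slot and stores its global index in it.
__global__ void fill(Pool p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int* slot = static_cast<int*>(pool_alloc(p, sizeof(int)));
    if (slot) *slot = i;
}

// Runs the kernel, copies the used part of the pool back with a single
// cudaMemcpy, and returns the sum of the stored values (-1 on a CUDA error).
long long run_pool_demo(int n)
{
    Pool p;
    p.capacity = 8ULL * n;  // exactly one 8-byte slot per thread
    if (cudaMalloc((void**)&p.base, p.capacity) != cudaSuccess) return -1;
    if (cudaMalloc((void**)&p.offset, sizeof(unsigned long long)) != cudaSuccess) return -1;
    cudaMemset(p.offset, 0, sizeof(unsigned long long));

    fill<<<(n + 255) / 256, 256>>>(p, n);
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;

    // Read back how much was used, then the whole used range in one copy.
    unsigned long long used = 0;
    cudaMemcpy(&used, p.offset, sizeof(used), cudaMemcpyDeviceToHost);
    char* host = static_cast<char*>(malloc(used));
    if (!host) return -1;
    cudaMemcpy(host, p.base, used, cudaMemcpyDeviceToHost);

    long long sum = 0;
    for (unsigned long long off = 0; off < used; off += 8)
        sum += *reinterpret_cast<int*>(host + off);

    free(host);
    cudaFree(p.base);
    cudaFree(p.offset);
    return sum;
}
```

Slots are handed out in whatever order the threads reach the atomicAdd, so the layout differs from run to run. Any pointers stored inside the block are device addresses; if the host needs to follow them, storing offsets from base instead is probably the easier route.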

In fact you can even use this kind of manual allocation with zero-copy memory, avoiding the whole step of manually copying back to the host.
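
A sketch of the zero-copy variant, again with made-up names (fill_mapped, run_zero_copy_demo) and fixed int-sized slots. One assumption worth flagging: only the pool is placed in mapped pinned host memory, while the allocation counter stays in ordinary device memory so that the atomics never go across the bus.

```cuda
#include <cuda_runtime.h>

// Each thread grabs one int-sized slot from the pool and writes its index there.
// `pool` points at mapped host memory; `counter` lives in device memory.
__global__ void fill_mapped(int* pool, unsigned int* counter,
                            unsigned int capacity, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int slot = atomicAdd(counter, 1u);
    if (slot < capacity) pool[slot] = i;
}

// Returns the sum of the values the kernel wrote, read directly on the host
// without any cudaMemcpy of the pool (-1 on a CUDA error).
long long run_zero_copy_demo(int n)
{
    // Older setups need this before the context is created; on newer ones
    // mapping is typically enabled already, so a failure here is ignored.
    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaGetLastError();  // clear any error left by the call above

    int* h_pool = nullptr;
    if (cudaHostAlloc((void**)&h_pool, n * sizeof(int), cudaHostAllocMapped) != cudaSuccess)
        return -1;

    int* d_pool = nullptr;  // device-side alias of the same memory
    if (cudaHostGetDevicePointer((void**)&d_pool, h_pool, 0) != cudaSuccess)
        return -1;

    unsigned int* d_counter = nullptr;
    if (cudaMalloc((void**)&d_counter, sizeof(unsigned int)) != cudaSuccess)
        return -1;
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    fill_mapped<<<(n + 255) / 256, 256>>>(d_pool, d_counter, (unsigned int)n, n);
    // The host must wait for the kernel before reading the mapped buffer.
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;

    long long sum = 0;
    for (int k = 0; k < n; ++k) sum += h_pool[k];

    cudaFree(d_counter);
    cudaFreeHost(h_pool);
    return sum;
}
```

Every write from the kernel travels over the bus to host memory, so this tends to suit data that is written once and read on the host rather than data the kernel revisits often.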