Debugging memory issues

Hey guys,

I’ve got a problem with allocating device memory on the GPU. I’ve got a binary tree structure with a root node pointer and two pointers to child nodes in every node. Somehow I get incorrect results, and I think it’s because the memory wasn’t allocated or copied correctly. Is there a way to debug this and inspect device variables to find the memory bug?
In emulation mode I can’t see where a pointer points to on the device.

By the way: is it possible to pass a tree structure to the device at all?

Thanks in advance.

Daniel

You have at least three options:

  1. Compile for device emulation and then run the emulated program under Valgrind.
  2. Use cuda-gdb, the CUDA port of GDB.
  3. Run on Ocelot ( http://code.google.com/p/gpuocelot/ ), a GPU emulator that detects memory errors the way Valgrind does. (I helped write this, so my opinion here is biased.)

You might also want to consider using a wrapper library like Thrust ( http://code.google.com/p/thrust/ ) for allocating memory (make everything a thrust::device_vector). Thrust lets you read elements of vectors stored in device memory directly from host code, which should allow you to inspect your data structures after each update.

And yes, it is possible to pass tree structures to the device, though it helps to lay them out in one sequential memory block so that you can copy them to the device easily. Child pointers that hold host addresses are not valid on the device, so copying node by node also means fixing up every pointer. There are some typical layout strategies for trees. For example, you can lay out the data as a sorted vector and then use a recursive binary search to determine indices into the array. This ensures that all of the data is stored contiguously, and as long as you don’t modify the data structure very often, it has very good spatial locality.
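
To make the sorted-vector layout a bit more concrete, here is a rough host-side C++ sketch (not tested on a device). The names find_in_range and find_index are made up for illustration, and int keys are an assumption; your nodes will carry other data. The "tree" is implicit: the middle element of a range acts as the root, and the middles of the two halves act as its children, so no pointers are stored at all.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: recursive binary search over the half-open
// range [lo, hi) of a sorted array. Returns the index of `key`,
// or -1 if it is not present.
int find_in_range(const int* keys, int lo, int hi, int key) {
    if (lo >= hi) return -1;               // empty range: key is absent
    int mid = lo + (hi - lo) / 2;          // implicit "root" of this subtree
    if (keys[mid] == key) return mid;
    if (key < keys[mid])
        return find_in_range(keys, lo, mid, key);      // left subtree
    return find_in_range(keys, mid + 1, hi, key);      // right subtree
}

// Hypothetical convenience wrapper for a whole vector.
// Assumes the vector is already sorted in ascending order.
int find_index(const std::vector<int>& sorted_keys, int key) {
    assert(std::is_sorted(sorted_keys.begin(), sorted_keys.end()));
    return find_in_range(sorted_keys.data(), 0,
                         static_cast<int>(sorted_keys.size()), key);
}
```

Because the lookup only needs a raw pointer and two indices, the whole array can go to the device in a single copy with no pointer fix-ups. The recursion is kept only to mirror the description above; a plain loop does the same job and may be the safer choice in device code.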