Very strange memory issue on device

Hi, I have some very strange memory issues on the device. Here is what I'm doing:

cudaMalloc a struct.
cudaMemcpy host to device to initialize the struct.
Pass the pointer returned by cudaMalloc to the kernel.
printf the contents of the struct inside the kernel.
Return to the host.
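The steps above look roughly like this, as a minimal sketch; the struct layout, member names, and kernel name are placeholders, not the original code:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical struct standing in for the one described above.
struct Params {
    int    n;
    float  scale;
    float *data;   // device pointer to one of the arrays
};

__global__ void printParams(const Params *p) {
    // Do nothing in the kernel except print the arguments, as described.
    printf("thread %d: n=%d scale=%f\n", threadIdx.x, p->n, p->scale);
}

int main() {
    Params h = {1024, 0.5f, nullptr};

    // cudaMalloc a struct.
    Params *d = nullptr;
    cudaError_t err = cudaMalloc(&d, sizeof(Params));
    if (err != cudaSuccess) { fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err)); return 1; }

    // cudaMemcpy host to device to initialize it.
    err = cudaMemcpy(d, &h, sizeof(Params), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) { fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err)); return 1; }

    // Pass the device pointer to the kernel, then return to the host.
    printParams<<<1, 32>>>(d);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```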

A number of structs are copied and passed this way, and a couple of arrays as well. However, I'm doing nothing in this kernel except printing the arguments that were passed in. No errors are reported by cudaMalloc or cudaMemcpy either.

For a small debug-sized problem, cuda-memcheck reports no errors and the kernel runs successfully. However, when I run my code on real problem-sized data, I see that some of the threads print one of the struct members with a bad value. Just one member; the other struct members seem to be fine. The kernel also doesn't complete, and I get the ever-so-helpful error message: "Error: Process didn't terminate successfully". Often I can only run the code once and then have to reboot the system! I know this looks like I'm stomping on memory, but I've double-checked that the sizes I pass to cudaMemcpy and cudaMalloc are OK.
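One thing worth noting: errors from the kernel launch itself are not reported by cudaMalloc or cudaMemcpy; they have to be queried explicitly after the launch, or they show up only as an unhelpful process failure. A hedged sketch (myKernel, grid, block, and dPtr are placeholders):

```cpp
myKernel<<<grid, block>>>(dPtr);

// Catch launch-configuration errors (bad grid/block size, too many resources, ...)
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
    fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));

// Catch errors that occur while the kernel runs (e.g. out-of-bounds access)
err = cudaDeviceSynchronize();
if (err != cudaSuccess)
    fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
```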

What is going on here? Is it possible that I am using too much memory but cudaMalloc still succeeds? How would I determine the problem?
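On the "too much memory" question: cudaMalloc does fail (with cudaErrorMemoryAllocation) when the device is out of memory, but you can also check how much is actually free before allocating. A small sketch:

```cpp
size_t freeBytes = 0, totalBytes = 0;
cudaMemGetInfo(&freeBytes, &totalBytes);
printf("device memory: %zu bytes free of %zu total\n", freeBytes, totalBytes);
```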

Have you used cuda-gdb with cuda memcheck on to verify there are no memory errors? It could also be an alignment problem.

Thanks for the reply. Yes, I have debugged the code, and that's why this was so thoroughly frustrating.

My issue turned out to be not completely cleaning the build after switching from release to debug compile options in my CMake project. The code then builds fine but has some very strange runtime behavior! Initially I did not include CUDA_BUILD_CLEAN_TARGET() in my CMakeLists, which adds a dependency on the clean target for the generated CUDA files; in theory that would have solved the problem. However, even after including it, I still saw the problem when switching from a release to a debug build. I now use two separate build directories, one release and one debug, and have not experienced the issue since.