Is it possible that an error in allocating/deallocating memory on the device could be picked up by a native host (CPU) memory checker?
The code in question has a CPU version of a set of linear algebra functions and a corresponding GPU version (where one does not already exist in CUBLAS). I do this to compare timings for operations like Cholesky factorization. I wrote all the CPU code, and for the device I wrote maybe 8 kernels covering the operations CUBLAS does not offer. The answers from both agree to 6 decimal places, so correctness is not the problem.
The funny thing is that no CPU memory errors are found during the CPU-only run (even though I wrote it C-style with malloc/free), but when I run the GPU code I get a report of a CPU memory leak, and the line of code it points to is inside a GPU kernel: just the declaration of a small shared float array of fixed size.
The reported leak is small (40 bytes), but I have seen no memory errors reported on the GPU side, so I am confused about the source of the leak.
I am using <crtdbg.h> in Visual Studio 2010 (x64), and that is what reports the leak details.