I was wondering if anybody else has experienced an issue where calling cudaFree freezes program execution? It just stops and hangs. If I run the program from within the IDE and hit the break-execution button, it appears to be doing something, but I’m not sure what; it never appears to return from cudaFree.
I’ve tried googling many times over and haven’t really found a similar issue. I wonder if this is an error specific to my environment setup in some way: a corrupted DLL file, or perhaps even a memory alignment problem.
The program is written in C++ in Visual Studio 2013 with CUDA Toolkit v7.5, multi-threaded, 64-bit. The issue only appears to show itself in release mode, or when running the .exe directly from the OS. I’ve commented out most of the code, reducing it to its simplest form, without being able to track down the problem. The problem isn’t consistent, either!
If I allocate the memory using cudaMalloc and avoid running any kernel, cudaFree works fine, but running even the simplest kernel may trigger the cudaFree hang.
The program in its complete form has been running nicely; it’s only when it gets to the cleanup() functions that it has trouble!
You could try wrapping every CUDA call in an error-checking macro, which makes it easier to detect which line causes the problem. Sometimes the error originates in a different statement from the one that appears to fail.
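A minimal sketch of the kind of error-checking macro meant here. The names CUDA_CHECK and dummyKernel are illustrative, not taken from the original program, and device 0 is an assumption:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: wraps a CUDA runtime call and reports file/line on failure.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__, cudaGetErrorString(err_));    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel, standing in for the real one.
__global__ void dummyKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2.0f * i;
}

int main() {
    const int n = 1024;
    float* d_data = nullptr;

    CUDA_CHECK(cudaSetDevice(0));  // device 0 is an assumption
    CUDA_CHECK(cudaMalloc((void**)&d_data, n * sizeof(float)));

    dummyKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors raised while the kernel runs

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

Kernel launches are asynchronous, so checking only cudaMalloc and cudaSetDevice can miss a failure that surfaces later; the two checks after the launch are the part that tends to be left out.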
Thanks for your reply. Thing is, I’m doing lots of error checking! The program doesn’t allow any kernel execution if anything fails beforehand, i.e. cudaSetDevice/cudaMalloc.
I check for cudaSuccess on all my initialization operations.
In an effort to find the problem, I’ve commented most of the code out, effectively just running one kernel. If I don’t execute a kernel but still set the CUDA device and allocate the memory, I don’t get the problem with cudaFree… it’s very odd.
I’m wondering if this is associated with how I shut down the application; it does appear to be some sort of timing issue. Sometimes (rarely) it shuts down successfully, but most of the time it doesn’t.
I have a couple of threads running that call kernel functions, and I have a sneaking suspicion this is related to killing a thread off unconditionally without being sure that the kernel launched inside it has completed. I was just testing this tonight: I inserted a lock around my execution and cleanup code, which seemed to make it behave better.
Best guess is that cudaFree waits for kernel calls to complete, but because the kernel was terminated ungracefully, it will wait forever…?
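One way to test that guess would be to stop killing the worker threads and instead let each one finish its current iteration before cleanup runs. Below is a minimal sketch of that shutdown order in plain C++ (std::thread). GpuWorker and its members are hypothetical names, a short sleep stands in for the kernel launch, and the CUDA calls are only indicated in comments so the sketch compiles without the toolkit:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker wrapper; none of these names come from the original program.
struct GpuWorker {
    std::atomic<bool> stop_requested{false};
    std::atomic<int>  iterations_done{0};
    std::thread       thread;

    void start() {
        thread = std::thread([this] {
            while (!stop_requested.load()) {
                // Real code would launch a kernel here, then wait for it
                // (e.g. cudaDeviceSynchronize) and check the returned error.
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                ++iterations_done;
            }
        });
    }

    // Ask the worker to stop, then wait until it has actually exited.
    // Returns the number of iterations the worker completed.
    int stop_and_join() {
        stop_requested.store(true);
        if (thread.joinable()) thread.join();
        // Only after join() is it certain that no launch is still in flight
        // on this thread; real code would call cudaFree() from here on.
        return iterations_done.load();
    }
};
```

The cleanup() path would then call stop_and_join() on every worker before freeing device memory, rather than terminating the threads from outside. Whether this removes the hang in the original program is untested.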