cudaThreadExit() cleanup: does it work?


The function “cudaThreadExit” does NOT seem to work for bad kernels.

Kindly look at Sadhana’s post (code) here:

In CUDA 2.1, cudaThreadExit() just does NOT work.

In 2.2, cudaThreadExit() works fine only if you “cudaFree” your earlier cudaMalloc()s after a launch failure (launch timeout due to while(1)).
Otherwise, subsequent cudaMalloc() fails and the context is un-usable.
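For anyone who wants to try this, here is a minimal sketch of the sequence I am describing. It is not Sadhana’s code: the kernel name `spin`, the `report` helper, and the flag-based loop (used in place of a bare while(1) so the compiler keeps the loop) are my own stand-ins. It assumes a GPU with the display watchdog active; on a GPU without a watchdog the kernel never returns and the program hangs. On toolkits newer than the ones discussed here, cudaThreadSynchronize() and cudaThreadExit() are deprecated in favor of cudaDeviceSynchronize() and cudaDeviceReset().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical "bad" kernel: spins until *flag becomes non-zero, which never
// happens here, so the launch should end in a watchdog timeout.
__global__ void spin(volatile int *flag)
{
    while (*flag == 0) { }
}

// Print the status of each runtime call so the failing step is visible.
static void report(const char *what, cudaError_t err)
{
    printf("%-24s : %s\n", what, cudaGetErrorString(err));
}

int main()
{
    int *d_flag = 0;
    report("cudaMalloc (before)", cudaMalloc((void **)&d_flag, sizeof(int)));
    report("cudaMemset", cudaMemset(d_flag, 0, sizeof(int)));

    spin<<<1, 1>>>(d_flag);
    // Expected to report a launch failure once the watchdog fires.
    report("cudaThreadSynchronize", cudaThreadSynchronize());

    // The step that seemed to matter on 2.2: free before tearing down.
    report("cudaFree", cudaFree(d_flag));
    report("cudaThreadExit", cudaThreadExit());

    // Is the runtime usable again?
    int *d_again = 0;
    report("cudaMalloc (after)", cudaMalloc((void **)&d_again, sizeof(int)));
    if (d_again) cudaFree(d_again);
    return 0;
}
```

Commenting out the cudaFree() line gives the other case from my description, where the later cudaMalloc() failed for me.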

Can someone explain what this function actually does, and when we should use it?
Is it a cool way of releasing all your “cudaMalloc” allocations in one shot?
Does it really help in error-recovery (bad address, Launch timeout)?

Appreciate any answer,


Best Regards,

Any info on this? (especially from NVIDIA)

Go up, up …

Now that Tim is browsing, let me up it…

Nothing more for me to say about the killer kernel thing at the moment, except that cudaThreadExit destroys a context and all its associated state. If that doesn’t fix it, file a bug (I don’t have time to track all of these things down if it requires me to hunt for additional hardware).

(It might help if I knew how things were failing after cudaThreadExit(); that sounds suspiciously like a pretty minimal bug in CUDART.)

This is not about the killer kernel. This is just about the behaviour of “cudaThreadExit()” (which is not documented in the 2.2 programming guide).

Does destroying the context mean that it cannot be used any further? (For example, can I not call cudaMalloc() after cudaThreadExit()?)

Basically, I want to use it as a “cleanup” function after a kernel error. Was “cudaThreadExit” designed for that?

Uh, it’s certainly in the reference manual. All it does is kill the context. Further cuda* calls will reinitialize the context if they require a context to exist in the first place; you can call cudaSetDevice(n) after cudaThreadExit() without issue, etc.
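A rough sketch of what that looks like in practice, assuming device 0 and a hypothetical `fill` kernel (neither comes from this thread). Since the context and all its state are destroyed, any device pointer allocated before cudaThreadExit() should be treated as stale and allocated again afterwards. Newer toolkits deprecate cudaThreadExit() in favor of cudaDeviceReset().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, only here to show the new context is usable.
__global__ void fill(int *out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

int main()
{
    const int n = 256;
    int h_buf[n];
    int *d_buf = 0;

    if (cudaSetDevice(0) != cudaSuccess) return 1;
    if (cudaMalloc((void **)&d_buf, n * sizeof(int)) != cudaSuccess) return 1;

    // Kill the context. d_buf belonged to it, so do not reuse the pointer.
    cudaError_t err = cudaThreadExit();
    printf("cudaThreadExit: %s\n", cudaGetErrorString(err));
    d_buf = 0;

    // Selecting the device again is allowed after cudaThreadExit().
    err = cudaSetDevice(0);
    printf("cudaSetDevice: %s\n", cudaGetErrorString(err));

    // This call needs a context, so the runtime creates a fresh one.
    err = cudaMalloc((void **)&d_buf, n * sizeof(int));
    printf("cudaMalloc after exit: %s\n", cudaGetErrorString(err));
    if (err != cudaSuccess) return 1;

    fill<<<(n + 127) / 128, 128>>>(d_buf, n, 7);
    err = cudaMemcpy(h_buf, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("kernel + copy back: %s\n", cudaGetErrorString(err));

    cudaFree(d_buf);
    return err == cudaSuccess ? 0 : 1;
}
```

If the cudaMalloc() after cudaThreadExit() fails in a healthy program like this one, that would be the kind of thing worth filing as a bug.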