I thought it should be possible to reuse a floating CUDA context several times from different threads in a multithreaded host application?
I had a look at the “threadMigration” example in CUDA 2.0 beta 2. It made me think that something like the following should be possible (very simplified):
host thread: create a context ‘ctx’ with cuCtxCreate; cuCtxPopCurrent(NULL);
host thread: start thread 1
thread 1: cuCtxPushCurrent(ctx); do something; cuCtxPopCurrent(NULL);
host thread: wait for thread 1 to finish
host thread: start thread 2
thread 2: cuCtxPushCurrent(ctx); do something; cuCtxPopCurrent(NULL);
host thread: wait for thread 2 to finish
But when I try this, the second call to cuCtxPushCurrent(ctx) returns CUDA_ERROR_INVALID_VALUE.
I did not call cuCtxDestroy or cuCtxDetach. (I don’t know whether this is important, but in “do something” I use cuFFT plus my own code.) Any ideas?