Sharing a CUDA context between threads?

I have one thread (A) running CPU work and another thread (B) handling the CUDA work (so CUDA is initialized in B).

At some point, thread A needs to know whether an asynchronous device-to-host memcpy issued by thread B has finished. Thread B has other things going on, so I don't want B to check that on A's behalf.

How can A check whether the async copy has completed? If I call cuStreamQuery from A, I get a CUDA_ERROR_INVALID_CONTEXT error.