Question about CUDA 2 floating contexts: context switch between host threads

Nice to see that the ‘threadMigration’ bug in beta2 has been resolved in the release. Switching the context between host threads now seems to work fine, but I have another question about the same floating-context situation: when I switch the context from host thread A to host thread B, will the CUDA context wait for all pending operations to finish before proceeding in thread B?
Many thanks in advance. :)

I see no reason why it wouldn’t. Moving the context moves everything in it, including all streams, so any kernels still running when the context is moved should complete normally before any other kernels are run on the same stream.

Got it, thanks.