Hello. I'm trying to write a program that uses multiple host threads and multiple CUDA streams.
The pipeline I want is described below:
I want the main thread to leave its host code as soon as the device-to-host (D2H) copy in the main thread finishes.
So I called a CUDA synchronization function, specifically cudaStreamSynchronize(stream 1),
but this call appears to block not only stream 1 but also stream 2.
My question is: do CUDA synchronization functions block all host code, regardless of which thread calls them?
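
A minimal sketch of the setup described above might look like the following. It is an illustration under stated assumptions, not the actual code: the `scale` kernel, the `run_pipeline` and `run_two_streams` helpers, the buffer size, and the use of pinned host memory with `cudaMemcpyAsync` are all hypothetical. Each host thread drives its own stream and calls `cudaStreamSynchronize` only on that stream.

```cuda
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Hypothetical kernel standing in for the real work.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// H2D copy -> kernel -> D2H copy on one stream, then wait for that stream.
void run_pipeline(cudaStream_t stream, float* host, float* dev, int n, float factor) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    CHECK(cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, stream));
    scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, n, factor);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, stream));
    // Blocks the calling host thread until the work queued on `stream` is done.
    CHECK(cudaStreamSynchronize(stream));
}

// Runs stream 1 on the calling thread and stream 2 on a second host thread.
// Returns true if both buffers hold the expected values afterwards.
bool run_two_streams() {
    const int n = 1 << 20;  // assumed buffer size
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *h1, *h2, *d1, *d2;
    cudaStream_t s1, s2;

    // Pinned host memory, so the async copies can actually run asynchronously.
    CHECK(cudaMallocHost((void**)&h1, bytes));
    CHECK(cudaMallocHost((void**)&h2, bytes));
    CHECK(cudaMalloc((void**)&d1, bytes));
    CHECK(cudaMalloc((void**)&d2, bytes));
    CHECK(cudaStreamCreate(&s1));
    CHECK(cudaStreamCreate(&s2));
    for (int i = 0; i < n; ++i) { h1[i] = 1.0f; h2[i] = 1.0f; }

    std::thread worker(run_pipeline, s2, h2, d2, n, 3.0f);  // second host thread, stream 2
    run_pipeline(s1, h1, d1, n, 2.0f);                      // main thread, stream 1
    std::printf("main thread: stream 1 finished\n");
    worker.join();

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (h1[i] != 2.0f || h2[i] != 3.0f) { ok = false; break; }
    }

    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaStreamDestroy(s2));
    CHECK(cudaFree(d1));
    CHECK(cudaFree(d2));
    CHECK(cudaFreeHost(h1));
    CHECK(cudaFreeHost(h2));
    return ok;
}

int main() {
    bool ok = run_two_streams();
    std::printf("results %s\n", ok ? "match" : "do not match");
    return ok ? 0 : 1;
}
```

The sketch uses pinned buffers and explicitly created streams on purpose. With pageable host memory or the legacy default stream, copies and launches can synchronize implicitly, which may look as if one stream were blocking the other.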