Streams and Texture Memory

I am trying to split some computations over two GPUs, so I want to use two streams (one per GPU).

I grasp the basic idea of streams; my problem is when I add texture memory into the mix. Is there a stream-aware, asynchronous version of cudaBindTextureToArray? Or will it work regardless?

I do know that the best way is always to “play with it” and see what happens, but when the code is long enough and you start to get errors, it is nice to know that they are not all coming from here…

Thank you,

Streams are intended for asynchronous operations within a single context (i.e., on a single GPU). They don’t really have anything to do with multi-GPU programming, because the CUDA model uses a different context per GPU.
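To illustrate the point above, here is a minimal sketch (names and dimensions are made up): a stream is tied to the context it was created in, so a stream created while device 0 is current cannot be used from device 1’s context in the classic one-context-per-GPU model.

```cuda
cudaStream_t s;

cudaSetDevice(0);          // subsequent calls target device 0's context
cudaStreamCreate(&s);      // s belongs to device 0's context

cudaSetDevice(1);          // now in device 1's context
// Launching into s from here is invalid, because s does not exist
// in this context:
// kernel<<<grid, block, 0, s>>>(...);   // would fail
```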

Thank you for your answer, I see your point.

However, in the SDK example Simple Multi-GPU you can see how to do something like:

for ( int gpu = 0; gpu < number_of_GPUs; ++gpu )
{
		cudaSetDevice ( gpu );
		cudaStreamCreate ( &stream[gpu] );
}

and then

for ( int gpu = 0; gpu < number_of_GPUs; ++gpu )
{
		cudaSetDevice ( gpu );

		cudaMemcpyAsync( d_Input[gpu], h_Input[gpu], size, cudaMemcpyHostToDevice, stream[gpu] );

		kernel<<<BLOCK_N, THREAD_N, 0, stream[gpu]>>>(d_Output[gpu], d_Input[gpu], N);

		cudaMemcpyAsync( h_Output[gpu], d_Output[gpu], size, cudaMemcpyDeviceToHost, stream[gpu] );
}


This achieves something like what I want to do. Now I want to add texture memory into the mess… Any hints?

Right, but the only part of the code you have posted that is specifically related to multi-GPU is the cudaSetDevice call. The rest is only about asynchronous operations within a single GPU context. If you have a texture, it will be defined in each context you initialize on each GPU. Think of anything after a cudaSetDevice call as occurring in the scope of a given context. This includes global memory symbols, textures, kernels, and anything else with context-level scope.
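Concretely, that might look like the sketch below (not from the SDK sample; the texture reference and the per-GPU array names are assumptions). Because a texture reference has context scope, you bind it once per device, after the corresponding cudaSetDevice call. Note that cudaBindTextureToArray is a cheap host-side call with no stream parameter; it does not need an asynchronous variant.

```cuda
// Texture reference (pre-CUDA-5 texture reference API); a separate
// instance of this symbol exists in each GPU's context.
texture<float, cudaTextureType2D, cudaReadModeElementType> tex;

for ( int gpu = 0; gpu < number_of_GPUs; ++gpu )
{
		cudaSetDevice ( gpu );                        // enter this GPU's context

		// Binds the copy of 'tex' that lives in the current context to a
		// cudaArray previously allocated on this device (d_array[gpu] is
		// a hypothetical per-GPU array).
		cudaBindTextureToArray ( tex, d_array[gpu] );
}
```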

Thank you avidday, that really solved my doubts!