I have a few questions about multi-GPU programming:
- What is the driver API equivalent of “cudaSetDevice()”?
- What is the overhead of calling “cuCtxSetContext”? Is it an expensive call?
- When page-locked mapped memory is allocated, can all GPU devices access it directly, or only the device that allocated it?
Thanks.