I have some questions about stream handling in cuBLAS, mostly related to cublasPointerMode_t. I wasn’t able to find the answers in the docs.
The most important question is: can I use multiple streams to queue asynchronous calls using host pointers?
For example, can I do this safely:
cublasSetPointerMode(h, CUBLAS_POINTER_MODE_HOST); cublasSetStream(h, stream2); cublasSrotg(h, &a, &b, &c, &s); cublasSrot(h, n, x, 1, y, 1, &c, &s);
If I do this, is cublasrot guaranteed to use the values of c and s calculated by cublasrotg, or not?
If it is possible, then how? It would mean the kernels are somehow able to read host memory, so cublasrot would either need to issue a host-to-device copy or allocate pinned memory. Both operations synchronize implicitly, and that would defeat the whole idea of cuBLAS being asynchronous in CUBLAS_POINTER_MODE_HOST.
Another, simpler question: can I use two library handles to simplify working with separate streams?
Thanks in advance