cuBLAS/cuDNN with managed memory

Hi there!

Just want to confirm: if I'm calling a CUDA library function, like cublasSgemm or cudnnConvolutionForward, the data the operand pointers refer to doesn't have to fit in device memory?

(Assuming I have managed memory support, Turing arch.)
For example, if I call cublasSgemm(A, B, C) and I have 4 GB of device memory, can A, B, and C each be larger than 4 GB?
Will the compute kernel know to swap pages in and out during the computation so that the result is still correct, with extra computation time being the only price to pay?

I assume managed memory is a full virtual memory system.

Thanks for your help!

Yes, generally speaking, that is true.
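To make the pattern concrete, here is a minimal sketch of the scenario being asked about. It allocates the three GEMM operands with cudaMallocManaged and passes them straight to cublasSgemm; on a Pascal-or-newer GPU (which includes Turing) on Linux, such allocations can exceed physical device memory, and the driver migrates pages on demand while the kernel runs. The small dimension `n` below is just for illustration; the point is only that the managed pointers are used exactly like device pointers.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main() {
    // Illustrative size. In an oversubscription scenario, n would be chosen
    // so that the three n x n matrices together exceed device memory; the
    // API calls below are identical either way.
    const int n = 1024;
    const size_t bytes = (size_t)n * n * sizeof(float);

    float *A, *B, *C;
    // Managed allocations: accessible from host and device, and on
    // Pascal+ GPUs allowed to be larger than device memory in total.
    cudaMallocManaged(&A, bytes);
    cudaMallocManaged(&B, bytes);
    cudaMallocManaged(&C, bytes);

    // Managed pointers are directly dereferenceable on the host.
    for (size_t i = 0; i < (size_t)n * n; ++i) {
        A[i] = 1.0f;
        B[i] = 1.0f;
        C[i] = 0.0f;
    }

    cublasHandle_t handle;
    cublasCreate(&handle);

    const float alpha = 1.0f, beta = 0.0f;
    // C = alpha * A * B + beta * C. The managed pointers are passed
    // exactly as device pointers would be; any needed page migration
    // happens under the hood during the kernel.
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n,
                &alpha, A, n,
                B, n,
                &beta, C, n);

    cudaDeviceSynchronize();
    printf("C[0] = %f\n", C[0]);  // each entry is a dot product of n ones, so n

    cublasDestroy(handle);
    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return 0;
}
```

One practical note: demand paging is correct but can be slow when the working set thrashes, so hinting the driver with cudaMemPrefetchAsync (or cudaMemAdvise) before the call can substantially reduce the page-fault overhead.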