Does anybody know the difference between using the CUBLAS alloc/copy functions and the standard CUDA ones?
Is there a difference between doing:
cublasAlloc(....)
cublasSetMatrix(...)
....
cublasGetMatrix(...)
and
cudaMalloc(...)
cudaMemcpy(..., cudaMemcpyHostToDevice)
....
cudaMemcpy(..., cudaMemcpyDeviceToHost)
The documentation states that cublasAlloc is simply a wrapper for cudaMalloc, but it doesn’t say anything about the SetMatrix/SetVector/GetMatrix/GetVector functions.
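To make the question concrete: the one visible difference in the signatures is that cublasSetMatrix(rows, cols, elemSize, A, lda, B, ldb) takes leading dimensions, while a plain cudaMemcpy only takes a byte count. Below is a minimal host-only sketch of the copy pattern I understand those arguments to describe (column-major, one column at a time). It is only a model of the layout semantics, not CUBLAS's actual implementation; the names model_set_matrix and model_flat_copy are made up for illustration, and memcpy stands in for the real host-to-device transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only model of the copy pattern described by
 * cublasSetMatrix(rows, cols, elemSize, A, lda, B, ldb).
 * Both matrices are column-major; column j of the source starts at
 * element offset j*lda, column j of the destination at j*ldb.
 * This is an assumption about the semantics, not library code. */
void model_set_matrix(int rows, int cols, size_t elem_size,
                      const void *a, int lda, void *b, int ldb)
{
    assert(rows >= 0 && cols >= 0 && lda >= rows && ldb >= rows);

    const unsigned char *src = a;
    unsigned char *dst = b;

    for (int j = 0; j < cols; ++j) {
        /* Copy the first `rows` elements of column j. */
        memcpy(dst + (size_t)j * ldb * elem_size,
               src + (size_t)j * lda * elem_size,
               (size_t)rows * elem_size);
    }
}

/* Model of a single contiguous copy, standing in for one cudaMemcpy call. */
void model_flat_copy(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
}
```

If that reading is right, then for a full matrix with lda == ldb == rows the two routes move exactly the same bytes, and the only case where they differ is copying a sub-matrix (lda > rows), where a single flat copy would not be enough and something like cudaMemcpy2D would be the closer runtime equivalent. Whether the CUBLAS functions do anything beyond that is the part I can't find documented.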