I have a really simple question: does the cuBLAS library support zero-copy (mapped host memory) pointers as parameters?
A simple example would be:
    double *a, *aT;
    cudaHostAlloc(&a, sizeof(double) * m * n, cudaHostAllocMapped);
    cudaHostAlloc(&aT, sizeof(double) * m * n, cudaHostAllocMapped);

    // FILL a ...

    // a -> aT
    double one = 1.0;
    double zero = 0.0;
    cublasDgeam(..., &one, a, m, &zero, a, m, aT, n);
I tried this, but my aT matrix contains only zeros, and I'm wondering whether this is simply because I'm running Windows 7 in non-TCC mode. It doesn't work even when I pass the device pointers returned by cudaPointerGetAttributes().
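
For reference, this is what I expect aT to contain. It is only a minimal host-side sketch, assuming a is an m x n matrix in column-major order with leading dimension m (the layout cuBLAS uses) and aT is n x m with leading dimension n. The helper name transpose_colmajor is my own for illustration and is not part of cuBLAS:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuBLAS function): host-side reference for the
// transpose that the cublasDgeam call above is meant to produce.
//   a  : m x n, column-major, leading dimension m
//   aT : n x m, column-major, leading dimension n (returned)
std::vector<double> transpose_colmajor(const std::vector<double>& a,
                                       std::size_t m, std::size_t n) {
    std::vector<double> aT(m * n);
    for (std::size_t j = 0; j < n; ++j) {        // column index of a
        for (std::size_t i = 0; i < m; ++i) {    // row index of a
            aT[j + i * n] = a[i + j * m];        // aT(j, i) = a(i, j)
        }
    }
    return aT;
}
```

Comparing the mapped aT buffer against this reference after the cuBLAS call has completed (for example after a cudaDeviceSynchronize()) should show whether the buffer was written at all or only written incorrectly.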