Function cublasSetVector

Hi,

I'm using the CUBLAS functions, cublasSetVector and cublasSdot among them, and although my program is simple, the results make no sense at all.

The guide says:

I don't understand what incx and incy are. Are they numbers of bytes? Since I have two float vectors, what should they be?

I am pretty sure the errors I get are caused by that.

Thank you

incx and incy are strides measured in elements (here, floats), not in bytes. For example, if you have a vector in CPU memory defined as float cpuvector[n], and a matrix and a vector in GPU memory defined as

float *gpuvector, *gpumatrix;
cublasAlloc( n, sizeof(float), (void**)&gpuvector );
cublasAlloc( n*n, sizeof(float), (void**)&gpumatrix );

Then

cublasSetVector( n, sizeof(float), cpuvector, 1, gpuvector, 1 ) copies cpuvector to gpuvector
cublasSetVector( n, sizeof(float), cpuvector, 1, gpumatrix, 1 ) copies cpuvector to the first column of gpumatrix
cublasSetVector( n, sizeof(float), cpuvector, 1, gpumatrix, n ) copies cpuvector to the first row of gpumatrix
cublasSdot( n, gpumatrix, 1, gpumatrix, n ) returns dot product of the first row and the first column of the matrix.

It's the same as with any other BLAS, say the one in the Intel MKL.
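To make the stride semantics concrete, here is a small CPU-only sketch in plain C. strided_copy and strided_dot are hypothetical stand-ins written for illustration, not CUBLAS functions; they only mimic the indexing that the calls above imply, assuming an n-by-n matrix stored in column-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU stand-in for the copy done by cublasSetVector:
   element i of x lands at y[i * incy]. Strides count elements, not bytes. */
void strided_copy(int n, const float *x, int incx, float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i)
        y[(size_t)i * incy] = x[(size_t)i * incx];
}

/* Hypothetical CPU stand-in for cublasSdot, same stride convention. */
float strided_dot(int n, const float *x, int incx, const float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += x[(size_t)i * incx] * y[(size_t)i * incy];
    return sum;
}
```

With a 3x3 column-major matrix m, strided_copy(3, v, 1, m, 1) fills the first column (m[0], m[1], m[2]), while strided_copy(3, v, 1, m, 3) fills the first row (m[0], m[3], m[6]).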

Thanks for your answer, vvolkov.

I tried that, but I still can't make it work right…

float MyGPUProdScal(float *CPUVect1, float *CPUVect2, int elementN){
    float *GPUVect1, *GPUVect2, psGPU;

    cublasInit();
    cublasAlloc(elementN, sizeof(float), (void **) &GPUVect1);
    cublasAlloc(elementN, sizeof(float), (void **) &GPUVect2);
    //psGPU = CPUVect1[0];
    cublasSetVector(elementN, sizeof(float), CPUVect1, 1, GPUVect1, 1);
    cublasSetVector(elementN, sizeof(float), CPUVect2, 1, GPUVect2, 1);
    //psGPU = GPUVect1[0];
    psGPU = cublasSdot(elementN, GPUVect1, sizeof(float), GPUVect2, sizeof(float));
    return(psGPU);
}

I noticed that something strange happens when I try to read what’s inside the vectors.

Actually, the call that seems to "fail" is

cublasAlloc(elementN, sizeof(float), (void **) &GPUVect1);

After this call, &GPUVect1 is an address (0x0012fc08), and at that location there is another address (0x001102a400) rather than a value!

At that second address there is an expression that is impossible to evaluate…

Why does this happen?

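That's right, you can't read GPU memory this way. The pointer filled in by cublasAlloc is a device address, which the CPU cannot dereference; to inspect the data, copy it back to host memory with cublasGetVector first.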

OK, I solved it. The stride in the Sdot function is 1, not sizeof(float), and I wasn't doing the cast from double to float correctly. :)

Thanks guys
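For anyone who hits the same problem, a rough sketch of why a byte-sized stride goes wrong. last_index_touched is a hypothetical helper made up for this post, not part of CUBLAS; it just computes the highest element index that a BLAS-style routine reads for a given length and stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: highest element index read when a routine walks
   n elements with stride inc (both counted in elements). */
size_t last_index_touched(int n, int inc)
{
    assert(n > 0 && inc > 0);
    return (size_t)(n - 1) * (size_t)inc;
}

/* Returns 1 if every access stays inside an array of len elements. */
int stride_stays_in_bounds(int n, int inc, size_t len)
{
    return last_index_touched(n, inc) < len;
}
```

With elementN = 4, a stride of 1 ends at index 3, while a stride of 4 (the usual value of sizeof(float)) ends at index 12, well outside a 4-element allocation.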

I still don't understand how to compute, for example, the dot product of the third row of one matrix and the fifth column of another, if that can be done with the cublasSdot function at all. Thank you in advance.

Yes you can :-) Assuming column-major storage (which is what CUBLAS uses in general), all elements of a column are stored consecutively. The elements of a row on the other hand are strided by the leading dimension (i.e. the number of rows of the matrix). So in your example one would pass to SDOT:

x     start address of the third row of matrix A

incx  LDA (leading dimension of A)

y     start address of the fifth column of matrix B

incy  1
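The argument list above can be checked on the CPU with a short sketch. strided_dot and row_dot_col are hypothetical helpers for illustration only (not CUBLAS calls); the sketch assumes column-major storage and zero-based row and column indices, so the third row is index 2 and the fifth column is index 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU stand-in for SDOT; strides are counted in elements. */
float strided_dot(int n, const float *x, int incx, const float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    float sum = 0.0f;
    for (int i = 0; i < n; ++i)
        sum += x[(size_t)i * incx] * y[(size_t)i * incy];
    return sum;
}

/* Dot product of row `row` of A with column `col` of B (zero-based).
   A has leading dimension lda, B has leading dimension ldb, and n is
   the number of columns of A (equal to the number of rows of B). */
float row_dot_col(int n, const float *A, int lda, int row,
                  const float *B, int ldb, int col)
{
    const float *x = A + row;               /* start of the row in A    */
    const float *y = B + (size_t)col * ldb; /* start of the column in B */
    return strided_dot(n, x, lda, y, 1);
}
```

On the GPU the same pointer arithmetic would apply to the device pointers passed to cublasSdot: the row start is A + 2 with stride LDA, and the column start is B + 4*LDB with stride 1.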

Thanks a lot! I can see now that it’s really easy to use…