I am trying to write a simple program that multiplies the first column of a matrix A by a scalar alpha using the cuBLAS library. I know how to allocate memory and set up the matrix on the GPU, but I am not sure which arguments to pass to cublasSscal.

I have transferred an m-by-n matrix A on the CPU to the matrix gA on the GPU using

cublasSetMatrix(m,n,sizeof(float),A,m,(void*)gA,m);

I then try to call cublasSscal as follows:

(void)cublasSscal(m,alpha,gA,1);
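
To make clear what I expect that call to do, here is a plain C sketch that runs on the CPU (it is not cuBLAS itself). It assumes the usual BLAS convention that sscal multiplies x[i*incx] by alpha for i = 0..n-1, and that the matrix is stored column-major with leading dimension m, so the first column is the first m contiguous elements. The names ref_sscal and scale_column are hypothetical helpers made up for this illustration.

```c
#include <stddef.h>

/* CPU reference for the BLAS sscal convention (an assumption stated above):
   x[i * incx] *= alpha for i = 0 .. n-1. Hypothetical helper, not a cuBLAS call. */
void ref_sscal(int n, float alpha, float *x, int incx)
{
    for (int i = 0; i < n; ++i)
        x[(size_t)i * (size_t)incx] *= alpha;
}

/* Scale column j of a column-major matrix with m rows and leading dimension lda.
   Column j starts at offset j * lda and its m entries are contiguous (stride 1). */
void scale_column(int m, float alpha, float *A, int lda, int j)
{
    ref_sscal(m, alpha, A + (size_t)j * (size_t)lda, 1);
}
```

With j = 0 and lda = m this touches exactly the elements that n = m, incx = 1 would touch in the call above, which is why I expected the first column of gA to change.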

When I run the code, the output is exactly the same as the input. How should I rectify this?

Thanks in advance.