cublasScal

I am trying to write a simple program that multiplies the first column of a matrix A by a scalar alpha using a CUBLAS library function. I know how to allocate the memory and set up the matrix on the GPU, but I am not sure which arguments to pass to cublasSscal.

I have transferred an m-by-n matrix A from the CPU to a matrix gA on the GPU using

cublasSetMatrix(m,n,sizeof(float),A,m,(void*)gA,m);

and then attempt to call cublasSscal in the following way:

(void)cublasSscal(m,alpha,gA,1);

When I run the code, the output is exactly the same as the input. How should I rectify this?

Thanks in advance.

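[codebox]#include "mex.h"
#include <cublas.h>
#include <cuda_runtime.h>

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int m, n;
    int dims0[2];
    float *A, *gA, *ans;
    float alpha = 4;

    /* Dimensions of the single-precision input matrix. */
    m = mxGetM(prhs[0]);
    n = mxGetN(prhs[0]);
    dims0[0] = m;
    dims0[1] = n;

    /* Output matrix with the same size as the input. */
    plhs[0] = mxCreateNumericArray(2, dims0, mxSINGLE_CLASS, mxREAL);
    ans = (float*)mxGetData(plhs[0]);
    A = (float*)mxGetData(prhs[0]);

    cublasInit();
    cublasAlloc(m*n, sizeof(float), (void**)&gA);
    cudaMemset(gA, 0, m*n*sizeof(float));

    /* Host -> device. */
    cublasSetMatrix(m, n, sizeof(float), A, m, (void*)gA, m);
    /* Device -> host output. */
    cublasGetMatrix(m, n, sizeof(float), gA, m, ans, m);
    /* Scale the first m elements of gA (the first column) by alpha. */
    (void)cublasSscal(m, alpha, gA, 1);

    cublasFree(gA);
}[/codebox]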

[EDIT]

How embarrassing: I was calling the library function after copying the matrix back to the host. Silly me, I never noticed it.

The problem is solved now.

You may want to call cublasGetMatrix after the cublasSscal call.
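For anyone who finds this thread later, here is a minimal CPU-only sketch of why the arguments in the original call were already fine and only the order mattered. It assumes column-major storage (which is what MATLAB and CUBLAS use) and the standard BLAS meaning of scal: n elements, starting at x, spaced incx apart, are multiplied by alpha. The names ref_sscal, scale_column and scale_first_column are made up for illustration and are not CUBLAS functions; malloc and memcpy merely stand in for cublasAlloc, cublasSetMatrix and cublasGetMatrix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical CPU stand-in for the BLAS scal operation:
   x[i * incx] *= alpha for i = 0 .. n-1. Not a CUBLAS function. */
void ref_sscal(int n, float alpha, float *x, int incx)
{
    assert(n >= 0 && incx > 0);
    for (int i = 0; i < n; ++i)
        x[(size_t)i * incx] *= alpha;
}

/* Scale column j (0-based) of a column-major matrix with m rows and
   leading dimension lda. Column j starts at a + j * lda and its
   elements are contiguous, so the stride is 1. */
void scale_column(int m, int lda, int j, float alpha, float *a)
{
    assert(lda >= m && j >= 0);
    ref_sscal(m, alpha, a + (size_t)j * lda, 1);
}

/* Mimics the upload / scale / download sequence from the question.
   If copy_back_first is nonzero, the result is copied out before the
   scaling happens, which reproduces the "output equals input" symptom.
   Returns 0 on success, -1 if the allocation fails. */
int scale_first_column(int m, int n, float alpha,
                       const float *in, float *out, int copy_back_first)
{
    size_t bytes = (size_t)m * n * sizeof(float);
    float *dev = malloc(bytes);            /* stands in for cublasAlloc */
    if (dev == NULL)
        return -1;

    memcpy(dev, in, bytes);                /* stands in for cublasSetMatrix */
    if (copy_back_first)
        memcpy(out, dev, bytes);           /* copy-back issued too early */

    scale_column(m, m, 0, alpha, dev);     /* same as ref_sscal(m, alpha, dev, 1) */

    if (!copy_back_first)
        memcpy(out, dev, bytes);           /* copy-back after scaling */

    free(dev);
    return 0;
}
```

With the legacy CUBLAS API the same reasoning should carry over: cublasSscal(m, alpha, gA, 1) scales the first column, and passing gA + j*m instead of gA would presumably scale column j, as long as the leading dimension is m as in the cublasSetMatrix call above. The cublasGetMatrix call just has to come after the cublasSscal call.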