CUBLAS problems

Are there any examples of using level 2 CUBLAS routines?

I am trying an example using cublasStrsv. The cublas call seems to succeed; however, when I request the results vector, I get the error CUBLAS_STATUS_MAPPING_ERROR.

I am a bit unsure about the required format of the input matrix. In the example below, I am just filling in every element of an N x N matrix, so the number of subdiagonals and superdiagonals should be N-1.

Any ideas why my test fails?

#define TEST_SIZE 10
void testStrsv(void)
{
float * a;
float * b;
float alpha = 1.0;
float beta = 1.0;
float * x;
float * y;
float * devPtrX;
float * devPtrY;
float * devPtrA;

//y = alpha * A * x + (beta * y)
printf("\nStrsv\n");

a = (float *)malloc(TEST_SIZE * TEST_SIZE * sizeof(a));
x = (float *)malloc(TEST_SIZE * sizeof(x));
y = (float *)malloc(TEST_SIZE * sizeof(y));

stat = cublasAlloc(TEST_SIZE * TEST_SIZE,sizeof(a),(void**)&devPtrA);
if(stat != CUBLAS_STATUS_SUCCESS)
	printf("1. Error %d\n",stat);

stat = cublasAlloc(TEST_SIZE,sizeof(x),(void**)&devPtrY);
if(stat != CUBLAS_STATUS_SUCCESS)
	printf("2. Error %d\n",stat);

stat = cublasAlloc(TEST_SIZE,sizeof(y),(void**)&devPtrX);
if(stat != CUBLAS_STATUS_SUCCESS)
	printf("3. Error %d\n",stat);


for (int i=0;i<TEST_SIZE;i++)
{
	for (int j=0;j<TEST_SIZE;j++)
		a[IDX2F(i,j,TEST_SIZE)] = i * TEST_SIZE + j + 1;
}
for (int i=0;i<TEST_SIZE;i++)
{
	x[i] = 1;
	y[i] = 1;
}

stat = cublasSetVector(TEST_SIZE,sizeof(a),x,1,devPtrX,1);
stat = cublasSetVector(TEST_SIZE,sizeof(a),y,1,devPtrY,1);
stat = cublasSetMatrix(TEST_SIZE,TEST_SIZE,sizeof(a),a,1,devPtrA,1);

cublasSgbmv('N', //trans - 'N' = operation is A, 'T' = operation is A'
			TEST_SIZE, //number of rows
			TEST_SIZE, //number of columns
			TEST_SIZE -1, //number of sub diagonals of A
			TEST_SIZE -1, //number of super diagonals of A
			alpha,
			a,
			TEST_SIZE + TEST_SIZE  - 1, //leading dimension of A
			x,
			1,
			beta,
			y,
			1);
stat = cublasGetError();			
if(stat != CUBLAS_STATUS_SUCCESS)
	printf("4. Error %d\n",stat);

stat = cublasGetVector(TEST_SIZE,sizeof(y),devPtrY,1,y,1);
if(stat != CUBLAS_STATUS_SUCCESS)
	printf("5. Error %d\n",stat);

for (int i=0;i<TEST_SIZE;i++)
	printf("%f ",y[i]);

cublasFree(devPtrA);
cublasFree(devPtrX);
cublasFree(devPtrY);
free(a);
free(x);
free(y);

}

Are you running a 64-bit OS? If so,

sizeof(a) = 8
while
sizeof(float)=4

because a is a pointer, not a primitive.
The same applies to a number of sizeof() calculations in your code.
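
To make the point concrete, here is a minimal host-only C sketch. It is not the original poster's code, and the helper names `float_buffer_bytes`, `alloc_floats` and `print_sizes` are made up for illustration. It sizes buffers by the element type rather than by the pointer. On a typical 32-bit build a pointer and a float are both 4 bytes, so the mistake can go unnoticed there.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Bytes needed for n floats: sizeof is applied to the element type. */
size_t float_buffer_bytes(size_t n)
{
    return n * sizeof(float);
}

/* Allocate n floats. sizeof(*p) is the size of one element;
   sizeof(p) would be the size of the pointer itself. */
float *alloc_floats(size_t n)
{
    assert(n > 0);
    float *p = malloc(n * sizeof(*p));
    return p; /* caller checks for NULL and calls free() */
}

/* Print the three sizes side by side for the current platform. */
void print_sizes(void)
{
    float *a = NULL;
    printf("sizeof(a) = %zu, sizeof(*a) = %zu, sizeof(float) = %zu\n",
           sizeof(a), sizeof(*a), sizeof(float));
}
```

The same idea would apply to the element-size argument of cublasAlloc, cublasSetVector and friends: pass sizeof(float) (or sizeof(*a)), not sizeof(a).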

N.

Yes, sorry, the code is scrappy as it was just meant to be a super quick test for me. It's 32-bit.

I believe the problem is to do with the cublasSgbmv() call, as without this, I can get the results vector y fine. After further investigation and reading about the general-band format, I was wondering if the problem lies with this. But I haven’t found any examples yet of cublas working with matrices and the level 2 functions. I was hoping they would work for general matrices.

I don’t understand why you are using SGBMV() for this sort of problem. Unless you really have a banded matrix (like a tridiagonal matrix, for example), you should probably try using SGEMV(), which is the general matrix-vector product.
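
For reference, the two routines differ mainly in how the matrix is stored. The sketch below is a host-only C reference, not cuBLAS code, and the function names `ref_sgemv`, `pack_band` and `ref_sgbmv` are hypothetical. It assumes the standard BLAS general-band layout, in which element (i,j) of the matrix (0-based) lives at row ku+i-j of column j in an array with at least kl+ku+1 rows.

```c
#include <assert.h>

/* 0-based column-major index with leading dimension ld. */
#define IDX2C(i, j, ld) (((j) * (ld)) + (i))

/* Dense reference: y = alpha*A*x + beta*y, A is m x n, column-major. */
void ref_sgemv(int m, int n, float alpha, const float *A, int lda,
               const float *x, float beta, float *y)
{
    assert(lda >= m);
    for (int i = 0; i < m; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++)
            sum += A[IDX2C(i, j, lda)] * x[j];
        y[i] = alpha * sum + beta * y[i];
    }
}

/* Pack the band of dense A (kl sub-, ku super-diagonals) into AB.
   Entries of AB outside the band are left untouched. */
void pack_band(int m, int n, int kl, int ku,
               const float *A, int lda, float *AB, int ldab)
{
    assert(ldab >= kl + ku + 1);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++)
            if (i - j <= kl && j - i <= ku)
                AB[IDX2C(ku + i - j, j, ldab)] = A[IDX2C(i, j, lda)];
}

/* Band reference: same product, but reading A from band storage AB. */
void ref_sgbmv(int m, int n, int kl, int ku, float alpha,
               const float *AB, int ldab,
               const float *x, float beta, float *y)
{
    assert(ldab >= kl + ku + 1);
    for (int i = 0; i < m; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++)
            if (i - j <= kl && j - i <= ku)
                sum += AB[IDX2C(ku + i - j, j, ldab)] * x[j];
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Under that assumption, a full N x N matrix with kl = ku = N-1 needs a (2N-1) x N band array that has been packed explicitly. Handing the ordinary dense array to a band routine with a leading dimension of 2N-1 does not describe the same matrix, which is one reason the dense routine is the simpler choice here.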

I think you’re misinterpreting the values required for the leading dimension.

e.g.

stat = cublasSetMatrix(TEST_SIZE,TEST_SIZE,sizeof(a),a,1,devPtrA,1)

should probably be

stat = cublasSetMatrix(TEST_SIZE,TEST_SIZE,sizeof(a),a,TEST_SIZE,devPtrA,TEST_SIZE)
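
As a rough illustration of why the leading dimension matters, the following host-only sketch mimics a column-major copy that takes lda and ldb arguments in the same positions as cublasSetMatrix. The function `copy_matrix` is hypothetical and is not how the library is implemented; the real routine may simply reject a bad value. It does show that with a leading dimension of 1, column j starts only j elements after column 0, so the columns overlap.

```c
#include <assert.h>

/* Host-side analogue of a column-major matrix copy.
   Element (i,j) is read from A[j*lda + i] and written to B[j*ldb + i]. */
void copy_matrix(int rows, int cols,
                 const float *A, int lda, float *B, int ldb)
{
    assert(lda >= 1 && ldb >= rows);
    for (int j = 0; j < cols; j++)
        for (int i = 0; i < rows; i++)
            B[j * ldb + i] = A[j * lda + i];
}
```

With a 3 x 3 source holding 0..8 in column-major order, copying with lda = 3 reproduces the matrix, while lda = 1 reads element (0,1) from offset 1 instead of offset 3.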

N.

Thanks Nico. That is definitely a bug in my code. But unfortunately, I still get the same problem after calling cublasSgbmv().

You’re also using a Fortran based indexing scheme in your C code.

Use

#define IDX2C(i,j,ld) (((j)*(ld))+(i))

instead of IDX2F
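
A small host-only sketch of the difference follows. The IDX2F definition is an assumption, taken to be the usual 1-based macro from the cuBLAS documentation since the poster's own definition is not shown, and `fill_matrix` is a hypothetical helper. With 0-based loop counters, the 1-based macro yields a negative offset for the first element.

```c
#include <assert.h>

/* 0-based column-major index, as suggested above. */
#define IDX2C(i, j, ld) (((j) * (ld)) + (i))

/* 1-based (Fortran-style) variant; assumed definition. */
#define IDX2F(i, j, ld) ((((j) - 1) * (ld)) + ((i) - 1))

/* Fill an n x n column-major matrix with the same values as the
   test in the question, using 0-based indices throughout. */
void fill_matrix(float *a, int n)
{
    assert(n > 0);
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            a[IDX2C(i, j, n)] = (float)(i * n + j + 1);
}
```

For ld = 10, IDX2C(0,0,10) is 0 whereas IDX2F(0,0,10) is -11, so a loop that starts at i = j = 0 and uses IDX2F writes before the start of the array.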

N.

Thanks, that was well spotted. That was just a typo, though, from when I had been trying 1-based indexing: I had altered the define to be 0-based before realising there was already one called IDX2C. Unfortunately, I still get the same result. I just can’t seem to work out why I cannot get the vector results after making the cublas call. I also tried sgbmv but I get the same result.

I was just picking the first one in the manual to get familiar with how the calls work. I would have thought that by using the correct parameters with any matrix I could get sgbmv to work. I tried a similar example using sgbmv and I still get CUBLAS_STATUS_MAPPING_ERROR when trying to call cublasGetVector afterwards.

You’re also not sending device pointers in the cublas call, but host pointers. There are too many mistakes here; I suggest that you try to understand the simpleCublas example in the CUDA SDK first.

N.