CUBLAS question


I have a simple C function that computes the inner product of part of a column of one rectangular matrix with the corresponding part of a column of another rectangular matrix.

The code in C is simply as follows:


/* Inner product of rows l..u-1 of column i of a with column j of b */
double inner_mat_col( int l, int u, int i, int j, double **a, double **b )
{
    double s = 0.0;

    for (; l < u; ++l) s += a[l][i] * b[l][j];

    return s;
}


The code is very simple: I iterate over rows l to u-1 of column i in a and column j in b, accumulating the products of the corresponding elements.

Is there a simple way to do the equivalent in CUBLAS? I have been tinkering with CUDA for a few months, but I have no experience with CUBLAS and would be really grateful for some pointers.

Thanks for any help you can give me.


I think you should use cublasDdot with incx = 1 and incy = 1, because the matrices are column-major, with

x = &a[l], y = &b[l], and n = u - l, where a and b here point to the start of column i and column j respectively.
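
To make the pointer arithmetic concrete, here is a minimal host-side sketch. It assumes each matrix is one flat array in column-major order with a leading dimension (lda, ldb), so that a column is contiguous in memory. The names lda, ldb, ddot_ref and inner_mat_col_colmajor are illustrative and do not appear in the thread, and ddot_ref is only a plain C stand-in that follows the BLAS ddot argument order; it is not the cuBLAS routine.

```c
#include <stddef.h>

/* Plain C stand-in for a BLAS-style ddot (positive increments only). */
double ddot_ref(int n, const double *x, int incx,
                const double *y, int incy)
{
    double s = 0.0;
    for (int k = 0; k < n; ++k)
        s += x[(size_t)k * incx] * y[(size_t)k * incy];
    return s;
}

/* Assumed layout: element (row, col) of A is a[row + col * lda]. */
double inner_mat_col_colmajor(int l, int u, int i, int j,
                              const double *a, int lda,
                              const double *b, int ldb)
{
    const double *x = a + (size_t)i * lda + l;  /* &A(l, i) */
    const double *y = b + (size_t)j * ldb + l;  /* &B(l, j) */
    int n = u - l;                              /* rows l .. u-1 */
    return ddot_ref(n, x, 1, y, 1);             /* unit stride within a column */
}
```

If the matrices are instead arrays of row pointers, as the a[l][i] indexing in the question suggests, a column is not contiguous and this arithmetic does not apply directly. For a flat row-major array with row length ld, the same column could be walked with an increment of ld instead of 1.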

x and y have to be device pointers (allocated through cublasAlloc or cudaMalloc).

Wow. That was fast! Thanks for that. I will give it a shot.

I also posted the question in the development forum after I realized this might not be the best place for it, so sorry for the duplicate.

Many thanks!