Vector and matrix multiplication

Hello Boys,

I am wondering how to do a matrix * vector multiplication. The point is that there are some restrictions on the minimum number of threads, and as we know, a vector has only one column.

This is how I do it now: I want to multiply A * x, where A is a matrix [32 x 32] and x is a vector [32 x 1], with block size 16. It only works if I pad the vector with zeros to x = [32 x 32], where the first column holds the actual values and everything else is 0. This way I waste a lot of memory.

Any advice?

Thank you for your help.


Can you not just use CUBLAS for this? I could be wrong, but I think one of the BLAS level 2 routines will do this for you.
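For what it's worth, here is a rough sketch of what that could look like with the level 2 routine cublasSgemv (y = alpha * A * x + beta * y) from the cuBLAS v2 API. The vector is passed as a plain array of n floats, so no zero padding to [32 x 32] is involved. The function name matVecCublas and the choice of a square, row-major host matrix are my own assumptions for illustration, not something from the original post.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper (name and layout are assumptions of this sketch):
// computes y = A * x on the GPU, where A is n x n in row-major order on the
// host and x, y are plain arrays of n floats. Returns true on success.
bool matVecCublas(const float* A, const float* x, float* y, int n)
{
    assert(A && x && y && n > 0);

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) return false;

    const size_t matBytes = sizeof(float) * n * n;
    const size_t vecBytes = sizeof(float) * n;

    // Device buffers: the vector only needs n floats, not n * n.
    float *dA = nullptr, *dx = nullptr, *dy = nullptr;
    bool ok = cudaMalloc(&dA, matBytes) == cudaSuccess
           && cudaMalloc(&dx, vecBytes) == cudaSuccess
           && cudaMalloc(&dy, vecBytes) == cudaSuccess;

    ok = ok && cudaMemcpy(dA, A, matBytes, cudaMemcpyHostToDevice) == cudaSuccess
            && cudaMemcpy(dx, x, vecBytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // cuBLAS assumes column-major storage. A row-major A looks like A^T to
    // cuBLAS, so CUBLAS_OP_T is requested to get A * x back.
    const float alpha = 1.0f;
    const float beta  = 0.0f;  // with beta == 0 the old contents of dy are ignored
    ok = ok && cublasSgemv(handle, CUBLAS_OP_T, n, n, &alpha,
                           dA, n,   // matrix and its leading dimension
                           dx, 1,   // x with stride 1
                           &beta,
                           dy, 1)   // y with stride 1
                   == CUBLAS_STATUS_SUCCESS;

    // Blocking copy back to the host; it runs after the gemv has finished.
    ok = ok && cudaMemcpy(y, dy, vecBytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dA);
    cudaFree(dx);
    cudaFree(dy);
    cublasDestroy(handle);
    return ok;
}
```

You would call it as matVecCublas(A, x, y, 32) with host arrays of 32*32, 32 and 32 floats, and link against cuBLAS (for example nvcc file.cu -lcublas). If your matrix is already stored column-major, CUBLAS_OP_N should be the right flag instead of CUBLAS_OP_T.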