Help me! Simple kernel for the product of a matrix and a vector

I’m implementing a kernel function that simply multiplies one matrix by one vector.
However, I’m confused about how to implement it in a good way.

I would like to multiply each row of the matrix element-wise by the vector.

matrix ->
a b c d
e f g h

vector ->
1 2 3 4

result ->
a*1 b*2 c*3 d*4
e*1 f*2 g*3 h*4

I always get confused about how to choose the grid and block dimensions well…

If I use
width = 1024
height = 2400
vector size = 1024,
then how should I set up the grid and block sizes?
(I’m using a Titan X.)

Could anyone explain this to me?
Thank you!

Regarding blocks, grids, see e.g
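
To make the grid and block question concrete, here is a minimal sketch for the sizes in the question (width = 1024, height = 2400). It assumes the matrix is stored row-major in a flat float array and that one thread per element is acceptable. The names scaleColumns, ceilDiv and scaleColumnsOnGpu are made up for this example, the 32 x 8 block shape is just one reasonable choice, and error checking of the CUDA calls is left out for brevity.

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>
#include <vector>

// One thread per matrix element: out[row][col] = in[row][col] * vec[col].
// Assumption: the matrix is stored row-major in a flat array.
__global__ void scaleColumns(const float* in, const float* vec, float* out,
                             int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    // Guard against threads that fall outside the matrix when the
    // dimensions are not multiples of the block size.
    if (col < width && row < height) {
        out[row * width + col] = in[row * width + col] * vec[col];
    }
}

// Round-up integer division: number of blocks needed to cover n elements.
inline int ceilDiv(int n, int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical host wrapper: copy to the GPU, launch, copy the result back.
std::vector<float> scaleColumnsOnGpu(const std::vector<float>& matrix,
                                     const std::vector<float>& vec,
                                     int width, int height)
{
    assert(matrix.size() == static_cast<std::size_t>(width) * height);
    assert(vec.size() == static_cast<std::size_t>(width));

    const std::size_t matBytes = matrix.size() * sizeof(float);
    const std::size_t vecBytes = vec.size() * sizeof(float);
    float *dIn = nullptr, *dVec = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, matBytes);
    cudaMalloc(&dVec, vecBytes);
    cudaMalloc(&dOut, matBytes);
    cudaMemcpy(dIn, matrix.data(), matBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dVec, vec.data(), vecBytes, cudaMemcpyHostToDevice);

    // 32 x 8 = 256 threads per block; threadIdx.x runs along a row so that
    // neighbouring threads touch neighbouring memory addresses.
    dim3 block(32, 8);
    // For width = 1024 and height = 2400 this gives a 32 x 300 grid.
    dim3 grid(ceilDiv(width, block.x), ceilDiv(height, block.y));
    scaleColumns<<<grid, block>>>(dIn, dVec, dOut, width, height);

    std::vector<float> out(matrix.size());
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(out.data(), dOut, matBytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dVec);
    cudaFree(dOut);
    return out;
}
```

The idea is to pick the block shape first (a multiple of 32 threads, commonly 128 to 256 in total) and then derive the grid from the problem size with a round-up division, so the same code keeps working if the width or height changes.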

For matrix-vector multiplication, I would use the cuBLAS library.
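
As a rough sketch of the cuBLAS route: the example in the question (a*1 b*2 c*3 d*4) scales each column by one vector element rather than summing along the row, which corresponds to cublasSdgmm. If the summed matrix-vector product is what is wanted, cublasSgemv would be the call to look at instead. The wrapper name scaleColumnsWithCublas is made up here, the matrix is assumed to be row-major float data, and only the cuBLAS status is checked. Link with -lcublas.

```cuda
#include <cassert>
#include <cstddef>
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical wrapper: out[row][col] = matrix[row][col] * vec[col] via cuBLAS.
std::vector<float> scaleColumnsWithCublas(const std::vector<float>& matrix,
                                          const std::vector<float>& vec,
                                          int width, int height)
{
    assert(matrix.size() == static_cast<std::size_t>(width) * height);
    assert(vec.size() == static_cast<std::size_t>(width));

    const std::size_t matBytes = matrix.size() * sizeof(float);
    const std::size_t vecBytes = vec.size() * sizeof(float);
    float *dA = nullptr, *dX = nullptr, *dC = nullptr;
    cudaMalloc(&dA, matBytes);
    cudaMalloc(&dX, vecBytes);
    cudaMalloc(&dC, matBytes);
    cudaMemcpy(dA, matrix.data(), matBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dX, vec.data(), vecBytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasStatus_t status = cublasCreate(&handle);
    assert(status == CUBLAS_STATUS_SUCCESS);

    // cuBLAS is column-major, so it sees the row-major (height x width) data
    // as a (width x height) matrix, i.e. the transpose. Scaling column c of
    // the original equals scaling row c of that view: C = diag(x) * A,
    // which is CUBLAS_SIDE_LEFT with m = width, n = height, lda = ldc = width.
    status = cublasSdgmm(handle, CUBLAS_SIDE_LEFT, width, height,
                         dA, width, dX, 1, dC, width);
    assert(status == CUBLAS_STATUS_SUCCESS);

    std::vector<float> out(matrix.size());
    // This blocking copy also waits for the cuBLAS call to finish.
    cudaMemcpy(out.data(), dC, matBytes, cudaMemcpyDeviceToHost);

    cublasDestroy(handle);
    cudaFree(dA);
    cudaFree(dX);
    cudaFree(dC);
    return out;
}
```

With the library call there is no grid or block size to choose at all, since cuBLAS picks its own launch configuration internally.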


Thank you for your reply!