Help me! Simple kernel for multiplying a matrix by a vector

Hi,
I’m implementing a kernel function that simply multiplies one matrix by one vector.
However, I’m confused about how to implement it well.

I would like to multiply each row of the matrix by the vector, element by element.

if
matrix →
a b c d
e f g h

vector →
1 2 3 4

result
a*1 b*2 c*3 d*4
e*1 f*2 g*3 h*4

I’m always confused about how to choose good grid and block dimensions…

If I use
width = 1024
height = 2400
vector size = 1024,
then how should I set up the grid and block sizes?
(I’m using a Titan X.)

Could anyone explain this for me?
Thank you!

Regarding blocks and grids, see e.g. nvidia - Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation) - Stack Overflow
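
For the sizes in the question, one possible setup is a 2D launch with one thread per matrix element. This is only a minimal sketch, not a tuned implementation. It assumes the matrix is stored row-major as floats, and the kernel name scaleColumns and the 32 x 8 block shape are illustrative choices, not something from the question. With width = 1024 and height = 2400, that block shape gives a grid of 32 x 300 blocks of 256 threads each, which is within the per-block limit of 1024 threads.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel name. Each thread computes one output element:
// out[row][col] = mat[row][col] * vec[col], with row-major storage assumed.
__global__ void scaleColumns(const float* mat, const float* vec, float* out,
                             int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col < width && row < height) {  // guard for sizes that do not divide evenly
        int idx = row * width + col;
        out[idx] = mat[idx] * vec[col];
    }
}

int main()
{
    const int width = 1024, height = 2400;  // sizes from the question
    const size_t count = static_cast<size_t>(width) * height;

    // Test data (assumed): a matrix of ones and the vector 1, 2, 3, ...
    std::vector<float> hMat(count, 1.0f), hVec(width), hOut(count);
    for (int i = 0; i < width; ++i) hVec[i] = static_cast<float>(i + 1);

    float *dMat = nullptr, *dVec = nullptr, *dOut = nullptr;
    cudaMalloc(&dMat, count * sizeof(float));
    cudaMalloc(&dVec, width * sizeof(float));
    cudaMalloc(&dOut, count * sizeof(float));
    cudaMemcpy(dMat, hMat.data(), count * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dVec, hVec.data(), width * sizeof(float), cudaMemcpyHostToDevice);

    // 32 threads along x so a warp reads consecutive columns of one row.
    dim3 block(32, 8);
    // Round up so the grid covers the whole matrix.
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    scaleColumns<<<grid, block>>>(dMat, dVec, dOut, width, height);

    cudaError_t err = cudaGetLastError();  // launch errors
    if (err == cudaSuccess)                // the copy also waits for the kernel
        err = cudaMemcpy(hOut.data(), dOut, count * sizeof(float),
                         cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // With a matrix of ones, every row of the result should equal the vector.
    bool ok = true;
    for (size_t i = 0; i < count && ok; ++i)
        ok = (hOut[i] == hVec[i % width]);
    printf("%s\n", ok ? "result matches" : "mismatch");

    cudaFree(dMat); cudaFree(dVec); cudaFree(dOut);
    return ok ? 0 : 1;
}
```

The x dimension of the block is mapped to columns so that neighbouring threads touch neighbouring memory addresses. Other block shapes such as 16 x 16 would also cover the matrix, as long as the grid is rounded up in the same way.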

For matrix-vector multiplication, I would use the cuBLAS library.
See c++ - CUDA/CUBLAS Matrix-Vector Multiplication - Stack Overflow
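
Note that the example in the question scales each column of the matrix by the matching vector entry, which is not the same as a matrix-vector product (that would return a vector with one entry per row, e.g. via cublasSgemv). If the elementwise scaling is really what is wanted, cublasSdgmm looks like a fit. The following is a rough sketch under these assumptions: the matrix is stored row-major as floats, and the test data is made up for illustration. Because cuBLAS is column-major, the row-major height x width matrix is seen as its transpose (width x height), so the vector is applied from the left.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main()
{
    const int width = 1024, height = 2400;  // sizes from the question
    const size_t count = static_cast<size_t>(width) * height;

    // Test data (assumed): a matrix of ones and the vector 1, 2, 3, ...
    std::vector<float> hMat(count, 1.0f), hVec(width), hOut(count);
    for (int i = 0; i < width; ++i) hVec[i] = static_cast<float>(i + 1);

    float *dMat = nullptr, *dVec = nullptr, *dOut = nullptr;
    cudaMalloc(&dMat, count * sizeof(float));
    cudaMalloc(&dVec, width * sizeof(float));
    cudaMalloc(&dOut, count * sizeof(float));
    cudaMemcpy(dMat, hMat.data(), count * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dVec, hVec.data(), width * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        printf("cublasCreate failed\n");
        return 1;
    }

    // Row-major (height x width) memory is a column-major (width x height)
    // matrix, i.e. the transpose. Scaling the columns of the original matrix
    // equals scaling the rows of the transpose: out^T = diag(vec) * mat^T.
    cublasStatus_t st = cublasSdgmm(handle, CUBLAS_SIDE_LEFT,
                                    width, height,   // m, n of the column-major view
                                    dMat, width,     // A and its leading dimension
                                    dVec, 1,         // x (length m) and its stride
                                    dOut, width);    // C and its leading dimension
    if (st != CUBLAS_STATUS_SUCCESS) {
        printf("cublasSdgmm failed\n");
        return 1;
    }

    // The blocking copy waits for the cuBLAS call on the default stream.
    cudaMemcpy(hOut.data(), dOut, count * sizeof(float), cudaMemcpyDeviceToHost);

    // With a matrix of ones, every row of the result should equal the vector.
    bool ok = true;
    for (size_t i = 0; i < count && ok; ++i)
        ok = (hOut[i] == hVec[i % width]);
    printf("%s\n", ok ? "result matches" : "mismatch");

    cublasDestroy(handle);
    cudaFree(dMat); cudaFree(dVec); cudaFree(dOut);
    return ok ? 0 : 1;
}
```

This would be built with something like nvcc file.cu -lcublas. With the library call there is no grid or block size to choose, since cuBLAS picks the launch configuration internally.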

HannesF99

Thank you for your reply!