Matrix column in shared memory

Mihrimah · February 22, 2017, 12:40pm

Hi!
I’m new to CUDA programming and I’m trying to store each column of a matrix (10000 by 5) in shared memory. I wrote this:
__shared__ int *col;
col[threadIdx.x] = mat_dev[threadIdx.x][blockIdx.x];
Then I realized that the number of rows is larger than the number of threads per block. So how can I store each column slice by slice?
Thank you.

cbuchner1 · February 22, 2017, 10:47pm

What you declared is a pointer that lives in shared memory. The pointer could be pointing to anything: global memory, shared memory, or local memory.

In your code snippet you don’t initialize the pointer value, so you’ll just write to a random memory location, which typically results in an unspecified launch failure (the GPU equivalent of a segfault).
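To illustrate the difference, here is a minimal sketch that declares an actual array in shared memory instead of a pointer. It makes several assumptions that are not in the original post: the matrix is stored as a flat row-major array (so `mat[row * N_COLS + col]` replaces `mat_dev[row][col]`), the block size is fixed at 256 threads (`TILE`), and the kernel name `stage_first_slice` and the test data are made up. It only handles the first `TILE` rows of each column.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N_ROWS 10000
#define N_COLS 5
#define TILE   256   // threads per block (assumed value)

// One block per column. Each block stages the first TILE rows of its
// column in shared memory, then writes them back in reverse order.
__global__ void stage_first_slice(const int *mat, int *out)
{
    __shared__ int col[TILE];   // an actual array in shared memory, not a pointer

    int t = threadIdx.x;
    col[t] = mat[t * N_COLS + blockIdx.x];   // row t, column blockIdx.x
    __syncthreads();                         // the whole slice is loaded past this point

    // Read an element written by a different thread of the same block.
    out[blockIdx.x * TILE + t] = col[TILE - 1 - t];
}

int main()
{
    int *mat, *out;
    cudaMallocManaged(&mat, N_ROWS * N_COLS * sizeof(int));
    cudaMallocManaged(&out, N_COLS * TILE * sizeof(int));

    for (int r = 0; r < N_ROWS; ++r)
        for (int c = 0; c < N_COLS; ++c)
            mat[r * N_COLS + c] = r * 10 + c;   // made-up test data

    // The block size must equal TILE, since the kernel indexes col[] with it.
    stage_first_slice<<<N_COLS, TILE>>>(mat, out);
    cudaDeviceSynchronize();

    // out[c * TILE] holds row TILE-1 of column c.
    for (int c = 0; c < N_COLS; ++c)
        printf("column %d: first output element = %d\n", c, out[c * TILE]);

    cudaFree(mat);
    cudaFree(out);
    return 0;
}
```

Because the array size is a compile-time constant here, the launch needs no extra shared-memory argument. Error checking on the CUDA calls is left out to keep the sketch short.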

Mihrimah · February 23, 2017, 10:33am

Thank you.
Now I’m trying to use a dynamic array, but I still can’t solve the problem of the number of rows being larger than the number of threads per block.
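One common way to handle a column that is longer than the block is to loop over it in slices of `blockDim.x` rows, reusing the same shared buffer for each slice. The sketch below is only an illustration under assumptions not stated in the thread: a flat row-major matrix, one block per column, a power-of-two block size, and a column sum as the (hypothetical) work done on each slice. The kernel name `column_sums` is made up. The shared array is dynamic, so its size is passed as the third launch parameter.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One block per column. The column is processed slice by slice, each slice
// holding blockDim.x rows staged in dynamically sized shared memory.
__global__ void column_sums(const int *mat, int *sums, int n_rows, int n_cols)
{
    extern __shared__ int col[];   // size given at launch: blockDim.x * sizeof(int)

    int c = blockIdx.x;
    int total = 0;                 // only thread 0's copy is used

    // Every thread runs the same number of iterations, so the barriers are safe.
    for (int base = 0; base < n_rows; base += blockDim.x) {
        int row = base + threadIdx.x;
        // Pad the last, partial slice with zeros.
        col[threadIdx.x] = (row < n_rows) ? mat[row * n_cols + c] : 0;
        __syncthreads();

        // Tree reduction over the slice; assumes blockDim.x is a power of two.
        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (threadIdx.x < stride)
                col[threadIdx.x] += col[threadIdx.x + stride];
            __syncthreads();
        }

        if (threadIdx.x == 0)
            total += col[0];
        __syncthreads();           // finish with this slice before col is reused
    }

    if (threadIdx.x == 0)
        sums[c] = total;
}

int main()
{
    const int n_rows = 10000, n_cols = 5, threads = 256;   // threads: assumed value

    int *mat, *sums;
    cudaMallocManaged(&mat, n_rows * n_cols * sizeof(int));
    cudaMallocManaged(&sums, n_cols * sizeof(int));

    for (int r = 0; r < n_rows; ++r)
        for (int c = 0; c < n_cols; ++c)
            mat[r * n_cols + c] = c + 1;   // made-up test data

    // Third launch parameter: bytes of dynamic shared memory per block.
    column_sums<<<n_cols, threads, threads * sizeof(int)>>>(mat, sums, n_rows, n_cols);
    cudaDeviceSynchronize();

    for (int c = 0; c < n_cols; ++c)
        printf("column %d sum = %d\n", c, sums[c]);

    cudaFree(mat);
    cudaFree(sums);
    return 0;
}
```

With this test data every entry of column c equals c + 1, so each printed sum should be 10000 * (c + 1). The same slice loop works with a statically sized `__shared__` array; the dynamic form just lets the block size be chosen at launch time.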