Problem with matrix of std::vector

Hello! I have a problem with a matrix defined as std::vector< std::vector<float> >.

The matrix is laid out linearly in memory, but between one row and the next there is padding (a number of memory positions holding arbitrary values, normally 0.0).

In the kernel, the code that computes the element index (col) is:

x = threadIdx.x + blockIdx.x * ROWS_BY_BLOCK;

y = threadIdx.y + blockIdx.y * COLS_BY_BLOCK;

if (x == 0) col = y;

else col = (x * NUM_COLS) + padding[x-1] + y;

The first row has no padding before it, but the following rows do.

padding[x-1] is the accumulated padding of the previous row.
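
The index computation above can be sketched on the host as a small C++ function, which makes it easier to check against known values. This is only an illustration: the name flat_index and the accumulated_padding vector are assumptions, standing in for the kernel's col expression and the padding array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the kernel's index computation.
// accumulated_padding[x - 1] is the accumulated padding (in elements)
// that has to be skipped before reaching row x.
std::size_t flat_index(std::size_t x, std::size_t y, std::size_t num_cols,
                       const std::vector<std::size_t>& accumulated_padding) {
    assert(y < num_cols);
    if (x == 0) {
        return y;  // the first row has no padding in front of it
    }
    assert(x - 1 < accumulated_padding.size());
    return x * num_cols + accumulated_padding[x - 1] + y;
}
```

For example, with 3 columns and accumulated padding values {2, 5}, element (1, 0) maps to index 3 + 2 + 0 = 5 and element (2, 2) maps to 6 + 5 + 2 = 13.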

This works fine. The problems are:

1.- In some cases, for example a matrix of 2 rows and 3000 columns, the padding before the second row is very large. When I copy the matrix from host to GPU, I have to copy the padding memory too, so the memory use grows enormously.

2.- In some cases, calculating the padding array is computationally very expensive, because I have to walk through every memory position to detect where a row changes and count the padding (the padding is not constant between rows), and I have to do this for all rows.

The first solution is to use two loops to copy from the std::vector< std::vector<float> > into a structure like float[…][…] or float ** (these structures have no padding), but I don't want to do that because of the time it takes.

I've tried the capacity method of the STL, but it doesn't report this padding.

I use memcpy to copy memory from host to device and cudaMalloc to reserve linear memory on the GPU.

What is happening?

How can I solve it?

Best regards, Francisco - Spain.

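std::vector<float> guarantees that you get a contiguous block of floats, but for your vector of vectors there is no such guarantee. Each inner vector allocates its own buffer separately, so the rows are not guaranteed to sit next to each other.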
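
Since each row is contiguous on its own, one option is to copy row by row rather than element by element. The following is a minimal sketch, assuming all rows have the same length; flatten_rows is a hypothetical name, not something from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs a vector of vectors into one contiguous
// buffer with no gaps, using one bulk copy per row.
std::vector<float> flatten_rows(const std::vector<std::vector<float>>& matrix) {
    if (matrix.empty()) {
        return {};
    }
    const std::size_t num_cols = matrix[0].size();
    std::vector<float> flat(matrix.size() * num_cols);
    for (std::size_t r = 0; r < matrix.size(); ++r) {
        // Assumption: the matrix is rectangular (every row has num_cols elements).
        assert(matrix[r].size() == num_cols);
        std::copy(matrix[r].begin(), matrix[r].end(),
                  flat.begin() + static_cast<std::ptrdiff_t>(r * num_cols));
    }
    return flat;
}
```

The resulting buffer has exactly rows * columns elements, so it could be sent to the device in a single transfer, and the kernel index would reduce to x * NUM_COLS + y without any padding table.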

You might, instead, use something like

std::vector<float> matrix_mem(M*N, 0.0f);

std::vector<float*> rows;

for (int i = 0; i < M; ++i) rows.push_back(&matrix_mem[i*N]);

Then you can use rows[i][j] to access element (i, j), and, at the same time, your memory will be laid out as you expect…
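
A self-contained version of this idea might look like the sketch below. The FlatMatrix struct and its at() accessor are illustrative names introduced here, not part of the original suggestion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: one contiguous block of floats plus a table of
// row pointers, so rows[i][j] works while the data stays gap-free.
struct FlatMatrix {
    std::size_t cols;
    std::vector<float> storage;  // M * N floats, contiguous
    std::vector<float*> rows;    // rows[i] points at the start of row i

    FlatMatrix(std::size_t m, std::size_t n) : cols(n), storage(m * n, 0.0f) {
        rows.reserve(m);
        for (std::size_t i = 0; i < m; ++i) {
            rows.push_back(storage.data() + i * n);
        }
    }

    // Copying would leave the row pointers aimed at the source's storage,
    // so copies are disabled in this sketch.
    FlatMatrix(const FlatMatrix&) = delete;
    FlatMatrix& operator=(const FlatMatrix&) = delete;

    // Bounds-checked access to element (i, j).
    float& at(std::size_t i, std::size_t j) {
        assert(i < rows.size() && j < cols);
        return rows[i][j];
    }
};
```

With this layout, storage.data() is a single block of M * N floats, and consecutive row pointers are exactly N elements apart.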

-dn

Thanks for your fast reply. The problem is that I can't do that, because I receive this matrix structure from another application and I can't change its definition. I also don't want to use a loop to copy the values into a vector.

Do you know of another solution?

Best Regards,