# Problem with 2D block

Hello,

I’m trying to store each block’s x and y in a 2D array and return them to the host. My code is:

``````
#include <iostream>
#include <cstdio>

#define N 95

__global__ void add(int *ch) {
    *(ch + blockIdx.x * 2) = blockIdx.x;
    *(ch + (blockIdx.x * 2) + 1) = blockIdx.y;
}
int main(void) {
    int h_ch[N][2];
    int *dev_ch;

    cudaMalloc((void**)&dev_ch, sizeof(int[N][2]));

    dim3 numBlocks(N, N);

    add<<<numBlocks, 1>>>(dev_ch);

    cudaMemcpy(h_ch, dev_ch, sizeof(int[N][2]), cudaMemcpyDeviceToHost);

    for (int i = 0; i < N; i++) {
        printf("%d-%d\n", h_ch[i][0], h_ch[i][1]);
    }

    cudaFree(dev_ch);

    return 0;
}
``````

The result of this code is:

``````
cuda_test.exe
0-94
1-94
2-94
3-94
4-94
5-94
6-94
7-94
8-94
9-94
10-94
11-94
.
.
.
92-94
93-94
94-94
``````

As can be seen, x changes as expected, but y is 94 in every row. What is wrong? Is there a bug in the code, or have I misunderstood the concept of grids/blocks/threads?

So you have an NxN grid of thread blocks that try to store into an Nx2 array of results? Something tells me that’s not going to work well …

Yes, exactly. What is the problem?

There are NxN = 95*95 = 9025 thread blocks. You want to store each block’s [x,y] coordinates in an array. How many elements does that array need? I would claim 9025 elements, because that’s how many blocks there are. How large an array does the code provide?
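The mismatch can be checked without a GPU. The host-only C++ sketch below replays the original kernel’s two stores for every block of an N x N grid and counts how many blocks write into each row of the N x 2 array. The names `WriteStats` and `replay_original_indexing` are made up for illustration, and the sequential loop order is an assumption: on a real device the order in which blocks execute is unspecified, so which y survives in a row is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: result of replaying the kernel's stores on the host.
struct WriteStats {
    std::vector<int> data;             // final contents, n rows of [x, y]
    std::vector<int> writers_per_row;  // how many blocks stored into each row
};

// Replays the stores of the original kernel for an n x n grid:
//   ch[blockIdx.x * 2]     = blockIdx.x;
//   ch[blockIdx.x * 2 + 1] = blockIdx.y;
// Blocks are visited sequentially here (an assumption, not GPU behavior).
WriteStats replay_original_indexing(int n) {
    WriteStats s{std::vector<int>(static_cast<std::size_t>(n) * 2, -1),
                 std::vector<int>(static_cast<std::size_t>(n), 0)};
    for (int by = 0; by < n; ++by) {      // plays the role of blockIdx.y
        for (int bx = 0; bx < n; ++bx) {  // plays the role of blockIdx.x
            const std::size_t base = static_cast<std::size_t>(bx) * 2;
            assert(base + 1 < s.data.size());
            s.data[base] = bx;
            s.data[base + 1] = by;        // overwrites the y left by earlier blocks
            ++s.writers_per_row[bx];
        }
    }
    return s;
}
```

With n = 95, every row receives stores from 95 different blocks, and under this loop order the y left in each row is 94, which is consistent with the output posted above.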

So I should allocate sizeof(int[N*N][2]) bytes? And how should I calculate the index for storing each pair?

Hint: Much can be learned by (1) thinking and (2) experimenting. I am absolutely certain you can figure this out.

https://en.wikipedia.org/wiki/Think_(IBM)

I think the correct index calculation is:

``````
int idx = (blockIdx.x * N) + (blockIdx.y * 2);

*(ch + idx) = blockIdx.x;
*(ch + ++idx) = blockIdx.y;
``````

I also allocated sizeof(int[N*N][2]) bytes. However, it still didn’t work, so I changed my code to see how x and y vary: x runs from 0 to 94, with each value repeated 94 times, but y runs only from 51 to 94, with each value repeated 94 times.
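One way to experiment with an index formula like the one proposed above is to replay it on the host and count the slots that receive more than one store. The sketch below does that in plain C++. `candidate_index` and `count_contested_slots` are hypothetical helper names, and the loops stand in for the blocks of the grid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The index proposed above: idx = blockIdx.x * N + blockIdx.y * 2.
// x is stored at idx and y at idx + 1.
int candidate_index(int bx, int by, int n) {
    return bx * n + by * 2;
}

// Counts slots that receive more than one store when every block of an
// n x n grid writes its pair using candidate_index.
// Hypothetical helper for illustration, not part of any CUDA API.
int count_contested_slots(int n) {
    std::vector<int> stores(static_cast<std::size_t>(n) * n * 2, 0);
    for (int by = 0; by < n; ++by) {
        for (int bx = 0; bx < n; ++bx) {
            const std::size_t idx =
                static_cast<std::size_t>(candidate_index(bx, by, n));
            assert(idx + 1 < stores.size());
            ++stores[idx];      // store of blockIdx.x
            ++stores[idx + 1];  // store of blockIdx.y
        }
    }
    int contested = 0;
    for (int count : stores) {
        if (count > 1) ++contested;
    }
    return contested;
}
```

For N = 95, block (1, 0) stores its x at slot 95, and block (0, 47) stores its y at slot 0 * 95 + 47 * 2 + 1 = 95 as well, so the pairs of different blocks overlap. The stores also never go beyond slot 9119, although 18050 ints were allocated.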

What is wrong with y? Shouldn’t it start at 0 and end at 94?
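A common way to give every block its own slot is to number the blocks in row-major order, blockIdx.y * gridDim.x + blockIdx.x, and multiply by 2 for the [x, y] pair. The host-only C++ sketch below checks that arithmetic by looping over the coordinates a 2D grid would have. `pair_offset` and `fill_block_coords` are hypothetical names, and the loop variables stand in for blockIdx and gridDim. It illustrates the index math under those assumptions and is not a solution confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of block (bx, by) in a grid that is grid_w blocks wide,
// with 2 ints reserved per block. In a kernel this would correspond to
// (blockIdx.y * gridDim.x + blockIdx.x) * 2.
std::size_t pair_offset(int bx, int by, int grid_w) {
    return (static_cast<std::size_t>(by) * grid_w + bx) * 2;
}

// Fills a buffer of grid_w * grid_h pairs the way the grid's blocks would,
// asserting that no slot is stored to twice.
std::vector<int> fill_block_coords(int grid_w, int grid_h) {
    const std::size_t total = static_cast<std::size_t>(grid_w) * grid_h * 2;
    std::vector<int> out(total, -1);  // -1 marks "not written yet"
    for (int by = 0; by < grid_h; ++by) {
        for (int bx = 0; bx < grid_w; ++bx) {
            const std::size_t off = pair_offset(bx, by, grid_w);
            assert(off + 1 < total);
            assert(out[off] == -1 && out[off + 1] == -1);
            out[off] = bx;
            out[off + 1] = by;
        }
    }
    return out;
}
```

With grid_w = grid_h = 95 the buffer holds 18050 ints, which matches the sizeof(int[N*N][2]) allocation mentioned above, and the stored y values run from 0 through 94.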