CUDA for loop parallelization

I want to parallelize two nested for loops using CUDA. My simple C++ code is:

index=0;
	for (int m = 0; m < N; m++)
	{
		x[m] = (2 * m + 1) / (2 * N);
		for (int n = 0; n < N; n++)
		{
			Z1[n] = (n + 1) + x[m];
			Z[index] = Z1[n];
			index++;
		}
	}

The CUDA version I wrote is:

    int tidx = blockIdx.x*blockDim.x + threadIdx.x;
    int tidy = blockIdx.y*blockDim.y + threadIdx.y;
    if  (tidx < N && tidy < N)
    {
        x[tidx] = (2 * tidx + 1) / (2 * N);
        Z1[tidy] = (tidy + 1) * 100 + x[tidx];
    }
    Z[tidx] = Z1[tidy];
    }

When I run the code, the first row of Z is correct and the others are all zero. Z has dimensions N x N. I don't understand where the mistake is. Can you help me? Thanks.

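Hi skymoon,
In your code, the variable `tidx` is built only from the x dimension of the block and grid, so it never moves along the y axis. Look at the first lines of your CUDA code.
The reverse holds for `tidy`: it only moves along the y axis. Because `Z` is indexed by `tidx` alone, the write never advances through the rows of the N x N output.
Also, there is no guarantee that `Z[tidx]` reads the value of `Z1[tidy]` that the same thread just computed, because other threads with the same `tidy` write to `Z1[tidy]` concurrently.
Hope this helps.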
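
One possible way to fix the points above is to give each thread exactly one output element and build a flat index from both coordinates, as the C++ loop does with `index`. This is only a minimal sketch under some assumptions: the arrays are `float`, the division is done in floating point (with `int` operands `(2 * m + 1) / (2 * N)` is always 0), the formula follows the C++ loop (so without the `* 100` from the kernel), and the intermediate arrays `x` and `Z1` are replaced by a per-thread local value. The names `fillZ` and `computeZ` are made up for illustration, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread computes one element Z[m * N + n]:
// m plays the role of the outer loop index, n of the inner loop index.
__global__ void fillZ(float *Z, int N)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x; // inner-loop index (column)
    int m = blockIdx.y * blockDim.y + threadIdx.y; // outer-loop index (row)

    if (m < N && n < N)
    {
        // Assumption: floating-point division is intended here.
        float xm = (2.0f * m + 1.0f) / (2.0f * N);

        // The flat index uses BOTH coordinates, so every row is written.
        // No shared Z1 buffer is needed, so threads do not race on it.
        Z[m * N + n] = (n + 1) + xm;
    }
}

// Hypothetical host helper: launches the kernel and copies Z back.
std::vector<float> computeZ(int N)
{
    std::vector<float> hZ(static_cast<size_t>(N) * N);
    float *dZ = nullptr;
    cudaMalloc(&dZ, hZ.size() * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x,  // enough blocks to cover N columns
              (N + block.y - 1) / block.y); // enough blocks to cover N rows
    fillZ<<<grid, block>>>(dZ, N);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(hZ.data(), dZ, hZ.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dZ);
    return hZ;
}
```

If `x` is also needed on the host, one option would be to pass it as a second pointer and let only the threads with `n == 0` store `x[m] = xm`, so each entry is written by a single thread.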