Hi,

Note: for the sake of explanation, I'll use matrix notation in the kernel code as well.

Let's say I have an input square matrix A (m*m), and I want to create two output square matrices from it, X (m*m) and Y (m*m).

What I want to do is this: where the first if condition is satisfied, the corresponding entries of A should be copied to X; then, where the second if condition is satisfied, the corresponding entries of the "UPDATED" X should be copied to Y. But the updated values don't seem to be copied…

```
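__global__ void kernel(/* input matrix A, output matrices X and Y */) {
    int tx = threadIdx.x;
    int ty = threadIdx.y;

    // Step 1: copy entries of A into X where tx <= ty
    if (tx <= ty)
        X[tx][ty] = A[tx][ty];

    // Wait until all threads in the block have finished step 1
    __syncthreads();

    // Step 2: copy entries of X into Y where tx > ty
    if (tx > ty)
        Y[tx][ty] = X[tx][ty];
}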
```

The first if statement should create an upper triangular matrix from the entries of A, and the second if statement should create a lower triangular matrix from the entries of the UPDATED X.
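To make the setup concrete, here is a minimal, self-contained sketch of the same logic that can be compiled and run. Several things in it are assumptions for illustration only and not part of my real code: the matrix size (`M = 4`), the `float` element type, the flat row-major indexing (`[r][c]` becomes `r * m + c`), the single-block launch with one thread per entry, the zero-initialised X and Y, and the names `copyTriangles` and `printMatrix`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define M 4  // assumed matrix size; small enough for a single thread block

// Same logic as the kernel above, written with flat row-major indexing.
__global__ void copyTriangles(const float *A, float *X, float *Y, int m) {
    int tx = threadIdx.x;  // used as the row index
    int ty = threadIdx.y;  // used as the column index

    // Step 1: copy entries of A into X where tx <= ty
    if (tx <= ty)
        X[tx * m + ty] = A[tx * m + ty];

    // Barrier for all threads of this block
    __syncthreads();

    // Step 2: copy entries of X into Y where tx > ty
    if (tx > ty)
        Y[tx * m + ty] = X[tx * m + ty];
}

// Hypothetical helper: print an m*m host matrix row by row.
static void printMatrix(const char *name, const float *h, int m) {
    printf("%s:\n", name);
    for (int r = 0; r < m; ++r) {
        for (int c = 0; c < m; ++c)
            printf("%6.1f", h[r * m + c]);
        printf("\n");
    }
}

int main() {
    const size_t bytes = M * M * sizeof(float);
    float hA[M * M], hX[M * M], hY[M * M];
    for (int i = 0; i < M * M; ++i)
        hA[i] = (float)(i + 1);  // A = 1, 2, 3, ... in row-major order

    float *dA, *dX, *dY;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dX, bytes);
    cudaMalloc(&dY, bytes);

    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemset(dX, 0, bytes);  // assumption: X starts as all zeros
    cudaMemset(dY, 0, bytes);  // assumption: Y starts as all zeros

    // One block of M x M threads, one thread per matrix entry
    copyTriangles<<<1, dim3(M, M)>>>(dA, dX, dY, M);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hX, dX, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(hY, dY, bytes, cudaMemcpyDeviceToHost);

    printMatrix("A", hA, M);
    printMatrix("X", hX, M);
    printMatrix("Y", hY, M);

    cudaFree(dA);
    cudaFree(dX);
    cudaFree(dY);
    return 0;
}
```

Saved as, say, `repro.cu`, this should build with `nvcc repro.cu -o repro`. Printing A, X, and Y side by side makes it easy to see which entries each of the two steps actually touched.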

What do you think is the problem with this logic? I am more interested in the concept than the EXACT code.

Thank you,

Arun.