Hi,

I am stuck on what looks like a very simple problem.

Here are the details:

- I have a 2225 x 9635 2D array as input
- From it I want to generate a 2225 x 2225 array, where element (i, j) is the dot product of rows i and j of the input array

Here is the kernel for this:

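```
#define DOCS  2225   // number of rows (documents) in the input
#define TERMS 9635   // number of columns (terms) in the input

__global__ void compute_similarity(float *term_doc_M, float *weight_mat, float cut_off)
{
    int bx = blockIdx.x;
    int tx = threadIdx.x;
    int by = blockIdx.y;
    int ty = threadIdx.y;

    // One thread per output cell (i, j) of the DOCS x DOCS result.
    int i = by * blockDim.y + ty;
    int j = bx * blockDim.x + tx;

    // Every in-range thread first zeroes its own cell.
    if (i < DOCS && j < DOCS)
        weight_mat[i * DOCS + j] = 0;

    float tmp = 0, tmp1 = 0, tmp2 = 0;
    if (i < DOCS && j < DOCS)
    {
        // The diagonal is set to 1 directly.
        if (i == j)
        {
            weight_mat[i * DOCS + j] = 1;
            return;
        }

        // Only threads with j < i compute a dot product.
        if (j > i)
            return;

        // Dot product of row j and row i of the input.
        for (int k = 0; k < TERMS; k++)
        {
            tmp += term_doc_M[j * TERMS + k] * term_doc_M[i * TERMS + k];
        }

        // Store the result in (i, j) and mirror it into (j, i).
        weight_mat[i * DOCS + j] = tmp;
        weight_mat[j * DOCS + i] = tmp;
    }
}
```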
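The kernel needs a 2D launch that covers all DOCS x DOCS output cells. A minimal sketch of the grid-size arithmetic, assuming a 16 x 16 thread block (the block size and the helper name `blocks_needed` are assumptions, not taken from the code above):

```cpp
// Hypothetical helper: number of blocks of `block` threads needed to cover
// `n` elements along one dimension (integer ceiling division).
constexpr int blocks_needed(int n, int block)
{
    return (n + block - 1) / block;
}

// Assumed values: DOCS matches the #define in the kernel source,
// BLOCK is an illustrative block edge length.
constexpr int DOCS_COUNT = 2225;
constexpr int BLOCK      = 16;

// Blocks per grid dimension for a DOCS x DOCS output.
constexpr int GRID = blocks_needed(DOCS_COUNT, BLOCK);
```

With 2225 rows and a 16 x 16 block this gives 140 blocks per dimension, so the last block in each dimension is mostly out of range and relies on the `i < DOCS && j < DOCS` check.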

The problem is that the output comes back as all 0s, and I don't know why.

Here are things I have verified:

- The input array contains correct data and it is not all 0.
- When I access the i-th row, I get correct data. But when I access the j-th row, it returns all 0s, and I can't see any pattern to it. If I read a single element of row j instead of the entire row, it returns the correct value again. I am totally clueless about what is happening here.
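One way to narrow this down would be to compare the GPU output against a plain CPU version on a small input. A minimal sketch, assuming the same row-major layout as the kernel; the name `reference_similarity` and its parameters are hypothetical, not part of the code above:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: for a row-major docs x terms matrix `m`,
// out[i * docs + j] is the dot product of rows i and j.
// The diagonal is set to 1, as in the kernel.
std::vector<float> reference_similarity(const std::vector<float>& m,
                                        std::size_t docs, std::size_t terms)
{
    std::vector<float> out(docs * docs, 0.0f);
    for (std::size_t i = 0; i < docs; ++i) {
        out[i * docs + i] = 1.0f;
        for (std::size_t j = 0; j < i; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < terms; ++k)
                acc += m[i * terms + k] * m[j * terms + k];
            out[i * docs + j] = acc;  // lower triangle
            out[j * docs + i] = acc;  // mirrored into the upper triangle
        }
    }
    return out;
}
```

Running the kernel with small sizes and comparing cell by cell against this output would show whether only some cells are wrong or the whole result is zero.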

Any help is really appreciated!

Thank you!