From this one I want to generate a 2225 x 2225 array, where each entry is just the dot product of two rows of the input array.
Here is the kernel for the same:
#define DOCS 2225
#define TERM 9635
__global__ void compute_similarity(float * term_doc_M, float * weight_mat, float cut_off)
{
int bx = blockIdx.x;
int tx = threadIdx.x;
int by = blockIdx.y;
int ty = threadIdx.y;
Now, the problem here is: I am getting the answer as all 0s. Don't know why…
Here are the things I have verified:
The input array contains correct data and it's not 0.
When I try to access the ith row, it gives correct data. But when I try to access the jth row, it returns all 0s :( It does not seem to follow any logic. If I try to access a single element instead of the entire row for j, it returns the correct value again… I am totally clueless about what is happening here… :(
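One possible cause of an all-zero result is that the kernel never ran, or stopped partway, so the output buffer was never written. A minimal sketch of how that could be checked right after the launch is below. The helper names cudaOk and kernelRanOk are hypothetical and not from the post; cudaGetLastError, cudaDeviceSynchronize and cudaGetErrorString are the standard CUDA runtime calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): print a message if a
// CUDA runtime call reported an error, and say whether it succeeded.
static bool cudaOk(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical helper: call this immediately after a kernel launch.
static bool kernelRanOk()
{
    // Launch-time problems (for example an invalid grid or block size)
    // are reported by cudaGetLastError().
    if (!cudaOk(cudaGetLastError(), "kernel launch")) return false;

    // Problems during execution (for example an out-of-bounds access)
    // only show up once the host waits for the kernel to finish.
    return cudaOk(cudaDeviceSynchronize(), "kernel execution");
}
```

Usage would look something like: launch compute_similarity<<<grid, block>>>(...) as before, then call kernelRanOk() before copying weight_mat back. If it prints an error, the zeros are most likely just the untouched output buffer.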
It is just similar to that… Is it that I cannot do a matrix multiplication for matrices of size, let's say, 2000 x 9000 and 9000 x 2000? Because my kernel does not launch at all…
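The matrix sizes by themselves are modest: 2225 x 9635 floats is roughly 86 MB of input and 2225 x 2225 floats is roughly 20 MB of output. A kernel that does not launch at all is more often a launch-configuration problem, such as too many threads per block or a grid dimension over the device limit. The sketch below shows one way to compute a 2D grid and compare it against the limits the device reports. It assumes 16 x 16 thread blocks; ceilDiv and launchFits are hypothetical helper names, while cudaGetDeviceProperties and the cudaDeviceProp fields are part of the CUDA runtime.

```cuda
#include <cuda_runtime.h>

#define DOCS 2225

// Hypothetical helper: number of blocks needed to cover n items
// when each block handles blockSize of them.
static unsigned int ceilDiv(unsigned int n, unsigned int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical helper: check a 2D launch configuration against the limits
// reported by the given device.
static bool launchFits(dim3 block, dim3 grid, int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;

    return block.x * block.y * block.z <= (unsigned int)prop.maxThreadsPerBlock
        && grid.x <= (unsigned int)prop.maxGridSize[0]
        && grid.y <= (unsigned int)prop.maxGridSize[1];
}
```

For example, with dim3 block(16, 16) and dim3 grid(ceilDiv(DOCS, 16), ceilDiv(DOCS, 16)) the grid is 140 x 140 blocks of 256 threads each, and launchFits(block, grid, 0) tells whether device 0 accepts that shape.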
Check it, because you carry out the computation only for j <= i while the initialization is done even for j > i, which means that some threads are going to initialize what has already been computed!
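The full kernel body is not shown in the thread, so the following is only a sketch of one way to avoid the overlap described above, not the original code. Each thread handles one (i, j) pair, threads with j > i return before touching memory, and the thread for j <= i writes both the entry and its mirror. The kernel name similarity_lower_triangle, the row-major layout, and passing the sizes as parameters instead of the DOCS and TERM defines are all assumptions; cut_off is left out because the post does not show how it is used.

```cuda
#include <cuda_runtime.h>

// Hypothetical variant of the kernel, one thread per (i, j) pair.
// Assumed layout (row-major): term_doc_M[doc * terms + term],
// weight_mat[i * docs + j].
__global__ void similarity_lower_triangle(const float *term_doc_M,
                                          float *weight_mat,
                                          int docs, int terms)
{
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;

    // Threads outside the matrix or above the diagonal do nothing at all.
    // In particular they do not initialize anything, so they cannot
    // overwrite a result written by another thread.
    if (i >= docs || j > i) return;

    // Dot product of row i and row j of the input.
    float sum = 0.0f;
    for (int t = 0; t < terms; ++t)
        sum += term_doc_M[i * terms + t] * term_doc_M[j * terms + t];

    // The result is symmetric, so this thread fills both positions.
    weight_mat[i * docs + j] = sum;
    weight_mat[j * docs + i] = sum;
}
```

With 16 x 16 blocks the launch would be something like similarity_lower_triangle<<<grid, block>>>(d_term_doc, d_weight, DOCS, TERM), where d_term_doc and d_weight are hypothetical device pointers. Every output element is written exactly once (twice with the same value on the diagonal), so no separate zero-initialization pass is needed.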