Hi everyone,
I have two questions:
(1) My kernel executes a lot of if-else statements and takes a long time to run. Can anyone suggest how to optimize it? Below is part of my kernel:
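__global__ void my_kernel(int *d2, float *d1)
{
    unsigned int xIndex = blockDim.x * blockIdx.x + threadIdx.x;
    int i, j;
    int index1, index2, sig;
    float min1, min2;   // initial values are not shown in this excerpt
    float LLR;

    while (xIndex < 1024)
    {
        for (j = 0; j < d1[xIndex]; j++)
        {
            // N is defined elsewhere (not shown in this excerpt)
            LLR = fabs(d1[d2[N + 1 + xIndex] + j]);

            // track the two smallest magnitudes, with min1 <= min2
            if (LLR < min2)
            {
                if (LLR < min1)
                {
                    min2 = min1;
                    min1 = LLR;
                }
                else
                {
                    min2 = LLR;
                }
            }
        }

        // clamp negative values to zero
        if (min1 < 0.0) min1 = 0.0;
        if (min2 < 0.0) min2 = 0.0;

        __syncthreads();
        xIndex += blockDim.x;   // move on to this thread's next element
    }
}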
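For reference on question (1): the nested if/else in the inner loop only keeps the two smallest values seen so far, and that update can also be written without branches. The following is a minimal sketch, not a tested replacement. The helper name update_two_min is hypothetical (it does not appear in the kernel above), and it assumes min1 <= min2 holds on entry and that no input is NaN.

```cuda
#include <math.h>

// Lets this helper also compile as plain C++ for host-side checks.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper (not part of the original kernel): keeps the two
// smallest values seen so far in min1 <= min2, using fminf/fmaxf instead
// of nested if/else.
__host__ __device__ inline void update_two_min(float v, float &min1, float &min2)
{
    // New second-smallest: the smaller of the old min2 and max(min1, v).
    // This must be computed before min1 is overwritten.
    min2 = fminf(min2, fmaxf(min1, v));
    // New smallest: the smaller of the old min1 and v.
    min1 = fminf(min1, v);
}
```

Inside the for loop this would be called as update_two_min(LLR, min1, min2); in place of the if/else block. fminf and fmaxf typically map to min/max instructions rather than branches, but the compiler may already predicate branches this short, so the gain should be measured rather than assumed.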
(2) Are the two statements
    __syncthreads();
    xIndex += blockDim.x;
in the kernel above really necessary? When I omit them, the kernel never finishes and seems to deadlock.
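To make question (2) concrete, here is a stripped-down sketch of the same loop structure. It is an illustration under stated assumptions, not the original kernel: stride_demo, run_stride_demo, and the visits array are hypothetical names. Two points it illustrates: the increment is what eventually makes the while condition false, so removing it leaves every thread with xIndex < n looping forever; and nothing in the excerpt shares data between threads, so as far as the excerpt shows the barrier is not what keeps the loop correct. The sketch also uses blockDim.x * gridDim.x as the stride, because a stride of blockDim.x alone would make several blocks process the same elements when more than one block is launched.

```cuda
#include <cuda_runtime.h>

// Hypothetical demo kernel (not the original my_kernel): each thread handles
// elements xIndex, xIndex + stride, xIndex + 2*stride, ... while xIndex < n.
__global__ void stride_demo(int *visits, unsigned int n)
{
    unsigned int xIndex = blockDim.x * blockIdx.x + threadIdx.x;
    unsigned int stride = blockDim.x * gridDim.x;  // total threads in the grid

    while (xIndex < n)
    {
        visits[xIndex] += 1;  // stand-in for the per-element work

        // No __syncthreads() here: threads exchange no data in this sketch,
        // and the number of loop iterations can differ between threads.
        xIndex += stride;     // without this, xIndex < n stays true forever
    }
}

// Host helper: runs the demo and returns true if every one of the n
// elements was visited exactly once.
bool run_stride_demo(unsigned int n, int blocks, int threads)
{
    int *d_visits = nullptr;
    if (cudaMalloc(&d_visits, n * sizeof(int)) != cudaSuccess) return false;
    cudaMemset(d_visits, 0, n * sizeof(int));

    stride_demo<<<blocks, threads>>>(d_visits, n);

    int *h_visits = new int[n];
    cudaError_t err = cudaMemcpy(h_visits, d_visits, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    bool ok = (err == cudaSuccess);
    for (unsigned int i = 0; ok && i < n; ++i)
        if (h_visits[i] != 1) ok = false;

    delete[] h_visits;
    cudaFree(d_visits);
    return ok;
}
```

A call such as run_stride_demo(1024, 2, 128) covers the same 1024 elements as the loop in the post, split across two blocks.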
Thanks a lot