Code works in emulation mode but not on the card...

SAMPUN · July 29, 2010, 7:14am

Hi,

Length = 120,
iter = 8

abc[0][0] = 0;
for(int m = 1; m < iter; m++)
    abc[0][m] = 0;
for(int k = 1; k <= Length; k++)
{
    x[k] = 0;
    for(int m = 0; m < iter; m++)
    {
        abc[k][m] = (abc[k-1][m] + y[k-1][m])>( abc[k-1][m] + y[k-1][m]) ? (abc[k-1][m] + y[k-1][m]) : ( abc[k-1][m] + y[k-1][m]);
        x[k] = x[k]> abc[k][m] ? x[k] : abc[k][m];
    }
    for(int m = 0; m < iter; m++)
    {
        abc[k][m] = abc[k][m]-x[k];
    }
}

I converted the above loop iterations into CUDA…

#define MAX(X, Y) (((X) > (Y)) ? (X) : (Y))

__global__
void kernel_alp(float* alp,float* total,float* gg,int codeLength)
{

float x1,x2;
float temp[8],temp1[8];
int bx = blockIdx.x;
int tidInit = blockIdx.x*blockDim.x+threadIdx.x;

if(bx == 0){
	alp[tidInit] = (float) -INFINITY;
	alp[0] = 0;
	__syncthreads();
}
else{
	int tidZero = ((bx-1)*blockDim.x*2)+(to[threadIdx.x][0]*2);
	int tidOne = ((bx-1)*blockDim.x*2)+(to[threadIdx.x][1]*2+1);

	if(threadIdx.x<8)
	{
		temp[threadIdx.x]=alp[(bx-1)*blockDim.x+to[threadIdx.x][0]];
		temp1[threadIdx.x]=alp[(bx-1)*blockDim.x+to[threadIdx.x][1]];
	}
	x1 = temp[threadIdx.x]+gg[tidZero];
	x2 = temp1[threadIdx.x]+gg[tidOne];
	alp[tidInit] = MAX(x1,x2);
	if(threadIdx.x == 0)
	{
		total[bx] = (float) -INFINITY;
	}
	total[bx] = MAX(total[bx],alp[tidInit]);
	__syncthreads();
}
total[bx] = MAX(total[bx],alp[tidInit]);
alp[tidInit] =alp[tidInit]-total[bx];	
__syncthreads();

}
BlkLength = 120;
BLOCK = 8;
kernel_alp<<< (BlkLength), BLOCK>>>(alp,total,gg,BlkLength-1);

This code works fine in emulation mode…
But it does not work on the NVIDIA card… Please help me find the problem…

Thanks…!

Lev · July 29, 2010, 8:49am

You have __syncthreads() inside an if block. It will not work unless the if always takes the same path for all threads.
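
A minimal sketch of what this advice can look like in practice: a kernel that finds the maximum of each block's values and subtracts it, with every __syncthreads() placed outside the conditional code so that all threads of a block reach each barrier. This is not the poster's kernel. The names `normalizeByBlockMax`, `scratch`, `THREADS`, and `runNormalizeDemo` are hypothetical, the block size of 8 is taken from `BLOCK = 8` in the post, and the input data is made up for the check.

```cuda
#include <cassert>
#include <cmath>
#include <cuda_runtime.h>

#define THREADS 8  // assumed block size, matching BLOCK = 8 in the post

// Each block finds the maximum of its THREADS values and subtracts it
// from every value. No __syncthreads() sits inside an if/else, so all
// threads of a block reach every barrier.
__global__ void normalizeByBlockMax(float* data)
{
    __shared__ float scratch[THREADS];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;

    scratch[tid] = data[idx];
    __syncthreads();

    // Tree reduction: only the work is conditional, never the barrier.
    // The loop bounds are the same for every thread in the block.
    for (int stride = THREADS / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            scratch[tid] = fmaxf(scratch[tid], scratch[tid + stride]);
        }
        __syncthreads();
    }

    // scratch[0] now holds the block maximum and is visible to all threads.
    data[idx] -= scratch[0];
}

// Hypothetical host-side check: returns true if, after the kernel runs,
// the largest value in every block is exactly 0.
bool runNormalizeDemo()
{
    const int blocks = 4;
    const int n = blocks * THREADS;
    float host[n];
    for (int i = 0; i < n; ++i) {
        host[i] = (float)((i * 7) % 13);  // made-up test data
    }

    float* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    normalizeByBlockMax<<<blocks, THREADS>>>(dev);
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (int b = 0; b < blocks; ++b) {
        float blockMax = -INFINITY;
        for (int t = 0; t < THREADS; ++t) {
            blockMax = fmaxf(blockMax, host[b * THREADS + t]);
        }
        if (blockMax != 0.0f) return false;
    }
    return true;
}
```

Calling `runNormalizeDemo()` from `main` and checking its return value is enough to try this on a real card. Using a shared-memory reduction also means no two threads write the same global element at once, which the sketch assumes is the intent behind the per-block maximum in the question.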
