kernel fails over many iterations

HorseBadger · November 25, 2011, 9:43am

I’ve been puzzled by a problem with a kernel. Here’s a simplified kernel that reproduces it:

#define vBS 16

__global__ static void test_kernel(double *X, double *H, int Ns, int w, int p) {
    int i, j;

    /* tx and ty are the thread coordinates within the sub blocks */
    int tx = threadIdx.x;
    int ty = threadIdx.y;

    /* get the x and y coordinate of X that this thread works on */
    int x = __mul24(blockIdx.x, blockDim.x) + tx;
    int y = __mul24(blockIdx.y, blockDim.y) + ty;

    double s = 0;

    __shared__ float3 L1[vBS][vBS];
    __shared__ float3 L2[vBS][vBS];

    float4 l1, l2;

    for(i = 0; i < 1024; i++) {
        L1[tx][ty] = make_float3(0.1, 0.2, 0.3);
        L2[tx][ty] = make_float3(0.1, 0.2, 0.3);
        __syncthreads();

        double t = 0;

        /* now perform the multiplication */
        for(j = 0; j < vBS; j++) {
            t += (double)L1[j][tx].x*(double)L2[ty][j].x;
            t += (double)L1[j][tx].y*(double)L2[ty][j].y;
            t += (double)L1[j][tx].z*(double)L2[ty][j].z;
        }

        s += t;
    }

    X[x + __mul24(y, p)] = s;
}

If I run the loop over a small number of iterations, e.g. 1024, it works. With 4096 it fails: the NVIDIA driver crashes and the screen goes blank, and I don’t get a useful error message. However, it will run with a larger number of iterations if I comment out ‘s += t;’. I can’t understand what could be wrong here; is there such a thing as double overflow?!

Hope someone can help me!

tera · November 25, 2011, 3:38pm

You are probably triggering the watchdog timer that terminates kernels after 2 to 5 seconds to keep the GUI responsive. Either run CUDA on a dedicated GPU, or do less work per kernel invocation. The latter might also require a cudaStreamSynchronize(0) between kernels so that the watchdog is restarted after each kernel.
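A minimal sketch of the "less work per kernel invocation" option, assuming the outer loop can be split so that each launch performs only part of the iterations. The names chunk_kernel, watchdog_active, run_in_chunks and run_demo are hypothetical, and the kernel body is a simplified stand-in for the one in the question, not the poster's code. It also checks the status after every launch, which is usually how a timeout becomes visible as an error instead of a silent failure. A watchdog timeout would also be consistent with the kernel surviving once ‘s += t;’ is commented out, since the compiler is then free to drop the inner loop whose result is no longer used.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define vBS 16

// Hypothetical stand-in for the original kernel: one launch performs only
// `iters` outer iterations and adds its partial sum to X, so the total work
// can be spread over several short launches.
__global__ void chunk_kernel(double *X, int p, int iters)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    double s = 0.0;
    for (int i = 0; i < iters; i++) {
        for (int j = 0; j < vBS; j++) {
            s += 0.1 * 0.1 + 0.2 * 0.2 + 0.3 * 0.3;
        }
    }
    X[x + y * p] += s;  // accumulate across launches
}

// Reports whether a run-time limit (watchdog) applies to kernels on `device`.
bool watchdog_active(int device)
{
    int flag = 0;
    cudaError_t err =
        cudaDeviceGetAttribute(&flag, cudaDevAttrKernelExecTimeout, device);
    return err == cudaSuccess && flag != 0;
}

// Splits total_iters into launches of at most `chunk` iterations each and
// synchronizes after every launch, so no single kernel runs for long.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t run_in_chunks(double *d_X, int p, dim3 grid, dim3 block,
                          int total_iters, int chunk)
{
    assert(chunk > 0);
    for (int done = 0; done < total_iters; done += chunk) {
        int remaining = total_iters - done;
        int iters = remaining < chunk ? remaining : chunk;

        chunk_kernel<<<grid, block>>>(d_X, p, iters);

        cudaError_t err = cudaGetLastError();  // errors from the launch itself
        if (err != cudaSuccess) return err;

        err = cudaStreamSynchronize(0);        // errors raised during execution
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

// Allocates a small (2*vBS) x (2*vBS) matrix, runs the chunked computation,
// and copies element X[0] back into *out.
cudaError_t run_demo(int total_iters, int chunk, double *out)
{
    const int p = 2 * vBS;
    double *d_X = NULL;

    cudaError_t err = cudaMalloc((void **)&d_X, p * p * sizeof(double));
    if (err != cudaSuccess) return err;

    err = cudaMemset(d_X, 0, p * p * sizeof(double));
    if (err == cudaSuccess)
        err = run_in_chunks(d_X, p, dim3(2, 2), dim3(vBS, vBS),
                            total_iters, chunk);
    if (err == cudaSuccess)
        err = cudaMemcpy(out, d_X, sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(d_X);
    return err;
}
```

Calling run_demo(4096, 256, &value) from main and passing the returned status to cudaGetErrorString should report a timeout as readable text if one occurs. The chunk size of 256 is an arbitrary choice for illustration; a suitable value depends on how long one iteration takes on the GPU in question. If watchdog_active(0) returns false, the run-time limit does not apply to that device and chunking is not needed for this reason.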
