Why does this program crash?

Hi,
I made a minimal sample of a program that crashes for large N. Can anybody tell me why that happens?
I know the code doesn’t make any sense, but it’s just a sample. When I start cudaErrorMaker with N=100000 it works fine; with N=1000000 my computer crashes.
When I remove the for-loop in the global function and just run “cudaArray[i+j*lengthA] = 0.0f;” once, it works fine as well. I think that’s weird. Any ideas?

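// Forward declaration; the kernel is defined below.
__global__ void cudaError(float* cudaArray, long N, long lengthA);

void cudaErrorMaker(long N)
{
    dim3 DimGrid(8, 8);
    dim3 DimBlock(16, 16);

    // One float per thread: 8*8 blocks of 16*16 threads each.
    float* cudaArray;
    CUDA_SAFE_CALL(cudaMalloc((void**)&cudaArray,
        DimGrid.x * DimGrid.y * DimBlock.x * DimBlock.y * sizeof(float)));

    // lengthA is the width of the grid in threads (row stride).
    cudaError<<<DimGrid, DimBlock>>>(cudaArray, N, DimGrid.x * DimBlock.x);

    CUDA_SAFE_CALL(cudaFree(cudaArray));
}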

__global__ void cudaError(float* cudaArray, long N, long lengthA)
{
    long i = blockDim.x * blockIdx.x + threadIdx.x;
    long j = blockDim.y * blockIdx.y + threadIdx.y;
    long No;
    // Each thread writes the same element N times.
    for (No = 0; No < N; No++)
    {
        cudaArray[i + j * lengthA] = 0.0f;
    }
}

If a kernel takes too long (more than a few seconds) it will trigger the watchdog timer in Windows and fail.
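One common way to stay under that time limit is to split the work into several shorter kernel launches. The following is only a sketch, assuming the loop iterations can be divided freely between launches and that a reasonably recent CUDA toolkit is used (for cudaDeviceSynchronize). The names zeroChunk, launchCount and runInChunks, and the idea of a fixed chunk size, are made up for illustration and are not from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same body as the kernel above, but only `iterations` passes per launch.
__global__ void zeroChunk(float* data, long iterations, long lengthA)
{
    long i = blockDim.x * blockIdx.x + threadIdx.x;
    long j = blockDim.y * blockIdx.y + threadIdx.y;
    for (long n = 0; n < iterations; n++) {
        data[i + j * lengthA] = 0.0f;
    }
}

// Number of launches needed to cover N iterations in pieces of at most `chunk`.
long launchCount(long N, long chunk)
{
    return (N + chunk - 1) / chunk;
}

// Runs N iterations as several short launches; returns the first error seen.
cudaError_t runInChunks(long N, long chunk)
{
    dim3 grid(8, 8);
    dim3 block(16, 16);
    size_t count = (size_t)grid.x * grid.y * block.x * block.y;

    float* data = NULL;
    cudaError_t err = cudaMalloc((void**)&data, count * sizeof(float));
    if (err != cudaSuccess) return err;

    for (long done = 0; done < N && err == cudaSuccess; done += chunk) {
        long iterations = (N - done < chunk) ? (N - done) : chunk;
        zeroChunk<<<grid, block>>>(data, iterations, (long)(grid.x * block.x));

        // Catch launch failures first, then wait for the kernel to finish
        // so that a timeout is reported for this launch and not a later call.
        err = cudaGetLastError();
        if (err == cudaSuccess) err = cudaDeviceSynchronize();
    }

    cudaFree(data);
    return err;
}
```

With these hypothetical helpers, runInChunks(1000000, 10000) would issue 100 launches instead of one long one. A suitable chunk size depends on the GPU and would have to be found by trial.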

Thanks!!!
That means it is just a Windows problem, right? Can I switch off the watchdog?

The watchdog can’t be switched off, as it is built into the driver (on Linux as well). You can use a secondary GPU without a display attached; that one should have the watchdog switched off.
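To check whether a given GPU has the limit active, the CUDA runtime reports it in the kernelExecTimeoutEnabled field of cudaDeviceProp. A small sketch of how one might look for a device without the limit; the function name findDeviceWithoutTimeout is made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Returns the index of the first device with no run time limit on kernels,
// or -1 if every device has the limit or no device could be queried.
int findDeviceWithoutTimeout()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return -1;

    for (int dev = 0; dev < count; dev++) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;

        printf("device %d (%s): kernel timeout %s\n", dev, prop.name,
               prop.kernelExecTimeoutEnabled ? "enabled" : "disabled");

        if (!prop.kernelExecTimeoutEnabled) return dev;
    }
    return -1;
}
```

If the function returns an index of 0 or more, passing it to cudaSetDevice before allocating memory or launching kernels should make the long-running kernel run on that GPU.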