CUDA Driver crashing

Hello there,

I am running the CUDA SDK 3.2 with Visual Studio 2008 on Windows 7 with a single NVIDIA GPU installed. The GPU is driving the display as well, because I don’t have an onboard display port.

I have a kernel which I supply with preallocated data. The processing time varies depending on the size of the data. When the kernel executes for about 500 ms, everything goes OK. If it takes longer (because of the input data), the kernel runs OK, too, but any call to the CUDA environment, like

cudaEventRecord(stop, 0)

or a


fails with my screen going black and a message from Visual Studio that the application had a first-chance exception at some memory location. When the screen returns to normal, I get a message from the NVIDIA driver saying that it had crashed but recovered.

So my question is: what may be causing this strange behaviour? No matter how large the data is, I can step over the kernel invocation, and the next call fails with a black screen. This seems weird, as the kernel seems to execute just fine with small data and therefore a low execution time.

To me it sounds like a buffer overrun inside the kernel that causes the CUDA environment to crash afterwards. That doesn’t seem plausible, though, because my data is always allocated the same way, just with different sizes.

Best regards and thanks for your input, tdhd

I think you are doing something seriously wrong in your code. Can we take a look at it? Are you checking the error codes returned by all cuda* function calls?
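To illustrate what checking every cuda* call could look like, here is a minimal sketch. The CUDA_CHECK macro and the scale kernel are hypothetical placeholders, not code from this thread. It also assumes a toolkit that provides cudaDeviceSynchronize (CUDA 4.0 or later); on the 3.2 toolkit, cudaThreadSynchronize would play that role.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print the error string and abort if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Placeholder kernel, not the solver discussed in this thread.
__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main()
{
    const int n = 1024;
    float *d_data = NULL;

    CUDA_CHECK(cudaMalloc((void **)&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    scale<<<(n + 255) / 256, 256>>>(d_data, n);
    CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaFree(d_data));
    printf("done\n");
    return 0;
}
```

Kernel launches are asynchronous, so a failure during execution is typically reported by the next runtime call that synchronizes with the device rather than by the launch line itself. Checking right after the launch and again after a synchronization helps separate the two cases.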

You may also update your driver to the latest 275.xx and see if the problem persists…

Yes, I check all of my cuda* calls, and none of them fail beforehand.

The weird thing about this is that I have a fully working C++ version of it (running on the host only), and all I did was copy and paste the code.

My function looks like this:

__global__ void kernel(float *A, float *b)
{
    // A and b preallocated on hostside

    // fill matrices A and b

    // call system solver

    // work with results

    // return
}

which is called like this, for testing only of course:

kernel<<<1, 1>>>(A, b);
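For reference, a minimal sketch of how such a single-thread launch could be timed with events and checked for errors. The kernel body, the one-element buffers, and the ok helper are placeholders made up for this example, not the code from this thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel body; the real one would fill the system and solve it.
__global__ void kernel(float *A, float *b)
{
    b[0] = A[0] + 1.0f;
}

// Hypothetical helper: report a failed runtime call and return false.
static bool ok(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    float *A = NULL, *b = NULL;
    cudaEvent_t start, stop;
    float ms = 0.0f;

    if (!ok(cudaMalloc((void **)&A, sizeof(float)), "cudaMalloc(A)")) return 1;
    if (!ok(cudaMalloc((void **)&b, sizeof(float)), "cudaMalloc(b)")) return 1;
    if (!ok(cudaMemset(A, 0, sizeof(float)), "cudaMemset(A)")) return 1;
    if (!ok(cudaEventCreate(&start), "cudaEventCreate(start)")) return 1;
    if (!ok(cudaEventCreate(&stop), "cudaEventCreate(stop)")) return 1;

    if (!ok(cudaEventRecord(start, 0), "cudaEventRecord(start)")) return 1;
    kernel<<<1, 1>>>(A, b);
    if (!ok(cudaGetLastError(), "kernel launch")) return 1;
    if (!ok(cudaEventRecord(stop, 0), "cudaEventRecord(stop)")) return 1;

    // Blocks until the kernel has finished; execution failures surface here.
    if (!ok(cudaEventSynchronize(stop), "cudaEventSynchronize(stop)")) return 1;
    if (!ok(cudaEventElapsedTime(&ms, start, stop), "cudaEventElapsedTime")) return 1;
    printf("kernel took %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(A);
    cudaFree(b);
    return 0;
}
```

Printing the elapsed time for several input sizes would show how close a given run gets to the duration at which the failures start.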

The main function is filling a matrix A and b, which are passed as arguments to the kernel.

The kernel itself works with arbitrary matrix sizes. Only when invoking the system solver do I get the first-chance exceptions.

But again, the system solver is copied and pasted from the host version, which works.

May I PM you the code, too?

Best regards

The latest driver 275.33 fails, too.
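
Since the GPU here also drives the display, one thing that might be worth ruling out is whether the driver enforces a run time limit on kernels for that device. This is only a guess based on the symptoms described above, not a confirmed diagnosis. The sketch below assumes a toolkit that provides cudaDeviceGetAttribute (CUDA 5.0 or later); older toolkits expose the same information through cudaGetDeviceProperties.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        int limited = 0;
        // Non-zero if the driver applies a run time limit to kernels on this device.
        err = cudaDeviceGetAttribute(&limited, cudaDevAttrKernelExecTimeout, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("device %d: run time limit on kernels: %s\n", dev, limited ? "yes" : "no");
    }
    return 0;
}
```

If the limit is reported as active, a kernel that runs long enough could be terminated by the driver regardless of whether its code is correct, which would fit a failure that depends on execution time rather than on the data itself.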