CUDA call locking system

I have a CUDA application that, when run on my primary display adapter, makes the display look weird (it gets kind of snowy, and blocks of pixels look like they are being skewed) and locks up the graphics card so the display won't update. I waited a few minutes for my application to exit, but it never did, so I performed a hard reboot. I was under the impression that Windows would kill my CUDA code and release all allocated resources after only 5-10 seconds. Is there something I am missing here? One potential issue, I think, is that my kernel takes an array as an argument. The array exists only on the host side, and since arrays are passed by reference (as a pointer), the kernel may be reading and writing invalid addresses. My kernel currently looks similar to the following:

__global__ void Kernel(float* input[2]);

Do I need to change this to a float**? Something must behave differently on the device, because my kernel does not fail in emulation mode. Thanks,


If you pass a host address into a kernel function, then it will return the error message "invalid address".

I think your pointer array input[2] should be constructed in device memory. That is, "float* input" should point to device memory holding two float pointers, each of which is itself a device pointer.
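If you really want to keep the float** style, a sketch of that approach looks like the following (the kernel body, buffer size n, and launch configuration are illustrative assumptions, not from the posts above): allocate the two data buffers on the device, then allocate a small device-side array of pointers and copy the device pointer values into it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: input[0] and input[1] are both device pointers.
__global__ void Kernel(float** input)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    input[0][i] = input[1][i] * 2.0f;
}

int main()
{
    const int n = 256;                    // assumed buffer length
    const size_t size = n * sizeof(float);

    // Device pointers held temporarily in a host-side array.
    float* h_ptrs[2];
    cudaMalloc((void**)&h_ptrs[0], size);
    cudaMalloc((void**)&h_ptrs[1], size);

    // Device-side array of pointers; copy the pointer values into it.
    float** d_ptrs;
    cudaMalloc((void**)&d_ptrs, 2 * sizeof(float*));
    cudaMemcpy(d_ptrs, h_ptrs, 2 * sizeof(float*), cudaMemcpyHostToDevice);

    Kernel<<<n / 64, 64>>>(d_ptrs);
    cudaError_t err = cudaDeviceSynchronize();
    printf("%s\n", cudaGetErrorString(err));

    cudaFree(h_ptrs[0]);
    cudaFree(h_ptrs[1]);
    cudaFree(d_ptrs);
    return 0;
}
```

The extra allocation and copy for the pointer array is exactly the step that is easy to forget, which is why passing the two pointers as separate arguments (below) is simpler.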

The simplest way is to change your kernel, say

__global__ void Kernel(float* input1, float* input2);

then in your main(), you can do

float* input[2];

cutilSafeCall( cudaMalloc((void**)&input[0], size) );
cutilSafeCall( cudaMalloc((void**)&input[1], size) );

dim3 block(16,16,1);
dim3 grid(32,1,1);

Kernel<<<grid, block>>>( input[0], input[1] );
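For completeness, a minimal sketch of how that snippet might be rounded out with a kernel body, a synchronize, an error check, and cleanup (the kernel body, size, and the plain runtime-API error handling in place of cutilSafeCall are my assumptions):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel body: each thread copies one element.
__global__ void Kernel(float* input1, float* input2)
{
    // Flatten the 16x16 block / 32x1 grid into one linear index.
    int idx = blockIdx.x * (blockDim.x * blockDim.y)
            + threadIdx.y * blockDim.x + threadIdx.x;
    input1[idx] = input2[idx];
}

int main()
{
    const size_t size = 32 * 16 * 16 * sizeof(float); // one float per thread

    float* input[2];
    cudaMalloc((void**)&input[0], size);
    cudaMalloc((void**)&input[1], size);

    dim3 block(16,16,1);
    dim3 grid(32,1,1);

    Kernel<<<grid, block>>>( input[0], input[1] );

    // Wait for the kernel and surface any launch/runtime error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        printf("kernel failed: %s\n", cudaGetErrorString(err));

    cudaFree(input[0]);
    cudaFree(input[1]);
    return 0;
}
```

Checking the error after synchronizing is worth the extra line: a bad pointer passed to a kernel often shows up only here, not at launch time.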