NVIDIA Developer Forums

Limitation in Matrix Size Passed into Kernel still at n00b status

KChou November 1, 2010, 8:47pm 1

Hi all.

I have a program that dilates an AxA matrix by a BxB matrix, and I’m trying to find out the limitations of the program I wrote.
When I dilate a 4096 x 4096 matrix by a 6x6 matrix, the program runs fine.
But if I dilate a 4096 x 4096 matrix by an 8x8 matrix, then as soon as my program reaches the kernel code, my monitor turns black and comes back on a second later. The program fails, and Windows pops up a message saying ‘Display Driver Stopped Responding and Has Recovered’.

Does this mean my graphics card doesn’t have enough memory to handle this operation? I don’t think that’s the case, though…

My program requires AxA threads running in parallel, and each thread executes a nested for loop with B iterations at each level (BxB iterations in total).

I haven’t tried optimizing my kernel yet. Would this problem go away after I optimize my code (shared memory, etc.)?

Thanks in advance!

tera November 1, 2010, 8:59pm 3

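It means that the watchdog timer has triggered. It is a last-resort measure that stops runaway GPU code from locking up your computer (screen): if a single kernel runs for too long on a GPU that also drives the display, the driver aborts it and resets the device.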
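To see whether a given GPU is subject to this limit, you can query the device attribute for the kernel execution timeout. The following is a minimal sketch, assuming a CUDA toolkit recent enough to provide `cudaDeviceGetAttribute`; it only reports whether a run-time limit is enabled and does not tell you how long the limit is.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        int timeoutEnabled = 0;
        // cudaDevAttrKernelExecTimeout is nonzero when kernels on this
        // device are subject to a run-time limit (the watchdog).
        err = cudaDeviceGetAttribute(&timeoutEnabled,
                                     cudaDevAttrKernelExecTimeout, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "device %d: %s\n", dev, cudaGetErrorString(err));
            return 1;
        }
        printf("Device %d: kernel run-time limit %s\n",
               dev, timeoutEnabled ? "enabled" : "disabled");
    }
    return 0;
}
```

A GPU that is not driving a display typically reports the limit as disabled, which is one reason long-running kernels are often moved to a second card.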

If you optimize your code, each kernel will finish sooner, so you will be able to work on a larger matrix before the watchdog kicks in. Alternatively, you can split the workload across multiple consecutive kernel invocations, which gives the screen a chance to update and resets the watchdog timer between launches.
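Splitting the work could look roughly like the sketch below. It is not the original poster's code: the kernel `dilateRows`, the row-chunk size `rowsPerLaunch`, the `CHECK` macro, and the all-ones structuring element are assumptions made for illustration. The kernel takes the maximum over a BxB window centred on each pixel, which matches dilation for a symmetric structuring element, and the host loop launches it once per band of rows so that no single launch runs for long.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e_));                          \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: one thread per output pixel, restricted to the
// rows [rowStart, rowEnd). Each thread scans a B x B window.
__global__ void dilateRows(const unsigned char* in, unsigned char* out,
                           const unsigned char* se, int A, int B,
                           int rowStart, int rowEnd)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = rowStart + blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= A || y >= rowEnd) return;

    unsigned char best = 0;
    for (int j = 0; j < B; ++j) {
        for (int i = 0; i < B; ++i) {
            int sx = x + i - B / 2;
            int sy = y + j - B / 2;
            // Skip window positions outside the matrix or masked out by se.
            if (se[j * B + i] && sx >= 0 && sx < A && sy >= 0 && sy < A) {
                unsigned char v = in[(size_t)sy * A + sx];
                if (v > best) best = v;
            }
        }
    }
    out[(size_t)y * A + x] = best;
}

int main()
{
    const int A = 4096, B = 8;
    const int rowsPerLaunch = 256;  // assumed chunk size; tune for your GPU

    std::vector<unsigned char> hIn((size_t)A * A, 0), hOut((size_t)A * A, 0);
    std::vector<unsigned char> hSe(B * B, 1);  // assumed all-ones element
    hIn[(size_t)(A / 2) * A + A / 2] = 255;    // one bright test pixel

    unsigned char *dIn = 0, *dOut = 0, *dSe = 0;
    CHECK(cudaMalloc(&dIn, hIn.size()));
    CHECK(cudaMalloc(&dOut, hOut.size()));
    CHECK(cudaMalloc(&dSe, hSe.size()));
    CHECK(cudaMemcpy(dIn, hIn.data(), hIn.size(), cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dSe, hSe.data(), hSe.size(), cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    for (int rowStart = 0; rowStart < A; rowStart += rowsPerLaunch) {
        int rowEnd = rowStart + rowsPerLaunch < A ? rowStart + rowsPerLaunch : A;
        dim3 grid((A + block.x - 1) / block.x,
                  (rowEnd - rowStart + block.y - 1) / block.y);
        dilateRows<<<grid, block>>>(dIn, dOut, dSe, A, B, rowStart, rowEnd);
        CHECK(cudaGetLastError());
        // Wait for this band to finish before submitting the next one,
        // so each launch is a separate, short piece of GPU work.
        CHECK(cudaDeviceSynchronize());
    }

    CHECK(cudaMemcpy(hOut.data(), dOut, hOut.size(), cudaMemcpyDeviceToHost));
    size_t lit = 0;
    for (size_t k = 0; k < hOut.size(); ++k) lit += (hOut[k] == 255);
    printf("pixels set after dilation: %zu\n", lit);

    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOut));
    CHECK(cudaFree(dSe));
    return 0;
}
```

Because the input and output buffers are separate, bands can be processed in any order without affecting the result. The chunk size is a trade-off: smaller bands keep each launch further from the time limit but add launch overhead.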

KChou November 2, 2010, 12:16am 5

Got it. I’ll try to optimize my code. Thanks!
