Kernel hangs for more than 10 min, then I reboot. Possible reasons?

nitin.life · July 1, 2009, 11:43pm

What are the most probable causes for a kernel that starts and then just hangs? I have been debugging it for more than 5 hours… it just hangs…

I am not using any communication via shared memory… all threads work independently, so I don't know why this is happening.

I am using ~ 350 MB of device memory.

Any probable reason? I would greatly appreciate your input after 5+ hours of debugging :(

I can paste the kernel code here or attach it if someone wants to see it…

Thanks,

NA

Sarnath · July 2, 2009, 5:00am

Code review is the panacea for all software bugs. Take a printout and review it slowly over a cup of coffee.

It is so tempting to blame the hardware… I do it all the time, only to find that it's a software bug… :-)

Mu-Chi_Sung · July 2, 2009, 5:45am

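Did you use __syncthreads() in your kernel? If so, make sure that every thread in a block reaches exactly the same __syncthreads() calls, the same number of times; a barrier that only some threads of a block reach can hang the kernel…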

Another cause could be some infinite loop inside your kernel, so remember to check your for loop termination conditions.
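To illustrate the barrier point above, here is a minimal sketch, not code from this thread. The kernel names, the `BLOCK` size and the neighbour-shift operation are all assumptions made up for the example. The first kernel shows the risky pattern (an early `return` before `__syncthreads()`); the second keeps every thread alive until the barrier and only guards the memory accesses.

```cuda
#include <cuda_runtime.h>

#define BLOCK 256  // hypothetical block size for this sketch

// Risky pattern (shown for contrast, never launched here): threads with
// tid >= n return early, so not every thread of the block reaches the barrier.
__global__ void shift_risky(const float *in, float *out, int n) {
    __shared__ float tile[BLOCK];
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;            // some threads of the block leave here
    tile[threadIdx.x] = in[tid];
    __syncthreads();                 // undefined behaviour, may hang
    out[tid] = tile[(threadIdx.x + BLOCK - 1) % BLOCK];
}

// Safer pattern: every thread reaches the barrier; only the loads and
// stores are guarded by the bounds check.
__global__ void shift_safe(const float *in, float *out, int n) {
    __shared__ float tile[BLOCK];
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    bool active = tid < n;
    tile[threadIdx.x] = active ? in[tid] : 0.0f;
    __syncthreads();                 // reached by all threads of the block
    if (active) {
        out[tid] = tile[(threadIdx.x + BLOCK - 1) % BLOCK];
    }
}

// Host helper: launches shift_safe and checks the result on the CPU.
bool run_shift_demo() {
    const int n = 1000;              // deliberately not a multiple of BLOCK
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    int blocks = (n + BLOCK - 1) / BLOCK;
    shift_safe<<<blocks, BLOCK>>>(d_in, d_out, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    ok = ok && (cudaMemcpy(h_out, d_out, n * sizeof(float),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(d_in);
    cudaFree(d_out);

    // Each element should hold its left neighbour within the block
    // (wrapping around); out-of-range neighbours were stored as 0.
    for (int i = 0; i < n && ok; ++i) {
        int src = (i / BLOCK) * BLOCK + (i % BLOCK + BLOCK - 1) % BLOCK;
        float expected = (src < n) ? h_in[src] : 0.0f;
        if (h_out[i] != expected) ok = false;
    }
    return ok;
}
```

Calling `run_shift_demo()` from `main` should return `true` on a working device. The sketch uses `cudaDeviceSynchronize()`; toolkits from the time of this thread would use `cudaThreadSynchronize()` instead.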

eyalhir74 · July 2, 2009, 6:51am

As Sarnath suggested, you should review your code. You could also try posting it here.

A fast way, by the way, is to comment out your entire kernel body, check that it runs OK, and then re-enable the kernel lines a few at a time until you reach the offending code or loop. Probably something is causing a deadlock or an infinite loop.

In this process, make sure your kernel doesn't get optimized out by the dead-code optimizer.

eyal
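A minimal sketch of the bisection approach described above, under stated assumptions: the `ENABLE_STAGE2` switch, the stage contents and all function names are hypothetical. The point it shows is that the final store to global memory stays in place while stages are switched off, so the compiler cannot discard the remaining computation as dead code.

```cuda
#include <cuda_runtime.h>

// Hypothetical switch: set to 1 to re-enable the suspect part of the kernel.
#define ENABLE_STAGE2 0

// Shared by device and host so the CPU check follows the same switch.
__host__ __device__ float run_stages(float x) {
    float acc = x;
    for (int k = 0; k < 4; ++k) acc = acc * 0.5f + 1.0f;  // stage 1: known good
#if ENABLE_STAGE2
    for (int k = 0; k < 4; ++k) acc += (float)k;          // stage 2: suspect code
#endif
    return acc;
}

__global__ void staged_kernel(const float *in, float *out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    // The store below is never commented out: without a visible side effect
    // the compiler may remove the whole computation.
    out[tid] = run_stages(in[tid]);
}

// Host helper: runs the kernel and compares against the CPU result.
bool run_staged_demo() {
    const int n = 256;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    staged_kernel<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    ok = ok && (cudaMemcpy(h_out, d_out, n * sizeof(float),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(d_in);
    cudaFree(d_out);

    for (int i = 0; i < n && ok; ++i) {
        if (h_out[i] != run_stages(h_in[i])) ok = false;
    }
    return ok;
}
```

If the kernel returns promptly with stage 2 disabled and hangs with it enabled, the problem is somewhere in stage 2, and the same split can be repeated inside that stage.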

nitin.life · July 2, 2009, 11:36am

Thanks… for the advice :)… found the bug

Yes, it is finally working, though not very well… I just achieved 12 GFLOPS in double precision, a 6x speedup.

:(

No, as there is no inter-thread data communication, I don't use __syncthreads(). Thanks for the input :) .

Yup, that helped, actually… the bug was in my code: an unallocated shared memory access inside a loop. Lame me… I was hoping for more dramatic performance. I found that I am losing a lot of performance while accessing (reading and writing) device memory. I have to access 546 + 42 elements from device memory 12 times PER THREAD :( for my current algorithm… I guess that is what is killing my application's speed, even though I have (PER THREAD):

40*13 (function evaluations)

12*2*42*13 (42-by-13 mat-vec product, done one column after another, 12 times)

11*6*6*6 (6-by-6 mat-mul, 11 times)

flops / thread…

THAT'S A LOT OF FLOPS, I KNOW, so I thought I should get a big speedup over the CPU, but it also requires a lot of data transfer per thread.

I guess I have to make the parallelism more fine-grained so that less data is transferred per thread.
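The flops-versus-data-transfer reasoning in this post can be put into numbers. The host-only sketch below rests on several assumptions: it reads the per-thread flop counts as 40*13, 12*2*42*13 and 11*6*6*6 (one interpretation of the figures in the post), it treats every device-memory element as an 8-byte double, and it assumes no caching or reuse of the 546 + 42 elements touched in each of the 12 passes. The function names are made up for the example.

```cuda
#include <cstdio>

// Per-thread flop count, under the reading of the post described above.
constexpr long flops_per_thread() {
    return 40L * 13            // function evaluations
         + 12L * 2 * 42 * 13   // 42-by-13 mat-vec products, 12 times
         + 11L * 6 * 6 * 6;    // 6-by-6 mat-muls, 11 times
}

// Per-thread device-memory traffic: 546 + 42 doubles, 12 times,
// assuming 8 bytes per element and no reuse between passes.
constexpr long bytes_per_thread() {
    return (546L + 42) * 12 * 8;
}

// Arithmetic intensity in flops per byte of device-memory traffic.
constexpr double flops_per_byte() {
    return (double)flops_per_thread() / (double)bytes_per_thread();
}

// Memory bandwidth (GB/s) implied by a given flop rate (GFLOPS)
// at this arithmetic intensity.
constexpr double implied_gb_per_s(double gflops) {
    return gflops / flops_per_byte();
}

// Prints the estimate; call from main.
void print_intensity_estimate() {
    std::printf("flops/thread: %ld\n", flops_per_thread());
    std::printf("bytes/thread: %ld\n", bytes_per_thread());
    std::printf("flops/byte:   %.3f\n", flops_per_byte());
    std::printf("GB/s at 12 GFLOPS: %.1f\n", implied_gb_per_s(12.0));
}
```

Under these assumptions each thread performs 16000 flops against 56448 bytes of traffic, about 0.28 flops per byte, so the reported 12 GFLOPS would correspond to roughly 42 GB/s of device-memory traffic. That is consistent with the poster's guess that memory access, not arithmetic, limits the speed.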

How much is the kernel launch overhead?

I will try multiple kernel launches to achieve this somehow.

Thanks, all, for listening to me.

NA
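The post does not show the faulty code, so the following is only one way an "unallocated shared memory access" can arise, offered as a sketch: dynamically sized shared memory is declared with `extern __shared__`, and its size comes from the third launch parameter. If that parameter is omitted, zero bytes are allocated and every access is out of bounds. The kernel, the `SCRATCH_PER_THREAD` constant and the helper name are assumptions for illustration.

```cuda
#include <cuda_runtime.h>

#define SCRATCH_PER_THREAD 6  // hypothetical per-thread workspace, in doubles

// Each thread uses its own private slice of shared memory; the threads
// do not communicate, so no __syncthreads() is needed.
__global__ void scratch_kernel(const double *in, double *out, int n) {
    extern __shared__ double scratch[];  // size is set at launch, not here
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    double *mine = scratch + threadIdx.x * SCRATCH_PER_THREAD;
    for (int k = 0; k < SCRATCH_PER_THREAD; ++k) mine[k] = in[tid] * k;

    double sum = 0.0;
    for (int k = 0; k < SCRATCH_PER_THREAD; ++k) sum += mine[k];
    out[tid] = sum;                      // equals 15 * in[tid]
}

// Host helper: launches with the shared-memory size and checks the result.
bool run_scratch_demo() {
    const int n = 200;
    const int threads = 64;
    double h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (double)i;

    double *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(double)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(double)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(double), cudaMemcpyHostToDevice);

    int blocks = (n + threads - 1) / threads;
    // Third launch parameter: bytes of dynamic shared memory per block.
    // Leaving it out would allocate nothing for scratch[].
    size_t shmem = threads * SCRATCH_PER_THREAD * sizeof(double);
    scratch_kernel<<<blocks, threads, shmem>>>(d_in, d_out, n);

    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    ok = ok && (cudaMemcpy(h_out, d_out, n * sizeof(double),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(d_in);
    cudaFree(d_out);

    for (int i = 0; i < n && ok; ++i) {
        if (h_out[i] != 15.0 * h_in[i]) ok = false;
    }
    return ok;
}
```

Checking the return value of the synchronize call after each launch, as the helper does, is a cheap way to catch this class of bug early, since many out-of-bounds accesses surface as a launch error rather than a silent hang.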

nitin.life · July 2, 2009, 11:41am

Also, I am using NVCC version 2.0 and the 177.67 driver…

Is there any major benefit in upgrading, in terms of speed and so on? (I am aware of the new features, but I am not using any of them for now.)

It would be some hassle, as I am accessing the machine via a remote connection (for 5 days, until I am back in the US).

Thanks,

NA

Sarnath · July 2, 2009, 11:42am

Kernel launch overhead is very, very minimal.

Consider queuing kernel launches to reduce this overhead even further.
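Rather than relying on a quoted figure, the overhead can be measured on the machine in question. The sketch below is one way to do that, with hypothetical names: it queues a batch of empty kernel launches back to back and times the batch with CUDA events, which gives an average cost per queued launch. It does not measure the latency of a single isolated launch followed by a synchronize, which is typically higher.

```cuda
#include <cuda_runtime.h>

// A kernel that does nothing, so the measured time is launch cost only.
__global__ void empty_kernel() {}

// Returns the average time per queued launch in milliseconds,
// or a negative value if the measurement failed.
float average_launch_ms(int launches) {
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    // Warm-up launch so one-time initialisation is not included.
    empty_kernel<<<1, 1>>>();
    cudaDeviceSynchronize();

    cudaEventRecord(start);
    for (int i = 0; i < launches; ++i) {
        empty_kernel<<<1, 1>>>();   // launches are asynchronous and queue up
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);     // wait until the whole batch has finished

    float ms = 0.0f;
    bool ok = (cudaEventElapsedTime(&ms, start, stop) == cudaSuccess);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ok ? ms / launches : -1.0f;
}
```

For example, `average_launch_ms(1000)` returns the mean over 1000 queued launches. Comparing that number with the run time of the real kernel shows whether splitting the work into several launches is affordable.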

nitin.life · July 2, 2009, 12:25pm

Okay, thanks, I will surely do that then…

What about the driver version?

"Also, I am using NVCC version 2.0 and the 177.67 driver…

Is there any major benefit in upgrading, in terms of speed and so on? (I am aware of the new features, but I am not using any of them for now.)"

Thanks…

NA
