CUDA kernel slow / times out when applying values to results array

sharc · April 17, 2015, 4:24pm

Hi

I am new to CUDA and have a question that maybe one of you can help with.

I am basically doing a triangle-triangle (tri-tri) intersection test using CUDA.

In the kernel code, each triangle searches a list of other triangles and finds the closest one
based on centre-to-centre distance.

The closest distance found so far is kept in a float called mgdist, which is updated inside the loop.

If I write it to the results array after the loop, say d_close[i] = mgdist, the code runs really slowly.
If I instead write, say, the first node of the triangle it points to, e.g. d_close[i] = node0[i],
then it is fine.

Does this ring a bell with anyone?

Surely the compiler is not so clever that it skips the loop when mgdist is never written to a
device results array?

Cheers Guys

njuffa · April 17, 2015, 4:41pm

The CUDA compiler will aggressively eliminate dead code, that is, all computation that does not ultimately contribute to data written to global memory. From your description, it sounds like this is what is causing the behavior you observe: when you store node0[i] instead of mgdist, the result of the search loop is never used, so the compiler is free to remove the loop, and the "fast" variant may simply be doing far less work. It is impossible to know for sure without buildable and runnable code that reproduces the issue. You can disassemble the generated binary with cuobjdump --dump-sass to find out what happens to the code for each of the two variants.
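
To make the two variants concrete, here is a minimal sketch of what they might look like. This is not the original poster's code: the float3 centre layout, the brute-force loop, and the names nearest_centre_dist, closest_store_dist, closest_store_node and run_variant are all assumptions made up for illustration. Only d_close, node0 and mgdist come from the question. Error checking is left out to keep it short.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical data layout: one centre per triangle, stored as float3.
// Brute-force search for the nearest other centre; returns that distance.
__device__ float nearest_centre_dist(const float3* centres, int n, int i)
{
    float3 c = centres[i];
    float mgdist = FLT_MAX;
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = centres[j].x - c.x;
        float dy = centres[j].y - c.y;
        float dz = centres[j].z - c.z;
        mgdist = fminf(mgdist, sqrtf(dx * dx + dy * dy + dz * dz));
    }
    return mgdist;
}

// Variant A: mgdist reaches global memory, so the loop has to be executed.
__global__ void closest_store_dist(const float3* centres, int n, float* d_close)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    d_close[i] = nearest_centre_dist(centres, n, i);
}

// Variant B: mgdist is computed but never stored. Nothing observable depends
// on it, so the compiler is free to drop the whole loop as dead code.
__global__ void closest_store_node(const float3* centres, const float* node0,
                                   int n, float* d_close)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float mgdist = nearest_centre_dist(centres, n, i);
    (void)mgdist;              // result unused on purpose
    d_close[i] = node0[i];
}

// Hypothetical host helper: runs one variant and returns d_close.
std::vector<float> run_variant(const std::vector<float3>& centres,
                               const std::vector<float>& node0,
                               bool store_dist)
{
    int n = static_cast<int>(centres.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    float3* d_centres = nullptr;
    float*  d_node0   = nullptr;
    float*  d_close   = nullptr;
    cudaMalloc(&d_centres, n * sizeof(float3));
    cudaMalloc(&d_node0,   n * sizeof(float));
    cudaMalloc(&d_close,   n * sizeof(float));
    cudaMemcpy(d_centres, centres.data(), n * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(d_node0,   node0.data(),   n * sizeof(float),  cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks  = (n + threads - 1) / threads;
    if (store_dist)
        closest_store_dist<<<blocks, threads>>>(d_centres, n, d_close);
    else
        closest_store_node<<<blocks, threads>>>(d_centres, d_node0, n, d_close);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out.data(), d_close, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_centres);
    cudaFree(d_node0);
    cudaFree(d_close);
    return out;
}
```

Assuming the file is saved as closest.cu (a made-up name), something like nvcc -cubin -o closest.cubin closest.cu followed by cuobjdump --dump-sass closest.cubin should show the machine code for both kernels. If dead-code elimination is what is happening, the listing for closest_store_node would be expected to be much shorter and contain no loop, but the actual output depends on the compiler version and target architecture.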
