Loop unrolling

thecoder3 · April 25, 2012, 6:32pm

Hi,
I have a simple code and I want to prevent the compiler from unrolling its two loops, so I added #pragma unroll 1. The running time is 8 seconds. Then I unrolled the inner loop with #pragma unroll and again got 8 seconds. If I unroll both loops, I get 8 seconds again. If I remove the pragmas, I get the same time again.

Can anyone tell me why I can't prevent the compiler from unrolling? I am using CUDA 4.0 and compute capability 2.0. The card is a GeForce GTX 470.

Thanks.
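
For readers following along, here is a minimal sketch of the setup described above. The original code was not posted, so the kernel name `nested_loops`, the loop bounds, the arithmetic, and the host helper `run_nested_loops` are all assumptions made for illustration. The only part taken from the post is the placement of `#pragma unroll 1` and `#pragma unroll`, each directly before the loop it applies to.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: name, loop bounds, and arithmetic are illustrative only.
__global__ void nested_loops(const float *in, float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    float x = in[tid];
    float acc = 0.0f;

    // Request: keep the outer loop rolled.
    #pragma unroll 1
    for (int i = 0; i < 16; ++i) {
        // Request: fully unroll the inner loop (its trip count is a constant).
        #pragma unroll
        for (int j = 0; j < 8; ++j) {
            acc += x * static_cast<float>(i * j);
        }
    }
    out[tid] = acc;
}

// Hypothetical host helper: runs a single thread on input x and returns its result.
// Error checking is omitted to keep the sketch short.
float run_nested_loops(float x)
{
    float *d_in = nullptr, *d_out = nullptr, result = -1.0f;
    cudaMalloc(&d_in, sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, &x, sizeof(float), cudaMemcpyHostToDevice);
    nested_loops<<<1, 1>>>(d_in, d_out, 1);
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

The pragmas are requests that affect code generation only. The computed result is the same whichever way the loops are compiled, and, as in the post, the runtime may not change measurably either.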

pasoleatis · April 25, 2012, 6:57pm

Is it possible that the loops are too large? Can you check the number of registers used with and without the unrolling?
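
One way to make that register check concrete is sketched below. It queries the per-thread register count through the runtime call `cudaFuncGetAttributes`. The two kernels and the helper `registers_per_thread` are hypothetical stand-ins, since the original code was not posted. Compiling with `nvcc -Xptxas -v` prints the same register information at build time.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernels: identical work, different unroll requests.
__global__ void scale_rolled(const float *in, float *out)
{
    float x = in[threadIdx.x];
    float acc = 0.0f;
    // Request: do not unroll.
    #pragma unroll 1
    for (int k = 0; k < 32; ++k) {
        acc += x * static_cast<float>(k);
    }
    out[threadIdx.x] = acc;
}

__global__ void scale_unrolled(const float *in, float *out)
{
    float x = in[threadIdx.x];
    float acc = 0.0f;
    // Request: unroll completely.
    #pragma unroll
    for (int k = 0; k < 32; ++k) {
        acc += x * static_cast<float>(k);
    }
    out[threadIdx.x] = acc;
}

// Hypothetical helper: returns the registers per thread the compiler assigned
// to the given kernel, or -1 if the query fails.
template <typename Kernel>
int registers_per_thread(Kernel kernel)
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, kernel) != cudaSuccess) {
        return -1;
    }
    return attr.numRegs;
}
```

Calling `registers_per_thread(scale_rolled)` and `registers_per_thread(scale_unrolled)` and comparing the two numbers gives a first hint. A difference suggests the pragmas changed the generated code, while identical counts do not prove that they were ignored.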

thecoder3 · April 25, 2012, 8:06pm

They are not. I have tested the same code on my laptop and it works. On my desktop it does not. But my laptop has compute capability 2.1 and CUDA 3.2. I don't know whether that makes any difference.

tera · April 25, 2012, 10:33pm

Have you confirmed the unrolling by looking at the object code, or are you only deducing this from the runtimes?
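
A sketch of how the object code could be checked, assuming a single-file build. The file name `unroll_test.cu`, the macro `NO_UNROLL`, the kernel `accumulate`, and the helper `run_accumulate` are hypothetical. The commands in the leading comment use `nvcc -cubin` and `cuobjdump -sass`, and `-arch=sm_20` matches the GTX 470 mentioned in the thread (newer toolkits need a newer target).

```cuda
// Build both variants and dump their machine code (SASS), then compare:
//   nvcc -arch=sm_20 -cubin -DNO_UNROLL -o rolled.cubin   unroll_test.cu
//   nvcc -arch=sm_20 -cubin             -o unrolled.cubin unroll_test.cu
//   cuobjdump -sass rolled.cubin
//   cuobjdump -sass unrolled.cubin
#include <cuda_runtime.h>

// Hypothetical kernel: the macro NO_UNROLL selects which request is made.
__global__ void accumulate(const float *in, float *out)
{
    float x = in[threadIdx.x];
    float acc = 0.0f;

#ifdef NO_UNROLL
    #pragma unroll 1
#else
    #pragma unroll
#endif
    for (int k = 0; k < 32; ++k) {
        acc += x * static_cast<float>(k);
    }
    out[threadIdx.x] = acc;
}

// Hypothetical host helper: runs one thread on input x and returns its result.
// Error checking is omitted to keep the sketch short.
float run_accumulate(float x)
{
    float *d_in = nullptr, *d_out = nullptr, result = -1.0f;
    cudaMalloc(&d_in, sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, &x, sizeof(float), cudaMemcpyHostToDevice);
    accumulate<<<1, 1>>>(d_in, d_out);
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

In the dumps, a loop that stayed rolled typically shows a branch back to an earlier address, while a fully unrolled loop typically appears as a long straight run of repeated arithmetic instructions. Identical runtimes alone cannot distinguish the two cases, for example when the kernel is limited by memory traffic rather than by loop overhead.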
