How does #pragma unroll work?

Hi, can someone tell me whether #pragma unroll works in this code?

#pragma unroll
	for (int i = 0; i < 10; i++)
		float a = b[threadIdx.x];

I think not. The unrolled loop would just do the same thing several times, and it wouldn't do any integer operations, only memory accesses.


After unrolling, you wind up with ten copies of the loop body. Because those copies are redundant, the compiler will likely remove much or all of the code (is ‘a’ used anywhere else? The way the code is now, the answer seems to be “no”). You can easily check this by inspecting the intermediate PTX and disassembling the generated machine code with cuobjdump.
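
For comparison, here is a minimal sketch of a variant where the unrolled loads cannot be thrown away, because each iteration reads a different element and the result is written back to memory. The kernel name sum_unrolled, the host helper run_sum_unrolled, the output array out, and the indexing scheme are all assumptions made for illustration; none of them come from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the original post): each thread sums
// ten elements of b and stores the sum, so the loads have a visible effect.
__global__ void sum_unrolled(const float *b, float *out)
{
    float acc = 0.0f;
#pragma unroll
    for (int i = 0; i < 10; i++) {
        // The index depends on i, so the ten loads are distinct.
        acc += b[i * blockDim.x + threadIdx.x];
    }
    out[threadIdx.x] = acc;  // the store keeps the loop from being eliminated
}

// Hypothetical host helper: fills b with ones, launches one block,
// and returns the value computed by thread 0.
// Error checking is omitted to keep the sketch short.
float run_sum_unrolled()
{
    constexpr int threads = 32;
    constexpr int iters = 10;
    float h_b[threads * iters];
    float h_out[threads] = {0.0f};
    for (int k = 0; k < threads * iters; k++) {
        h_b[k] = 1.0f;
    }

    float *d_b = nullptr;
    float *d_out = nullptr;
    cudaMalloc(&d_b, sizeof(h_b));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_b, h_b, sizeof(h_b), cudaMemcpyHostToDevice);

    sum_unrolled<<<1, threads>>>(d_b, d_out);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_b);
    cudaFree(d_out);
    return h_out[0];
}

int main()
{
    printf("out[0] = %f\n", run_sum_unrolled());
    return 0;
}
```

Assuming the file is saved as unroll.cu (a made-up name), something like "nvcc -ptx unroll.cu" should emit the PTX, and "cuobjdump -sass" on the compiled executable should show the machine code. Whether you actually see ten separate loads there depends on the compiler version and target architecture, so treat the disassembly as the final word rather than this sketch.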

Thanks a lot, njuffa.