Hello,
When compiling this kernel
__global__ void kernel(int *f) {
    int i;
    #pragma unroll
    for(i=0; i <= 8; i++) {
        f[i] = 48 >> i;
        if(i > 32)
            __threadfence_block();
    }
}
I’m getting this advisory from the compiler:
/home/nikratio/test.cu(5): Advisory: Loop was not unrolled, unexpected call OPs
Removing the if clause fixes the problem. But why is that? Unrolling this seems really straightforward to me…
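For reference, here is a minimal sketch of what full unrolling of this loop would amount to. Since i only runs from 0 to 8, the test i > 32 is never true, so the fence call is dead code and only nine constant stores remain. The HOST_DEVICE macro and the names store_rolled and store_unrolled are illustrative assumptions, not part of my kernel; the macro is only there so the sketch also builds with a plain C++ compiler.

```cpp
// Assumption: HOST_DEVICE is an illustrative macro, not part of the original kernel.
// It expands to __host__ __device__ under nvcc and to nothing otherwise.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Loop from the kernel with the dead branch removed (i never exceeds 8).
HOST_DEVICE inline void store_rolled(int *f) {
    for (int i = 0; i <= 8; i++) {
        f[i] = 48 >> i;
    }
}

// Hand-unrolled equivalent: nine stores, one per loop iteration.
HOST_DEVICE inline void store_unrolled(int *f) {
    f[0] = 48 >> 0;  // 48
    f[1] = 48 >> 1;  // 24
    f[2] = 48 >> 2;  // 12
    f[3] = 48 >> 3;  // 6
    f[4] = 48 >> 4;  // 3
    f[5] = 48 >> 5;  // 1
    f[6] = 48 >> 6;  // 0
    f[7] = 48 >> 7;  // 0
    f[8] = 48 >> 8;  // 0
}
```

Both functions write the same nine values, which is the result I would expect the compiler to reach on its own.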
njuffa
The “unexpected call OP” in this case is the call to __threadfence_block().
Yes, but why is this preventing unrolling of the loop?
LSChien
I used cuobjdump to inspect the generated assembly. nvcc removes the __threadfence_block() call
and produces code equivalent to the following:
int i = 0 ;
while ( 1 ) {
    int k = 48 ;
    f[0] = k >> i ;   // store through the moving pointer
    f++ ;
    i++ ;
    if ( i > 8 ) break ;
}
I have no idea why nvcc cannot unroll the loop.
However, if you comment out
    if ( i > 32 )
        __threadfence_block();
then nvcc can unroll the loop.
I think the problem is __threadfence_block() itself. Even with the call made unconditional, the loop
#pragma unroll
for(int i = 0 ; i <= 8 ; i++){
    f[i] = 48 >> i ;
    __threadfence_block();
}
cannot be unrolled. __threadfence() has the same problem, but __syncthreads() is fine.
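One possible workaround, as a sketch only: expand the iterations with template recursion so that there is no loop left for #pragma unroll to handle. The names Unroll, fence_block, kernel_unrolled and the HOST_DEVICE macro are hypothetical and not from the code above. I have not checked this against the nvcc version used in this thread, and whether the recursive calls end up inlined is still up to the compiler.

```cpp
// Assumption: HOST_DEVICE and fence_block are illustrative helpers, not CUDA APIs.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Calls __threadfence_block() in device code; does nothing on the host.
HOST_DEVICE inline void fence_block() {
#ifdef __CUDA_ARCH__
    __threadfence_block();
#endif
}

// Compile-time expansion of iterations I..Last (inclusive).
template <int I, int Last>
struct Unroll {
    HOST_DEVICE static void run(int *f) {
        f[I] = 48 >> I;
        fence_block();
        Unroll<I + 1, Last>::run(f);  // next iteration, resolved at compile time
    }
};

// Final iteration: ends the recursion.
template <int Last>
struct Unroll<Last, Last> {
    HOST_DEVICE static void run(int *f) {
        f[Last] = 48 >> Last;
        fence_block();
    }
};

#ifdef __CUDACC__
// Same stores as the loop for i = 0..8, written without a loop.
__global__ void kernel_unrolled(int *f) {
    Unroll<0, 8>::run(f);
}
#endif
```

The guard around the kernel only lets the sketch build with a plain C++ compiler too; under nvcc, Unroll<0, 8>::run(f) is instantiated as nine store-plus-fence steps.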