Hi, in my kernel I need to divide two integers, and since this operation has a large performance overhead I tried to follow the suggestion in the programming guide, but I haven't managed to use the log2 function correctly. Here is what I tried. The CUDA programming guide says, and I quote:
[codebox]Integer division and modulo operation are particularly costly and should be avoided
if possible or replaced with bitwise operations whenever possible: If n is a power of
2, (i/n) is equivalent to (i>>log2(n)) and (i%n) is equivalent to (i&(n-1));
the compiler will perform these conversions if n is literal.[/codebox]
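To make the quoted rule concrete, here is a minimal sketch in plain host-side C++ that checks both identities for one power-of-two divisor. The helper names `ilog2` and `identities_hold` are hypothetical (they are not part of CUDA or the guide), and the sketch assumes unsigned values: for negative signed `i`, division truncates toward zero while a shift does not, so the identities are only checked for non-negative operands. The point it illustrates is that the shift count has to be an integer.

```cpp
#include <cassert>

// Hypothetical helper (not from the CUDA guide): integer log2 of a power of two.
// The result is an unsigned int, so it is a valid right-hand operand for >>.
static unsigned int ilog2(unsigned int n) {
    assert(n != 0 && (n & (n - 1)) == 0);  // precondition: n is a power of two
    unsigned int shift = 0;
    while (n > 1) {  // halve n until it reaches 1, counting the steps
        n >>= 1;
        ++shift;
    }
    return shift;
}

// Hypothetical check: verifies i/n == i>>log2(n) and i%n == i&(n-1)
// for every i in [0, limit), with n a power of two.
static bool identities_hold(unsigned int n, unsigned int limit) {
    const unsigned int shift = ilog2(n);
    for (unsigned int i = 0; i < limit; ++i) {
        if ((i / n) != (i >> shift)) return false;   // division vs. shift
        if ((i % n) != (i & (n - 1))) return false;  // modulo vs. mask
    }
    return true;
}
```

For example, `ilog2(32u)` gives 5, so `i >> 5` matches `i / 32` for unsigned `i`.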
Even so, I tried it with a variable, like this:
index = (i-1)>>log2((float)blockDim.y);
but of course the compiler gives me this error:
error: expression must have integral or enum type
Then, taking into account that the last part of the suggestion says ‘the compiler will perform these conversions if n is literal’, I decided to restrict my kernel configuration so that this value is always 32, and compiled with this line:
index = (i-1)>>log2((float)32);
but again the compiler gave me this error:
error: expression must have integral or enum type
Can anybody point out what I’m doing wrong? I don’t see how to use the log2 function correctly.
Best regards
Lermy