Improving division in kernel: How to use log2 in a kernel?

Hi, in my kernel I need to divide two numbers, and since this operation has a huge performance overhead I tried to follow the suggestion in the programming guide, but somehow I didn’t manage to use the log2 function correctly. This is what I tried, following the CUDA programming guide, which I quote:

[codebox]Integer division and modulo operation are particularly costly and should be avoided if possible or replaced with bitwise operations whenever possible: If n is a power of 2, (i/n) is equivalent to (i>>log2(n)) and (i%n) is equivalent to (i&(n-1)); the compiler will perform these conversions if n is literal.[/codebox]

Even so, I tried it with a variable, like this:

index = (i-1)>>log2((float)blockDim.y);

but of course the compiler gives me this error:

error: expression must have integral or enum type

Then, taking into account that the last part of the suggestion says ‘the compiler will perform these conversions if n is literal’, I decided to restrict my kernel configuration to always be 32, and compiled with this line:

index = (i-1)>>log2((float)32);

but again the compiler gave me this error:

error: expression must have integral or enum type

Can anybody point out what I’m doing wrong? I don’t see how to use the log2 function correctly.

Best regards

Lermy

The way I do it is through the __ffs (find first set bit) intrinsic.
So for example, if I have

int i = 8;
int j = 32;

And I want to do j/i (which only works this way when i is a power of 2), I could write it like
int temp = __ffs(i)-1; //which would give me a 3
int result = j >> temp; //right shift 3 times
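
To tie this back to the original question: log2() returns a floating-point value, while >> needs an integer on both sides, which is why the compiler complains. Below is a minimal sketch of the __ffs approach applied to blockDim.y, assuming blockDim.y is a power of two and the values being divided are non-negative (hence unsigned). The kernel name divModByBlockDimY and the host helper shiftMatchesDivision are made up for this illustration and are not from the programming guide.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only.
// Assumes a launch with one block of shape (1, blockDim.y), blockDim.y a power of two.
__global__ void divModByBlockDimY(const unsigned int *in, unsigned int *quot,
                                  unsigned int *rem)
{
    unsigned int idx = threadIdx.y;

    // __ffs returns the 1-based position of the lowest set bit, so for a
    // power of two, __ffs(n) - 1 is the integer log2(n).
    int shift = __ffs(blockDim.y) - 1;
    unsigned int mask = blockDim.y - 1;

    quot[idx] = in[idx] >> shift;   // same result as in[idx] / blockDim.y
    rem[idx]  = in[idx] & mask;     // same result as in[idx] % blockDim.y
}

// Hypothetical host helper: runs the kernel and compares against plain / and %.
// blockY must be a power of two within the device's block-size limit.
bool shiftMatchesDivision(unsigned int blockY)
{
    unsigned int *in = nullptr, *quot = nullptr, *rem = nullptr;
    size_t bytes = blockY * sizeof(unsigned int);

    if (cudaMallocManaged(&in, bytes) != cudaSuccess ||
        cudaMallocManaged(&quot, bytes) != cudaSuccess ||
        cudaMallocManaged(&rem, bytes) != cudaSuccess) {
        cudaFree(in); cudaFree(quot); cudaFree(rem);
        return false;
    }

    // Arbitrary non-negative test values.
    for (unsigned int i = 0; i < blockY; ++i) in[i] = 7u * i + 3u;

    divModByBlockDimY<<<1, dim3(1, blockY)>>>(in, quot, rem);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    // Check every element against ordinary integer division and modulo.
    for (unsigned int i = 0; ok && i < blockY; ++i)
        ok = (quot[i] == in[i] / blockY) && (rem[i] == in[i] % blockY);

    cudaFree(in); cudaFree(quot); cudaFree(rem);
    return ok;
}
```

Calling shiftMatchesDivision(32) from main() should return true on a working device. If the divisor is a compile-time constant such as 32, writing plain (i-1)/32 is probably enough, since the guide says the compiler performs the conversion itself when n is a literal.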