Hi,
I’m looking for a good way to detect local minima/maxima.
Let’s say that each thread has five float values in shared memory. I think I should be able to combine the comparisons with bitwise operators to check for a min/max, and that this could be fast.
bool min = false;
float threadArr[5];
fillArray(threadArr, globalMem);
float center = threadArr[2];
min = (center < threadArr[0]) & (center < threadArr[1]) & (center < threadArr[3]) & (center < threadArr[4]);
My idea is that the code above would be better than the version below, which uses logical && operators, because the bitwise operators don’t cause branching:
min = (center < threadArr[0]) && (center < threadArr[1]) && (center < threadArr[3]) && (center < threadArr[4]);
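To make the comparison concrete, here is a minimal sketch of the two variants as small helper functions. The names isLocalMinBitwise and isLocalMinLogical and the HD macro are made up for this post (they are not from any library), and I'm assuming the five values are already in an array. It only shows that both forms give the same result; it says nothing about which one the compiler turns into faster code.

```cpp
// Hypothetical helper macro: plain C++ by default, and callable from
// device code when compiled with nvcc.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Bitwise form: all four comparisons are always evaluated, and their
// 0/1 results are combined with &.
HD inline bool isLocalMinBitwise(const float* v) {
    const float center = v[2];
    return (center < v[0]) & (center < v[1]) & (center < v[3]) & (center < v[4]);
}

// Logical form: && is allowed to skip the remaining comparisons as soon
// as one of them is false.
HD inline bool isLocalMinLogical(const float* v) {
    const float center = v[2];
    return (center < v[0]) && (center < v[1]) && (center < v[3]) && (center < v[4]);
}
```

For the values {3, 2, 1, 2, 3} both functions return true, and for {3, 2, 2, 2, 3} both return false, since the comparisons have no side effects and each one yields only 0 or 1.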
- Am I crazy?
- Do I need the parentheses around each comparison when I combine them with bitwise &?
- Are the logical operators && and || short-circuiting in CUDA?