Can anybody explain warp vote functions?

Hi, the only information the CUDA Programming Guide gives about warp vote functions is their signature, e.g.

[codebox]int __all(int predicate);[/codebox]

But what are the semantics of this function? What is predicate? What happens if a thread calls __all()?

Greetings, Uwe

I believe it can be called like this, for example:

int val = __all(threadIdx.x < 112);

val will be 1 if every thread in the calling thread's warp has threadIdx.x < 112.

If any thread in that warp has a threadIdx.x of 112 or higher, it will return 0.

So, for each range of threads, val is:

0 - 31 : 1

32 - 63 : 1

64 - 95 : 1

96 - 127 : 0

128 - 159 : 0
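
To make the list above concrete, here is a minimal sketch of a kernel that records the vote result once per warp. It assumes CUDA 9 or later, where __all(predicate) is written __all_sync(mask, predicate) with an explicit lane mask. The kernel name warp_all_below and the host helper vote_per_warp are made up for illustration, and error checking is left out.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each warp votes on whether every one of its threads has threadIdx.x < threshold.
// Lane 0 of each warp stores the warp-wide result as 1 or 0.
__global__ void warp_all_below(int threshold, int *warp_result)
{
    // Assumption: CUDA 9+ API. The full mask names all 32 lanes of the warp.
    int val = __all_sync(0xffffffffu, (int)threadIdx.x < threshold);

    if (threadIdx.x % warpSize == 0) {
        // The documentation only promises "non-zero", so normalize to 1/0.
        warp_result[threadIdx.x / warpSize] = val ? 1 : 0;
    }
}

// Hypothetical host helper: launches one block of num_threads threads
// (assumed to be a multiple of 32) and returns one vote result per warp.
std::vector<int> vote_per_warp(int threshold, int num_threads)
{
    int num_warps = num_threads / 32;
    std::vector<int> host(num_warps, -1);

    int *dev = nullptr;
    cudaMalloc(&dev, num_warps * sizeof(int));
    warp_all_below<<<1, num_threads>>>(threshold, dev);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(host.data(), dev, num_warps * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

With a threshold of 112 and one block of 160 threads, vote_per_warp(112, 160) should return 1, 1, 1, 0, 0, which matches the list above.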

As far as I understand, this can be used to minimize warp divergence.

Thanks, now I got it.


So, does this mean that __all is an execution barrier like __syncthreads()?

No. The voting is warp-wide, and all threads within a warp are (by definition) synced anyway so no barrier is needed.

Thank you Steve…!

While we are on the topic of warp voting functions, is there any information about the latency or the number of instructions that this instruction translates to? (e.g. is it an expensive function or anything like that?)

Is there an example of a warp vote function in the SDK? Basically, I would like to know how it can be used to minimize warp divergence…
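
Not an SDK sample, but one possible pattern looks like the sketch below: branch on the vote result instead of on the per-thread condition, so that every lane of a warp takes the same side of the branch. It assumes the CUDA 9+ __all_sync form with a full mask and a block size that is a multiple of 32. The workload (double non-negative values, negate negative ones) and the names double_or_negate and run_double_or_negate are invented for illustration.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Doubles non-negative inputs and negates negative ones.
// The branch is taken on the warp-wide vote, so it is uniform within a warp.
__global__ void double_or_negate(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    bool in_range = i < n;
    float x = in_range ? in[i] : 0.0f;

    // Out-of-range lanes vote "true" so they never force the general path.
    bool simple = !in_range || x >= 0.0f;

    float y;
    if (__all_sync(0xffffffffu, simple)) {
        // Every lane of this warp has a non-negative value: one uniform path.
        y = 2.0f * x;
    } else {
        // Mixed warp: all lanes take this path and select per lane.
        y = (x >= 0.0f) ? 2.0f * x : -x;
    }

    if (in_range) {
        out[i] = y;
    }
}

// Hypothetical host helper, no error checking.
std::vector<float> run_double_or_negate(const std::vector<float> &in)
{
    int n = static_cast<int>(in.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const int block = 128;  // multiple of 32, so every warp is full
    double_or_negate<<<(n + block - 1) / block, block>>>(d_in, d_out, n);

    cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

For a branch this small the compiler would probably use predication anyway, so the pattern only pays off when the two paths differ a lot in cost.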

I guess a warp vote function will lead to a deadlock if it is used in a diverged warp, e.g.

if(threadIdx.x % warpSize == 0) {

if(__any(…)) …

} else {


Could anyone from NVIDIA explain this topic?
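
One way to sidestep the question is to hoist the vote out of the divergent branch, so that all lanes execute it together and only the leader lane uses the result. The sketch below assumes the CUDA 9+ __any_sync form, whose documentation requires every non-exited thread named in the mask to execute the same call. The names leader_reports_any and any_per_warp are hypothetical, and the helper assumes a single block whose size is a multiple of 32.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Every lane takes part in the vote before any divergence;
// only lane 0 of each warp acts on the result afterwards.
__global__ void leader_reports_any(const int *flags, int *warp_any)
{
    int lane = threadIdx.x % warpSize;

    // Executed by all 32 lanes, outside the divergent branch.
    int any = __any_sync(0xffffffffu, flags[threadIdx.x] != 0);

    if (lane == 0) {
        warp_any[threadIdx.x / warpSize] = any ? 1 : 0;
    }
}

// Hypothetical host helper: flags.size() is assumed to be a multiple of 32
// and at most 1024, so a single block covers it. No error checking.
std::vector<int> any_per_warp(const std::vector<int> &flags)
{
    int n = static_cast<int>(flags.size());
    int num_warps = n / 32;
    std::vector<int> host(num_warps, -1);
    if (num_warps == 0) return host;

    int *d_flags = nullptr, *d_any = nullptr;
    cudaMalloc(&d_flags, n * sizeof(int));
    cudaMalloc(&d_any, num_warps * sizeof(int));
    cudaMemcpy(d_flags, flags.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    leader_reports_any<<<1, n>>>(d_flags, d_any);

    cudaMemcpy(host.data(), d_any, num_warps * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_flags);
    cudaFree(d_any);
    return host;
}
```

With 64 flags where only flags[40] is set, the first warp should report 0 and the second 1.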

Please ignore my previous post.
I found this thread very useful: