I was reading a little bit about the __ballot(…) function in programming guide 3.0:
Since it only returns a 32-bit integer, with one bit per thread, that should mean you can have a maximum of 32 threads per warp. Now if Nvidia decides to change the warpSize in the future, that should cause a problem for this function, shouldn’t it? Unless they start using 64-bit integers…
If they are going to increase the warp size, it will be on a device with a different Compute Capability and a different set of intrinsic functions. It might, however, lead to some CUDA code portability problems…
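To make the bit-width argument concrete, here is a rough host-side C++ model of a warp vote. It is only a sketch and not the CUDA intrinsic itself: `ballot_model`, `fits_in_mask` and `kMaskBits` are made-up names for illustration, and the code simply assumes that each lane gets one bit of a 32-bit result, which is how the question reads the return type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed width of the vote result: one bit per lane in a 32-bit integer.
constexpr unsigned kMaskBits = 32;

// Host-side model of a warp vote (hypothetical helper, not a CUDA API).
// Bit i of the result is set when lane i's predicate is non-zero.
std::uint32_t ballot_model(const std::vector<int>& predicates) {
    // A 32-bit mask has room for at most 32 lanes.
    assert(predicates.size() <= kMaskBits);
    std::uint32_t mask = 0;
    for (std::size_t lane = 0; lane < predicates.size(); ++lane) {
        if (predicates[lane] != 0) {
            mask |= (std::uint32_t{1} << lane);
        }
    }
    return mask;
}

// True when every lane of a hypothetical warp would get its own bit.
constexpr bool fits_in_mask(unsigned warp_size) {
    return warp_size <= kMaskBits;
}
```

With this model, a 32-lane warp uses every bit of the result, while `fits_in_mask(64)` is false, which is the portability worry above: code that stores the vote result in a 32-bit variable would need to change if a wider warp ever came with a wider return type.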