Background: I’m using the GPU to scan CRC polynomials for certain properties, and I’m trying to make the code faster. I don’t need any strict synchronization between threads/warps until the end of the kernel. That said, blocks can execute for a long time, and the odds of any given block making it to the end are very low; the expectation is that one thread will eventually find a reason to exit the whole block. Given this scenario:
- Is it possible to query whether any threads have exited without forcing a synchronization stall? Basically: “if any thread has exited, then exit this thread”
- Alternatively, is there any way for one thread to force all of the other threads to exit?
I’ve tried simply putting a volatile boolean in shared memory and having each thread poll it inside its loops. That seems to work fine, but I’m wondering if there is something faster.
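For reference, here is a minimal sketch of the shared-memory flag approach described above. It is not the real kernel: `scan_kernel`, `found_reason_to_exit`, `run_demo`, the `iters_done` output buffer, and the exit condition (thread 7 at iteration 1000) are all placeholders made up for illustration, and the flag is an `int` rather than a `bool` purely for simplicity.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the real per-candidate test: returns true when
// this thread finds a reason for the whole block to give up.
__device__ bool found_reason_to_exit(unsigned tid, unsigned iter)
{
    return tid == 7 && iter == 1000;  // arbitrary demo condition
}

__global__ void scan_kernel(unsigned max_iters, unsigned *iters_done)
{
    // Block-wide abort flag; volatile so that every poll re-reads shared memory.
    __shared__ volatile int abort_flag;

    if (threadIdx.x == 0) abort_flag = 0;
    __syncthreads();  // one-time barrier so the flag is initialized before use

    unsigned iter = 0;
    for (; iter < max_iters; ++iter) {
        if (abort_flag) break;  // some other thread asked the block to stop
        if (found_reason_to_exit(threadIdx.x, iter)) {
            abort_flag = 1;     // ask the rest of the block to stop
            break;
        }
        // ... real per-iteration work would go here ...
    }

    // Record how far each thread got (demo output only).
    iters_done[blockIdx.x * blockDim.x + threadIdx.x] = iter;
}

// Host helper for the demo: launches one block and returns the number of
// iterations completed by thread `tid`, or -1 on a CUDA error.
long long run_demo(unsigned threads, unsigned max_iters, unsigned tid)
{
    unsigned *d_iters = nullptr;
    if (cudaMalloc(&d_iters, threads * sizeof(unsigned)) != cudaSuccess) return -1;

    scan_kernel<<<1, threads>>>(max_iters, d_iters);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    unsigned h_iters = 0;
    cudaError_t err = cudaMemcpy(&h_iters, d_iters + tid, sizeof(unsigned),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_iters);
    return err == cudaSuccess ? static_cast<long long>(h_iters) : -1;
}
```

Calling something like `run_demo(128, 1u << 30, 7)` from `main` launches a single block of 128 threads. In this sketch the only synchronization is the one `__syncthreads()` that initializes the flag; after that, each thread just reads the flag once per loop iteration and leaves as soon as it sees it set.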