Stop all threads

I have n threads per block and m blocks: n×m threads in total. Any one of these threads should stop all the other threads when a certain condition occurs…
If I use a shared variable I can stop all the threads of one block, but not all the threads in the grid…

How can I do this?

I think you will have to do some sort of an atomic sync/halt operation. This won’t be fast.

Are you looking to save processing power when you’ve done enough computation, or are you interested in preserving the data - i.e., not overprocessing when you’ve reached the goal?

I need to recover the password of a zip file. Every thread tests one password; when the password is correct, that thread should stop all the other threads and return the password.

I am interested in recovering the password in less time.

This sounds very interesting. I would like to see some timings if this works.



And why do you need to stop all the threads? Just feed the GPU a batch of passwords, say several million at a time, and set a flag in global memory if the password is found. Then read this flag from the host, and if it is set (i.e. the password was found), just don't issue the next kernel launch. Simple. Don't make things complex where they are not.

BTW, are you not satisfied with speed of our ARCHPR? (Not GPU-accelerated yet) :)

You might consider breaking the password space up into chunks that run for a “reasonable” (however you define the term) length of time on the device. You could then launch a new kernel for each successive chunk and stop launching kernels when one of them returns success. If you do it that way, then not having an early termination within the kernel wastes at most the runtime of a single kernel.
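A minimal sketch of this chunked approach might look like the following. Everything here is an assumption made for illustration: `test_password` is a hypothetical stand-in that just compares a candidate index against a known `secret` (a real cracker would derive a password from the index and verify it against the archive), and `Result`, `test_chunk` and `search_in_chunks` are made-up names. Error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the real zip check: candidate `idx` is "correct"
// when it equals `secret`.
__device__ bool test_password(unsigned long long idx, unsigned long long secret) {
    return idx == secret;
}

// Result record kept in global memory: a found flag plus the winning candidate.
struct Result {
    int found;
    unsigned long long index;
};

// One thread tests one candidate in [base, base + count).
__global__ void test_chunk(unsigned long long base, unsigned long long count,
                           unsigned long long secret, Result *res) {
    unsigned long long i =
        blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
    if (i >= count) return;
    if (test_password(base + i, secret)) {
        res->index = base + i;  // store the answer first
        __threadfence();        // order the answer before the flag
        res->found = 1;         // then raise the flag
    }
}

// Host loop: one kernel launch per chunk, stop launching once the flag is set.
// Returns true and writes the winning index to *out if a match exists in
// [0, total). *launches reports how many kernels were actually issued.
bool search_in_chunks(unsigned long long total, unsigned long long chunk,
                      unsigned long long secret, unsigned long long *out,
                      int *launches) {
    Result h_res = {0, 0};
    Result *d_res = nullptr;
    cudaMalloc(&d_res, sizeof(Result));
    cudaMemcpy(d_res, &h_res, sizeof(Result), cudaMemcpyHostToDevice);

    const int threads = 256;
    *launches = 0;
    for (unsigned long long base = 0; base < total && !h_res.found; base += chunk) {
        unsigned long long count = (total - base < chunk) ? total - base : chunk;
        unsigned int blocks = (unsigned int)((count + threads - 1) / threads);
        test_chunk<<<blocks, threads>>>(base, count, secret, d_res);
        ++*launches;
        // This blocking copy also waits for the kernel above to finish.
        cudaMemcpy(&h_res, d_res, sizeof(Result), cudaMemcpyDeviceToHost);
    }

    cudaFree(d_res);
    if (h_res.found) *out = h_res.index;
    return h_res.found != 0;
}
```

With, say, `total = 1 << 22` and `chunk = 1 << 20`, a secret at index 2500000 falls in the third chunk, so the host stops after three launches and the fourth chunk is never sent to the device. The chunk size is the knob: smaller chunks waste less work after a hit but pay the launch and copy overhead more often.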

Alternatively, if you are using a single thread for each password to be tested, then you will have a large number of blocks in the grid. Have each block check the value of a global flag at the beginning of its execution and exit immediately if it finds it set. Otherwise run its test and set the flag (and store the result) if successful. There is no need for a sync operation because only one password test should ever be successful; thus, there is no harm in a few superfluous blocks running because they missed the flag. The cost for this scheme is the extra global memory load at the beginning of each block, plus the time required for the remaining blocks in the grid to drain after the solution is found.
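As a rough sketch of the per-block flag check, something like the following could work. The names (`check_candidate`, `scan_with_early_exit`, `run_early_exit`) are hypothetical, and the password test is again replaced by a trivial comparison against a known `secret` purely so the example is self-contained. Error checking is left out.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the real password test.
__device__ bool check_candidate(unsigned long long idx, unsigned long long secret) {
    return idx == secret;
}

__global__ void scan_with_early_exit(unsigned long long count,
                                     unsigned long long secret,
                                     volatile int *found,
                                     unsigned long long *winner) {
    // One global load per block: thread 0 reads the flag and shares it, so
    // the whole block takes the same decision.
    __shared__ int stop;
    if (threadIdx.x == 0) stop = *found;
    __syncthreads();
    if (stop) return;  // a solution already exists, drain this block at once

    unsigned long long i =
        blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
    if (i >= count) return;

    if (check_candidate(i, secret)) {
        *winner = i;      // store the result first
        __threadfence();  // order the result before the flag
        *found = 1;       // blocks that start later will see this and exit
    }
}

// Host driver: a single launch over [0, count). Returns true and writes the
// winning index to *out if a match was found.
bool run_early_exit(unsigned long long count, unsigned long long secret,
                    unsigned long long *out) {
    int *d_found = nullptr;
    unsigned long long *d_winner = nullptr;
    cudaMalloc(&d_found, sizeof(int));
    cudaMalloc(&d_winner, sizeof(unsigned long long));
    cudaMemset(d_found, 0, sizeof(int));

    const int threads = 256;
    unsigned int blocks = (unsigned int)((count + threads - 1) / threads);
    scan_with_early_exit<<<blocks, threads>>>(count, secret, d_found, d_winner);

    int h_found = 0;
    cudaMemcpy(&h_found, d_found, sizeof(int), cudaMemcpyDeviceToHost);
    if (h_found)
        cudaMemcpy(out, d_winner, sizeof(*out), cudaMemcpyDeviceToHost);

    cudaFree(d_found);
    cudaFree(d_winner);
    return h_found != 0;
}
```

Blocks that were already running when the flag was raised simply finish their test as usual. That is harmless here, because at most one candidate can match. The flag is read through a `volatile` pointer so the load is not served from a stale cached or register copy.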

You could, of course, combine the two schemes to minimize the time wasted in draining the queued blocks.


I used a flag in global memory.