How can one thread immediately stop the kernel's work?

Lila · August 11, 2013, 12:20pm

Hello,

I have a case where all the threads are searching for some data. As soon as one thread finds it, there is no point in letting the other threads continue. Is it possible for the thread that finds the data to stop the kernel at that moment?

If not, is there a workaround to shorten the kernel's run in this case?

Thanks, Lila

JFSebastian · August 13, 2013, 7:11pm

Once a kernel is launched, you cannot physically “kill” its other running threads, if that is what you mean. A workaround to keep the now-useless threads from continuing to compute is to define a flag and have each thread check it before doing any further work. Of course, this can introduce branch divergence.

seibert · August 13, 2013, 7:30pm

The global flag approach is also the only option I’m aware of.

SPWorley · August 13, 2013, 10:24pm

While not really recommended, you can use the trap instruction in PTX to abort a kernel.
I haven’t used this myself, and I’m not sure whether it is inefficient or has side effects such as aborting the whole stream queue.

asm("trap;");

allanmac · August 13, 2013, 11:06pm

[Image: animated GIF, not preserved]

seibert · August 13, 2013, 11:41pm

I think this is the first GIF meme in the history of the CUDA forums. :)

sBc-Random · August 14, 2013, 6:04am

One way to try it:

volatile int *flag;   // the value pointed to by flag must be 0 at the start of execution
for (conds) {
    if (flag[0] == 0) {
        lookfordata();
        // atomicAdd does not accept a volatile pointer, hence the cast
        if (founddata) atomicAdd((int *)flag, 1);
    }
    else return;
}
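For anyone who wants to try the flag idea end to end, here is a minimal sketch that fills in the pseudocode above. The kernel name `find_any`, the host helper `search_on_gpu`, the launch configuration, and the test data are all made up for illustration and are not from the thread. It assumes a linear search for an integer and uses atomicCAS instead of atomicAdd so that only one thread records its index.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical search kernel: each thread scans a strided slice of `data`.
__global__ void find_any(const int *data, int n, int target,
                         volatile int *found_flag, int *found_index)
{
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        // volatile forces a fresh read of the flag on every iteration.
        if (*found_flag != 0) return;
        if (data[i] == target) {
            // Only the thread that flips the flag from 0 to 1 stores its index.
            if (atomicCAS((int *)found_flag, 0, 1) == 0) *found_index = i;
            return;
        }
    }
}

// Hypothetical host helper: returns a matching index, or -1 if none was found.
// Error checking is omitted to keep the sketch short.
int search_on_gpu(const int *host_data, int n, int target)
{
    int *d_data, *d_flag, *d_index;
    int index = -1;
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMalloc(&d_flag, sizeof(int));
    cudaMalloc(&d_index, sizeof(int));
    cudaMemcpy(d_data, host_data, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_flag, 0, sizeof(int));   // flag must start at 0
    cudaMemcpy(d_index, &index, sizeof(int), cudaMemcpyHostToDevice);

    find_any<<<64, 256>>>(d_data, n, target, d_flag, d_index);
    // This copy runs on the default stream, so it waits for the kernel.
    cudaMemcpy(&index, d_index, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_flag);
    cudaFree(d_index);
    return index;
}

int main()
{
    const int n = 1 << 20;
    int *data = new int[n];
    for (int i = 0; i < n; ++i) data[i] = i;   // every value is unique
    printf("found at index %d\n", search_on_gpu(data, n, 123456));
    delete[] data;
    return 0;
}
```

Note that threads only stop at their next flag check, so the kernel winds down quickly rather than instantly, and blocks that have not been scheduled yet still start up before exiting on their first check.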

SPWorley · August 14, 2013, 6:11am

sBc, you’d probably need the volatile keyword in the flag definition.

sBc-Random · August 14, 2013, 7:57am

thanks :)

Jimmy_Pettersson · August 14, 2013, 9:24am

:D

shares with colleagues

JFSebastian · August 14, 2013, 8:48pm

Perhaps

asm("trap;");

is not really what the user meant. My understanding is that he/she does not want to abort the entire kernel, but only to “kill” the threads that have found the datum. On the other hand, the code suggested by sBc-Random puts the flag idea into practice very well.

seibert · August 14, 2013, 8:57pm

JFSebastian:

Perhaps
asm("trap;");
is not really what the user meant. My understanding is that he/she does not want to abort the entire kernel, but only to “kill” the threads that have found the datum. On the other hand, the code suggested by sBc-Random puts the flag idea into practice very well.

My reading of the question is that the goal is to abort the whole kernel. Perhaps the original poster can clarify? :)

JFSebastian · August 14, 2013, 9:12pm

Yes, seibert. “Stopping the kernel run by a thread” is somewhat ambiguous. But perhaps the poster does not need to clarify, since solutions have been found for both possible interpretations of his/her post :-)

I wonder if there is a “higher level” alternative to the PTX instruction to abort an entire kernel.
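On the higher-level question: CUDA C provides the `__trap()` device function, which aborts the kernel without inline PTX. Below is a rough sketch of how it might be used. The kernel `abort_when_found` and the helper `kernel_was_aborted` are hypothetical names, and the cleanup through `cudaDeviceReset()` rests on the assumption that the error left behind by a trap is sticky and makes the context unusable. The exact error code is not checked because it may vary between CUDA versions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: any thread that sees `target` aborts the whole kernel.
__global__ void abort_when_found(const int *data, int n, int target)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] == target) {
        __trap();   // kernel execution is aborted and the host gets an error
    }
}

// Hypothetical host helper: returns true if the kernel ended with an error.
bool kernel_was_aborted(const int *host_data, int n, int target)
{
    int *d_data;
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMemcpy(d_data, host_data, n * sizeof(int), cudaMemcpyHostToDevice);

    abort_when_found<<<(n + 255) / 256, 256>>>(d_data, n, target);
    cudaError_t err = cudaDeviceSynchronize();

    if (err != cudaSuccess) {
        printf("kernel aborted: %s\n", cudaGetErrorString(err));
        // Assumption: the context is no longer usable, so tear it down.
        // This also releases d_data.
        cudaDeviceReset();
        return true;
    }
    cudaFree(d_data);
    return false;
}

int main()
{
    int data[4] = {0, 1, 2, 3};
    printf("aborted: %d\n", kernel_was_aborted(data, 4, 2));
    return 0;
}
```

The abort is reported to the host as an error, not as a normal return, so this is a poor way to hand back a search result. For an early exit that still returns the found data, the flag approach looks like the safer choice.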

JFSebastian · August 17, 2013, 8:45pm

I just found this thread

cuda - how can a __global__ function RETURN a value or BREAK out like C/C++ does - Stack Overflow

Perhaps it could be of interest.