How to abort a kernel from inside? I'd kindly ask Mark Harris to take a look.

Say one thread has met the condition to abort the calculation: all other threads, in all blocks of the grid, should then stop as well. How can this be done?

Here is a link to a topic with a similar question, with details:
http://forums.nvidia.com/index.php?showtopic=28606

However, no concrete answer has been given. Mr. Mark Harris even promised to give some pseudocode, but for some reason he did not.

The person who asked the question posted a number of possible answers, with reasonable doubts about whether they are efficient. It would be great to know a solution that won't waste performance.

Thanks in advance.

Have a shared flag:

    __shared__ bool need_to_stop;
    if (threadIdx.x == 0)
        need_to_stop = false;
    __syncthreads();

If some thread meets the condition, it sets need_to_stop = true;

Then, in the for loop:

    if (need_to_stop)
        break;

__shared__ bool need_to_stop;

__global__ void kernel()
{
    if (threadIdx.x == 0)
        need_to_stop = false;
    __syncthreads();

    while (have_work_to_do && !need_to_stop)
    {
        ............
        if (all_is_done)
            need_to_stop = true;
    }
}

Something like this?
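A `__shared__` flag is only visible inside one block, so on its own it cannot stop the other blocks of the grid. One possible way to cover the whole grid is to keep the flag in global memory and have every thread poll it. The sketch below is only an illustration under assumptions: `g_need_to_stop`, `stop_requested`, `search_kernel` and `run_search_demo` are made-up names, and the "work" is a simple search for a value that is assumed to occur exactly once.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Grid-wide stop flag in global memory (hypothetical name).
__device__ int g_need_to_stop = 0;

// Read the flag through a volatile pointer so that each poll really goes to
// memory instead of reusing a value cached in a register.
__device__ bool stop_requested()
{
    return *((volatile int *)&g_need_to_stop) != 0;
}

// Hypothetical workload: grid-stride search for `target` in `data`.
__global__ void search_kernel(const int *data, int n, int target, int *found_index)
{
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        if (stop_requested())        // some thread, possibly in another block, asked to stop
            return;
        if (data[i] == target) {
            *found_index = i;
            __threadfence();         // publish the result before raising the flag
            atomicExch(&g_need_to_stop, 1);
            return;
        }
    }
}

// Host-side driver: returns the index that was found, or -1.
int run_search_demo()
{
    const int n = 1 << 20;
    const int target = 42;
    const int target_pos = 12345;    // the only position holding `target`

    std::vector<int> h_data(n, 0);
    h_data[target_pos] = target;

    int *d_data = nullptr;
    int *d_found = nullptr;
    int found = -1;
    int zero = 0;

    cudaMalloc(&d_data, n * sizeof(int));
    cudaMalloc(&d_found, sizeof(int));
    cudaMemcpy(d_data, h_data.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_found, &found, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(g_need_to_stop, &zero, sizeof(int));  // reset the flag before each launch

    search_kernel<<<64, 256>>>(d_data, n, target, d_found);
    cudaDeviceSynchronize();

    cudaMemcpy(&found, d_found, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_found);
    return found;
}
```

Calling `run_search_demo()` from `main` runs the sketch. Note that this is a cooperative stop, not a hard abort: each thread only notices the flag at its next poll, and blocks that have not been scheduled yet still start and then return at their first check. The cost is one extra global memory read per loop iteration, so polling less often (for example every few iterations) is a possible trade-off.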

Probably, the kernel panic code should start writing everywhere to global memory to protest its execution. The warps must be confused with conditional syncthreads. Texture fetches should always be out of bounds. Local variables should hold bigger and bigger numbers to increase the register pressure until the electrons inside the ALUs beg the warp scheduler not to execute the code… The compiler that compiled this code should be notified of this execution until it auto-lodges a formal complaint with Microsoft and NVIDIA.

:-)

Hehe :)

Actually, executing an invalid instruction, or reading from a nonexistent global memory address, is a very quick way to terminate kernel execution. The problem is that the handling of the error isn't very clean, and you get an annoying error message in the log.

Also, the PTX instructions TRAP and BRKPT in the manual seem to have been designed for a similar purpose. But they haven't been exposed in the language. Perhaps you could insert them into the PTX yourself?
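For reference, later CUDA toolkits do make the trap reachable from CUDA C, both through the `__trap()` device function and through inline PTX (`asm("trap;");`). A minimal sketch of what a hard abort could look like follows; `trap_kernel`, `trigger_thread` and `run_trap_demo` are made-up names, and the abort condition is just a placeholder.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: the thread whose global index equals `trigger_thread`
// stands in for "the thread that met the abort condition".
__global__ void trap_kernel(int trigger_thread)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i == trigger_thread)
        __trap();            // inline PTX alternative: asm("trap;");
}

// Launches the kernel and returns whatever status the synchronization reports.
cudaError_t run_trap_demo()
{
    trap_kernel<<<4, 64>>>(100);
    return cudaDeviceSynchronize();   // expected to report a failure after the trap
}
```

The host sees the abort as an error from `cudaDeviceSynchronize()`, which `cudaGetErrorString()` can turn into a message. This is the "not very clean" handling mentioned above: as far as I know the error is sticky, so the CUDA context should be treated as unusable afterwards, and results written by the kernel before the trap should not be relied on. For a stop that still needs to return data, a cooperative flag is probably the safer route.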