but I bet that won’t be very efficient… Is there a built-in function to force the kernel to stop executing?
Any idea how to implement this better, please? Do I just need to divide the work into smaller blocks and test the result block by block instead (as is done in the 1dhaar example)?
Could you include a basic parallelized raycaster in the SDK examples?
Use a shared memory variable for “hitFound”. Use thread 0 to initialize it to false.
Threads should ONLY write to it if they find a hit; if they do, they write true. After the write, call __syncthreads() (every thread in the block must reach it, not only the ones that wrote). Then have all threads check the variable and return if it is true, or continue if it is not.
Even though multiple threads will be writing the variable at once, the only possible value that they are writing is “true”, so if it is ever true, you know all threads should quit.
Let me know if this isn’t clear and I will write some pseudocode.
Shared data lives no longer than a launch (strictly, no longer than its thread block), hence you cannot use one launch to initialize data used in a subsequent launch. I think what Mark means is something like:
__shared__ int x;
__global__ void f(void)
{
    // thread 0 initializes the shared variable for the whole block
    if (threadIdx.x == 0) {
        x = 0;
    }
    __syncthreads();
    // from here on, every thread in the block can access x
}
But will that be efficient? The threads in the block will keep executing the “if ( 0==threadIdx.x )” and the “if ( !hitFound )” for all 6M triangles… that will just skip some math in the inner ray-triangle test branch… Wouldn’t a built-in instruction to abort the currently executing kernel (abort all the threads in the block, abort all the blocks) be more efficient?
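For what it's worth, here is a minimal sketch of the shared-flag idea described above, written so that the block leaves the triangle loop as soon as any thread reports a hit, rather than only skipping the intersection math. Everything specific in it is an assumption made for illustration: `rayHitsTriangle`, `anyHitKernel`, `runAnyHit`, the single-block launch with 256 threads, and the dummy "triangles" (plain ints that are nonzero on a hit) are all hypothetical and stand in for a real raycaster. Error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a real ray-triangle test: a "triangle" here is
// just an int that is nonzero when the ray hits it.
__device__ bool rayHitsTriangle(const int* tris, int i)
{
    return tris[i] != 0;
}

// One block tests one ray against all triangles, blockDim.x at a time.
// out[0] receives the hit flag, out[1] the number of chunks actually tested.
__global__ void anyHitKernel(const int* tris, int numTris, int* out)
{
    __shared__ int hitFound;              // one flag per block
    if (threadIdx.x == 0) hitFound = 0;   // thread 0 initializes it
    __syncthreads();

    int chunks = 0;
    bool done = false;
    // "done" is identical in every thread, so all threads run the same number
    // of iterations and reach every __syncthreads() together.
    for (int base = 0; base < numTris && !done; base += blockDim.x) {
        int i = base + threadIdx.x;
        if (i < numTris && rayHitsTriangle(tris, i))
            hitFound = 1;                 // threads only ever write "true"
        __syncthreads();                  // make the write visible to the block
        done = (hitFound != 0);           // every thread reads the same value
        __syncthreads();                  // nobody writes again until all have read
        ++chunks;
    }
    if (threadIdx.x == 0) {
        out[0] = done ? 1 : 0;
        out[1] = chunks;
    }
}

// Hypothetical host helper: builds numTris dummy triangles with at most one
// hit at hitIndex (pass -1 for no hit), runs the kernel, returns the hit flag.
int runAnyHit(int numTris, int hitIndex, int* chunksTested)
{
    int* tris = nullptr;
    int* out = nullptr;
    cudaMallocManaged(&tris, numTris * sizeof(int));
    cudaMallocManaged(&out, 2 * sizeof(int));
    for (int i = 0; i < numTris; ++i) tris[i] = (i == hitIndex) ? 1 : 0;
    out[0] = 0;
    out[1] = 0;

    anyHitKernel<<<1, 256>>>(tris, numTris, out);  // single block, assumed size
    cudaDeviceSynchronize();

    int hit = out[0];
    *chunksTested = out[1];
    cudaFree(tris);
    cudaFree(out);
    return hit;
}

int main()
{
    int chunks = 0;
    int hit = runAnyHit(1 << 20, 5, &chunks);
    printf("hit=%d chunks=%d\n", hit, chunks);
    return 0;
}
```

With the only hit placed at index 5, the block should stop after the first chunk of 256 triangles instead of walking all of them. The second barrier is there so that a fast thread cannot set the flag for the next chunk while a slower one is still reading it for the current chunk. `__syncthreads_or()` could probably replace the flag and both barriers on hardware that supports it. Note that this only stops one block; other blocks of the same launch keep running unless they check something shared across blocks, such as a flag in global memory.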