I am having a problem. My kernel has a loop that runs up to, say, 20 times. In most cases I need to break out well before 20 iterations, so I do not need to execute the rest. But when I use a break statement, all the values loaded into shared memory become wrong, so I cannot use break.
My question is: what happens when there is a break statement inside a loop in a CUDA kernel? I understand that it will cause divergence, but why should that affect the correctness of the final results?
Please help me understand this.