Controlling the order of block execution


I get different results from a CUDA kernel in emulation mode and in normal mode. I think it’s because of the order of block execution. Is there any way to control that?

Thanks in advance!


Yes - fix your program so that it doesn’t depend on the order of block execution. The programming guide explicitly says this order is undefined. You can do some inter-block synchronisation using atomics, but it’s not recommended, and it makes deadlocks very easy to cause: a block that spins waiting on another block can wait forever if that other block has not been scheduled yet.
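
To illustrate what "doesn’t care about the order" can look like, here is a minimal sketch, not your actual kernel. It assumes the blocks only need to combine their results into one total. The names blockSum and sumOnGpu are made up for this example, and it assumes a block size of 256 and a device that supports atomicAdd on integers in global memory (compute capability 1.1 or later). Each block reduces its own slice of the input and then adds a single partial sum to the total, so no block ever reads what another block wrote.

```cuda
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256   // assumption for this sketch

// Hypothetical kernel: each block sums its own slice of `in` in shared
// memory, then thread 0 adds the block's partial sum to *total.
// No block reads another block's output, so block order does not matter.
__global__ void blockSum(const int *in, int n, int *total)
{
    __shared__ int partial[THREADS_PER_BLOCK];

    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    // Out-of-range threads contribute 0 so the reduction stays uniform.
    partial[tid] = (i < n) ? in[i] : 0;
    __syncthreads();

    // Tree reduction within the block (blockDim.x must be a power of two).
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // One atomic per block; integer addition gives the same total
    // regardless of the order in which blocks reach this line.
    if (tid == 0)
        atomicAdd(total, partial[0]);
}

// Hypothetical host wrapper: copies the input to the device, runs the
// kernel, and returns the total. Error checking is omitted for brevity.
int sumOnGpu(const int *hostIn, int n)
{
    if (n <= 0)
        return 0;

    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    int *devIn = 0, *devTotal = 0;
    int result = 0;

    cudaMalloc((void **)&devIn, n * sizeof(int));
    cudaMalloc((void **)&devTotal, sizeof(int));
    cudaMemcpy(devIn, hostIn, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(devTotal, 0, sizeof(int));

    blockSum<<<blocks, THREADS_PER_BLOCK>>>(devIn, n, devTotal);

    // This copy waits for the kernel to finish before reading the total.
    cudaMemcpy(&result, devTotal, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(devIn);
    cudaFree(devTotal);
    return result;
}
```

Calling sumOnGpu on an array holding 1 to 1000 should return 500500 whatever order the blocks run in. Note that this only holds exactly for integers: with floats the additions are applied in a different order from run to run, so the last bits of the result can differ. If one stage really needs the finished output of all blocks from a previous stage, the usual approach is to split the work into two kernel launches rather than synchronise between blocks inside one kernel.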