Controlling the order of block execution


I get different results from a CUDA kernel in emulation mode and in normal mode. I think it's because of the order in which the blocks are executed. Is there any way to control that?

Thanks in advance!


From a user's point of view, the order in which blocks execute is undefined, so you should treat it as non-deterministic. If the order absolutely matters to you, your only alternatives are to serialize the order-critical parts of your code with atomic memory operations, or to split the algorithm so that the order-critical part runs as a series of individual kernel launches, in an order you control from host CPU code.
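To make the two alternatives concrete, here is a rough sketch, not a drop-in solution. The names `accumulate_atomic`, `ordered_step`, `run_atomic` and `run_ordered` are made up for illustration, the arithmetic is a stand-in for your real order-critical code, and error checking is left out for brevity. Note that an atomic only guarantees that no update is lost; it does not fix the order of the updates, so the first variant gives a stable result only when the combined operation does not depend on order (integer addition here).

```cuda
#include <cuda_runtime.h>

// Alternative 1 (hypothetical example): every block adds its contribution
// with atomicAdd. The atomic serializes the read-modify-write, so no update
// is lost whatever order the blocks run in. Integer addition does not depend
// on order, so the final sum is the same on every run.
__global__ void accumulate_atomic(int *total)
{
    if (threadIdx.x == 0) {
        atomicAdd(total, (int)blockIdx.x + 1);
    }
}

// Alternative 2 (hypothetical example): one order-critical step per launch.
// Kernels issued to the same stream run in issue order, so the host loop
// below decides the order of the steps.
__global__ void ordered_step(int *value, int step)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *value = *value * 2 + step;   // result depends on the step order
    }
}

// Host helper: sums 1..num_blocks with one atomicAdd per block.
int run_atomic(int num_blocks)
{
    int *d_total = nullptr;
    int h_total = 0;
    cudaMalloc(&d_total, sizeof(int));
    cudaMemset(d_total, 0, sizeof(int));
    accumulate_atomic<<<num_blocks, 32>>>(d_total);
    // A blocking cudaMemcpy waits for the kernel on the default stream.
    cudaMemcpy(&h_total, d_total, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_total);
    return h_total;
}

// Host helper: applies steps 0..num_steps-1 strictly in that order.
int run_ordered(int num_steps)
{
    int *d_value = nullptr;
    int h_value = 0;
    cudaMalloc(&d_value, sizeof(int));
    cudaMemset(d_value, 0, sizeof(int));
    for (int step = 0; step < num_steps; ++step) {
        ordered_step<<<1, 1>>>(d_value, step);
    }
    cudaMemcpy(&h_value, d_value, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_value);
    return h_value;
}
```

With these definitions, `run_atomic(64)` should return 2080 (the sum 1 + 2 + ... + 64) regardless of block scheduling, and `run_ordered(4)` should return 11, because the steps are applied as 0, 1, 2, 3. The second variant pays a launch overhead per step, so it only makes sense when the order-critical part is a small fraction of the work.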

Yes, atomic functions will help. Thanks!