According to the CUDA documentation, if I write a=1; b=2; in device code, other threads may observe b=2 before a=1. Therefore, we need to insert a __threadfence_XXX() call between the two writes if we want to make sure the ordering is correct.
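For reference, the device-side pattern described above can be sketched as follows. This is only a minimal illustration, not code from the documentation: the kernel name, the `violations` counter, and the launch configuration are made up for the example, and `__threadfence()` stands in for whichever `__threadfence_XXX()` variant matches the required scope. One thread writes `a` and then `b` with a fence in between; every other thread reads `b` and then `a` and counts the cases where it sees the new `b` together with the old `a`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// volatile so that reads and writes really go to memory instead of registers.
__device__ volatile int a = 0;
__device__ volatile int b = 0;
__device__ int violations = 0;  // hypothetical counter, for illustration only

__global__ void fence_demo()
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        // Writer: the fence orders the write to a before the write to b
        // as observed by other threads on the device.
        a = 1;
        __threadfence();
        b = 2;
    } else {
        // Reader: read b first, then a, with a fence between the two reads.
        int seen_b = b;
        __threadfence();
        int seen_a = a;
        if (seen_b == 2 && seen_a != 1) {
            atomicAdd(&violations, 1);  // new b observed together with old a
        }
    }
}

int main()
{
    fence_demo<<<64, 128>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int h_violations = -1;
    cudaMemcpyFromSymbol(&h_violations, violations, sizeof(int));
    // With the fences in place this count is expected to stay at 0.
    std::printf("violations = %d\n", h_violations);
    return 0;
}
```

Most reader threads will simply see `b == 0`, because they run before the writer does, so the sketch shows where the fence goes rather than serving as a stress test.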
Now I want to do the same thing in host code. I allocate managed memory for a and b, and in the host code I set *a=1; *b=2;. Is it guaranteed that every device thread observes the write *a=1 before the write *b=2 (that is, no thread can see *b=2 while *a is still unset)? If not, how can I insert a fence?
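To make the scenario concrete, here is a minimal sketch of the host-side setup. It rests on several assumptions that are not part of the question. It does not insert a standalone fence: it changes `b` into a system-scope `cuda::atomic` from libcu++ and uses a release store on the host paired with an acquire load on the device, which is one possible way to get the ordering. The kernel name, the polling bound, and the `violations` counter are hypothetical. It also assumes a device that reports `cudaDevAttrConcurrentManagedAccess`, since the host touches managed memory while the kernel is running.

```cuda
#include <cstdio>
#include <new>
#include <cuda_runtime.h>
#include <cuda/atomic>

// System scope covers both host and device threads.
using sys_atomic_int = cuda::atomic<int, cuda::thread_scope_system>;

__global__ void observer(int* a, sys_atomic_int* b, int* violations)
{
    // Poll b a bounded number of times so the kernel always terminates.
    for (int i = 0; i < 1000000; ++i) {
        if (b->load(cuda::std::memory_order_acquire) == 2) {
            // The acquire load pairs with the host's release store,
            // so the earlier host write to *a should be visible here.
            if (*a != 1) {
                atomicAdd(violations, 1);
            }
            return;
        }
    }
}

int main()
{
    int dev = 0, concurrent = 0;
    cudaGetDevice(&dev);
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);
    if (!concurrent) {
        std::printf("concurrent managed access not supported; skipping\n");
        return 0;
    }

    int* a = nullptr;
    int* violations = nullptr;
    sys_atomic_int* b = nullptr;
    cudaMallocManaged(&a, sizeof(int));
    cudaMallocManaged(&violations, sizeof(int));
    cudaMallocManaged(&b, sizeof(sys_atomic_int));
    *a = 0;
    *violations = 0;
    new (b) sys_atomic_int(0);  // construct the atomic in managed memory

    observer<<<64, 128>>>(a, b, violations);

    // Host writes while the kernel is running.
    *a = 1;                                       // plain write
    b->store(2, cuda::std::memory_order_release); // publishes the write to *a

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("violations = %d\n", *violations);

    cudaFree(a);
    cudaFree(b);
    cudaFree(violations);
    return 0;
}
```

Whether plain writes plus a host fence such as `std::atomic_thread_fence` give the same guarantee is exactly what the question asks, and this sketch does not settle it. It only shows a variant in which the ordering is carried by the atomic itself.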