I thought I understood the function of __threadfence_block() but the documentation is throwing me for a loop.
Specifically, section B.5 states, for the following code where thread 1 calls writeXY and thread 2 calls readXY (both in the same block):

    __device__ volatile int X = 1, Y = 2;

    __device__ void writeXY()
    {
        X = 10;
        Y = 20;
    }

    __device__ void readXY()
    {
        int A = X;
        int B = Y;
    }

that for thread 2, “A will always be 10 if B is 20”. But I don’t see how this is guaranteed, given my understanding of how fences work. As I understand it, a fence between the two writes and a fence between the two reads only ensure that Y = 20 becomes visible after X = 10 and that the read int B = Y happens after the read int A = X. That does not rule out the following order:

    thread 2: int A = X;    (reads 1)
    thread 1: X = 10;
    thread 1: Y = 20;
    thread 2: int B = Y;    (reads 20)

leaving A = 1, B = 20. All threads still observe that Y = 20 follows X = 10, and the read int B = Y does indeed follow int A = X. What am I missing?
I tried to reproduce this by actually running the example, but I got a deterministic result even with no __threadfence_block() calls inserted.
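
For reference, here is a minimal sketch of the kind of test harness I mean. The kernel names `trial` and `reset`, the choice of threads 0 and 32 (different warps of the same block, assuming a warp size of 32), the trial count, and the placement of the fences between the two writes and between the two reads are all my own assumptions, not something taken from the documentation. With one launch per trial the timing window is tiny, so never seeing A=1, B=20 here would not prove that the outcome is impossible.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ volatile int X = 1, Y = 2;

__device__ void writeXY()
{
    X = 10;
    __threadfence_block();   // assumed placement; delete to test the no-fence variant
    Y = 20;
}

__device__ void readXY(int *out)
{
    int A = X;
    __threadfence_block();   // assumed placement; delete to test the no-fence variant
    int B = Y;
    out[0] = A;
    out[1] = B;
}

// Restore the initial values before every trial.
__global__ void reset()
{
    X = 1;
    Y = 2;
}

// Thread 0 writes and thread 32 reads: same block, different warps
// (assuming a warp size of 32), so they are not forced into lockstep.
__global__ void trial(int *out)
{
    if (threadIdx.x == 0)
        writeXY();
    else if (threadIdx.x == 32)
        readXY(out);
}

int main()
{
    int *dOut = nullptr;
    if (cudaMalloc(&dOut, 2 * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // counts[0]: A=1,B=2   counts[1]: A=10,B=2
    // counts[2]: A=1,B=20  counts[3]: A=10,B=20
    int counts[4] = {0, 0, 0, 0};
    const int trials = 10000;   // arbitrary number of repetitions

    for (int t = 0; t < trials; ++t) {
        reset<<<1, 1>>>();
        trial<<<1, 64>>>(dOut);

        int h[2];
        // Blocking copy on the default stream: waits for both kernels above.
        if (cudaMemcpy(h, dOut, sizeof(h), cudaMemcpyDeviceToHost) != cudaSuccess) {
            fprintf(stderr, "trial %d failed\n", t);
            cudaFree(dOut);
            return 1;
        }
        int idx = (h[0] == 10 ? 1 : 0) + (h[1] == 20 ? 2 : 0);
        ++counts[idx];
    }

    printf("A=1,  B=2  : %d\n", counts[0]);
    printf("A=10, B=2  : %d\n", counts[1]);
    printf("A=1,  B=20 : %d\n", counts[2]);
    printf("A=10, B=20 : %d\n", counts[3]);

    cudaFree(dOut);
    return 0;
}
```

If both functions are called from threads in the same warp instead, I would expect the result to look deterministic regardless of the fences, which may be what I ran into.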
Thanks for your help!