Hi everyone! I am implementing the StreamScan paper. TL;DR: each block (using one thread) spins on an address in global memory until that address is updated with the value from the previous block.
I implemented the algorithm and it seems to run fine: all the tests pass every time, although I have not stress-tested it properly yet. While discussing it with a colleague, I started to wonder: is there any chance that the thread reading from global memory could read a partial value, because the read happens at the same time the other thread is writing it?
Let me describe the algorithm more precisely. I have an intermediate array with one element per block, and block N spins on index N-1 of this array. The array is initialized with a specific sentinel value; I am using the maximum value of a uint32. When the value changes, so it is no longer maxInt, I read that value, return it, and the spin loop ends. Is there any chance that I get a partial read if the read and the write happen at the same time? Can anyone elaborate a little on this?
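To make the description concrete, here is a minimal sketch of the wait loop, not my actual code. The names `chainedSum`, `runChainedSum`, `flags`, `ticket` and `kNotReady` are made up for this post. It assumes one thread per block, a running sum that never reaches the sentinel, and a `volatile` pointer so each iteration really re-reads global memory. It also takes the block index from an atomic ticket rather than `blockIdx.x`, so a waiting block always has a predecessor that has already started.

```cuda
#include <cassert>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Sentinel meaning "the previous block has not published its value yet".
constexpr uint32_t kNotReady = 0xFFFFFFFFu;

// One thread per block: wait for the predecessor's running total,
// add this block's input, then publish the new total for the next block.
__global__ void chainedSum(const uint32_t* in, volatile uint32_t* flags,
                           uint32_t* out, unsigned int* ticket)
{
    // Logical block id = order in which blocks actually start running.
    const unsigned int bid = atomicAdd(ticket, 1u);

    uint32_t prev = 0;
    if (bid > 0) {
        // Spin until slot bid-1 no longer holds the sentinel.
        do {
            prev = flags[bid - 1];
        } while (prev == kNotReady);
    }

    const uint32_t total = prev + in[bid];
    out[bid] = total;
    flags[bid] = total;  // single aligned 32-bit store that releases the next block
}

// Hypothetical host helper: runs the kernel with one block per input element.
std::vector<uint32_t> runChainedSum(const std::vector<uint32_t>& h_in)
{
    const size_t n = h_in.size();
    assert(n > 0);
    const size_t bytes = n * sizeof(uint32_t);

    uint32_t *d_in = nullptr, *d_flags = nullptr, *d_out = nullptr;
    unsigned int* d_ticket = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_flags, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMalloc(&d_ticket, sizeof(unsigned int));

    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_flags, 0xFF, bytes);  // every byte 0xFF, so every word is kNotReady
    cudaMemset(d_ticket, 0, sizeof(unsigned int));

    chainedSum<<<static_cast<unsigned int>(n), 1>>>(d_in, d_flags, d_out, d_ticket);

    std::vector<uint32_t> h_out(n);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_flags);
    cudaFree(d_out);
    cudaFree(d_ticket);
    return h_out;
}
```

Calling `runChainedSum({1, 2, 3, 4})` should give the running sums of the input. The part my question is about is the `flags[bid - 1]` load racing with the `flags[bid] = total` store of the previous block.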
I am aware of race conditions when writing, so I know I might not get the correct result. But if two threads write to the same address in global memory at exactly the same time, say one writes the value A and the other the value B, do I always get either A or B, or is there a chance I get a bit soup of half A and half B?
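One way I could probe this empirically is sketched below. The names `tearProbe`, `countTornReads`, `kA` and `kB` are made up for this post. It assumes a single naturally aligned 32-bit word: two blocks keep storing complementary bit patterns to it while a third block reads it and counts any value that is neither pattern. A count of zero would only mean no torn value was observed in that run, not that one is impossible.

```cuda
#include <cassert>
#include <cstdint>
#include <cuda_runtime.h>

// Complementary bit patterns, so any mix of the two would be easy to spot.
constexpr uint32_t kA = 0xAAAAAAAAu;
constexpr uint32_t kB = 0x55555555u;

// Blocks 0 and 1 write kA / kB repeatedly; block 2 reads and counts
// every value that is neither kA nor kB.
__global__ void tearProbe(volatile uint32_t* word, unsigned int* torn, int iters)
{
    if (blockIdx.x < 2) {
        uint32_t mine = kA;
        if (blockIdx.x == 1) mine = kB;
        for (int i = 0; i < iters; ++i) {
            *word = mine;
        }
    } else {
        unsigned int bad = 0;
        for (int i = 0; i < iters; ++i) {
            const uint32_t v = *word;
            if (v != kA && v != kB) ++bad;
        }
        atomicAdd(torn, bad);
    }
}

// Hypothetical host helper: returns how many unexpected values the reader saw.
unsigned int countTornReads(int iters)
{
    assert(iters > 0);
    uint32_t* d_word = nullptr;
    unsigned int* d_torn = nullptr;
    cudaMalloc(&d_word, sizeof(uint32_t));
    cudaMalloc(&d_torn, sizeof(unsigned int));

    const uint32_t init = kA;  // start from a valid pattern
    cudaMemcpy(d_word, &init, sizeof(uint32_t), cudaMemcpyHostToDevice);
    cudaMemset(d_torn, 0, sizeof(unsigned int));

    tearProbe<<<3, 1>>>(d_word, d_torn, iters);

    unsigned int h_torn = 0;
    cudaMemcpy(&h_torn, d_torn, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(d_word);
    cudaFree(d_torn);
    return h_torn;
}
```

For example, `countTornReads(1 << 20)` returns the number of reads that saw something other than A or B. None of the blocks wait on each other, so the probe cannot hang even if the blocks do not overlap in time.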