I’ve got an issue I’ve been banging my head against for a couple of days, and I’m reaching the point of suspecting that CUDA itself is at fault.
The reason is that I printed each thread’s ID, plus the value of a shared variable, directly after a __syncthreads() call, and one thread sees a different value from the others.
How can this be possible if __syncthreads() is supposed to make the latest value in shared memory visible to all threads?
I’ve also tried declaring the shared boolean as ‘volatile’, but without success.
Also note that the “incorrect use of __syncthreads()” error occurs because thread 9 enters a branch containing a __syncthreads() call, since its dataToLoad value is 1.
Here’s the code and its output:
__shared__ bool dataToLoad;
Thread: 0 dataLoad:0
Thread: 1 dataLoad:0
Thread: 2 dataLoad:0
Thread: 3 dataLoad:0
Thread: 4 dataLoad:0
Thread: 5 dataLoad:0
Thread: 6 dataLoad:0
Thread: 7 dataLoad:0
Thread: 8 dataLoad:0
Thread: 9 dataLoad:1
Error:incorrect use of __syncthreads()
From the NVIDIA programming guide, p. 21:
“Only after the execution of a __syncthreads() (Section 4.4.2) are writes to
shared variables guaranteed to be visible by other threads.”
As far as I can see, this should be impossible; I don’t see how any code or error on my part could cause the above output. A shared variable should have the same value across all threads after a __syncthreads() call…
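The expectation can be written down as a minimal sketch. This is not the actual kernel (which can't be posted): flagKernel, seen and countDisagreeingThreads are made-up names for illustration, and it assumes a single block in which only thread 0 writes the flag and the barrier sits outside any branch.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not the original code): thread 0 sets a block-wide
// flag, then every thread records the value it observes after the barrier.
__global__ void flagKernel(int *seen)
{
    __shared__ bool dataToLoad;

    if (threadIdx.x == 0)
        dataToLoad = true;      // single writer, so no competing writes

    __syncthreads();            // reached by every thread, outside any branch

    seen[threadIdx.x] = dataToLoad ? 1 : 0;
}

// Launches one block of numThreads threads (assumed <= 1024) and returns how
// many threads observed a value different from thread 0, or -1 on a CUDA error.
int countDisagreeingThreads(int numThreads)
{
    int *dSeen = nullptr;
    if (cudaMalloc(&dSeen, numThreads * sizeof(int)) != cudaSuccess)
        return -1;

    flagKernel<<<1, numThreads>>>(dSeen);

    std::vector<int> hSeen(numThreads, -1);
    // cudaMemcpy waits for the kernel and reports any launch/execution error.
    cudaError_t err = cudaMemcpy(hSeen.data(), dSeen,
                                 numThreads * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dSeen);
    if (err != cudaSuccess)
        return -1;

    int mismatches = 0;
    for (int i = 1; i < numThreads; ++i)
        if (hSeen[i] != hSeen[0])
            ++mismatches;
    return mismatches;
}

int main()
{
    printf("threads disagreeing with thread 0: %d\n",
           countDisagreeingThreads(32));
    return 0;
}
```

In this reduced form the flag is written once, before the barrier, and only read after it, which is the situation the quoted guarantee covers.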
P.S. I can’t paste the whole code since this is work for a company, top secret hush hush stuff ;)