I was thinking of doing some language work and was wondering whether there’s a way to make a divergent __syncthreads() crash. At least for me, a simple if (threadIdx.x < half of threads) { syncthreads } didn’t cause a crash (I’m not sure what it actually does); making it more complicated by branching on an input variable didn’t change the outcome.
I experimented by changing one of my longer kernels, one that does a 1-D wavelet transform along the x or y dimension of an image. Inserting a __syncthreads() (none were removed) for the threads that load every other element caused weird visual errors but no reported kernel crashes. (I’m using PyCUDA, which calls the error-checking functions automatically, and the several kernels that run after this one didn’t crash either.)
Thanks in advance,
Nicholas
I don’t think
if (threadIdx.x < half of threads) { syncthreads }
will always result in a crash (but I’m a newbie).
My guess is that if no instructions follow the “if (threadIdx.x < half of threads) { syncthreads }”, the threads that skip it sync automatically: once a thread has returned, it always counts as synced.
So as far as I can tell, this would be just fine:
__global__ void dontcrash() {
    if (threadIdx.x < blockDim.x / 2) {
        // …
        __syncthreads();
        // …
        __syncthreads();
        // …
    }
}
since the second half of the threads would just return and so always count as synced.
But this is a question I’ve been meaning to ask for a while, so maybe someone with more experience can answer it (I just started with CUDA a few days ago).
The behavior isn’t necessarily “oh, it will set your computer on fire” or anything like that; it’s completely undefined. Maybe it will crash, maybe it won’t. Don’t ever do it, because it will crash at some point.
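For what it’s worth, the CUDA programming guide only allows __syncthreads() in conditional code when the condition evaluates identically across the whole thread block. A common way to stay inside that rule is to hoist the barrier out of the branch so that every thread reaches it. Here is a minimal sketch of that pattern, not anything taken from the kernels discussed in this thread; the kernel name hoisted_barrier, the THREADS constant, and the buffer layout are all made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS 64  // hypothetical block size for this sketch

// Only the lower half of the block does real work, but the barrier sits
// outside the branch, so every thread in the block reaches it.
__global__ void hoisted_barrier(int *out)
{
    __shared__ int buf[THREADS];
    const bool active = threadIdx.x < blockDim.x / 2;

    buf[threadIdx.x] = 0;
    if (active) {
        buf[threadIdx.x] = threadIdx.x + 1;   // phase 1: write own slot
    }

    __syncthreads();                          // reached by all threads

    if (active) {
        // phase 2: read a slot written by another active thread in phase 1
        int neighbor = (threadIdx.x + 1) % (blockDim.x / 2);
        out[threadIdx.x] = buf[neighbor];
    }
}

int main()
{
    int *d_out;
    int h_out[THREADS / 2];

    cudaMalloc(&d_out, sizeof(h_out));
    hoisted_barrier<<<1, THREADS>>>(d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    printf("out[0] = %d, out[%d] = %d\n",
           h_out[0], THREADS / 2 - 1, h_out[THREADS / 2 - 1]);
    return 0;
}
```

The inactive threads still pay for the barrier, but the kernel no longer depends on what the hardware happens to do when only part of a block arrives.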
noel: I did create a more complicated example with a data dependency, where there is no way to predict whether a thread will execute __syncthreads() or not. This still did not cause a crash. Additionally, I think I remember reading somewhere that each syncthreads has some identifying parameter, so one can’t do “if (a) { code; syncthreads; } else { code; syncthreads; }”.
tmurray: [badly] out-of-bounds accesses and infinite loops definitely cause machine hangs for me. I want to know which things are good targets for language work (e.g. making a runtime library that logs thread presence at every syncthreads). Undefined behavior should at least result in an error message, though.
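To make the “log thread presence at every syncthreads” idea a bit more concrete, here is a rough sketch of one way it might look. It is an assumption-laden dry run rather than a real library: the helper log_arrival, the probe kernel, and the NUM_SITES constant are all hypothetical names. The probe copies the control flow of the kernel under test but replaces each __syncthreads() with an atomic counter increment, so the probe itself stays well defined even when the branch is divergent. The host then flags any barrier site that was reached by some, but not all, threads of a block.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NUM_SITES 2  // hypothetical: number of barrier sites in the kernel

// Hypothetical helper: record that the calling thread reached barrier `site`.
// It deliberately does NOT call __syncthreads().
__device__ void log_arrival(unsigned int *arrivals, int site)
{
    atomicAdd(&arrivals[blockIdx.x * NUM_SITES + site], 1u);
}

// Probe with the same control flow as the kernel under test; each
// __syncthreads() has been replaced by log_arrival().
__global__ void probe(unsigned int *arrivals)
{
    log_arrival(arrivals, 0);              // site 0: reached by every thread
    if (threadIdx.x < blockDim.x / 2) {
        log_arrival(arrivals, 1);          // site 1: reached by half the block
    }
}

int main()
{
    const int blocks = 2;
    const unsigned int threads = 64;
    unsigned int *d_arrivals;
    unsigned int h_arrivals[blocks * NUM_SITES];

    cudaMalloc(&d_arrivals, sizeof(h_arrivals));
    cudaMemset(d_arrivals, 0, sizeof(h_arrivals));
    probe<<<blocks, threads>>>(d_arrivals);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_arrivals, d_arrivals, sizeof(h_arrivals), cudaMemcpyDeviceToHost);
    cudaFree(d_arrivals);

    for (int b = 0; b < blocks; ++b) {
        for (int s = 0; s < NUM_SITES; ++s) {
            unsigned int n = h_arrivals[b * NUM_SITES + s];
            // A site is consistent if no thread or every thread reached it.
            if (n != 0 && n != threads) {
                printf("block %d, site %d: %u of %u threads arrived\n",
                       b, s, n, threads);
            }
        }
    }
    return 0;
}
```

With the branch above, site 1 should be reported for each block, since only half of the threads reach it. A real tool would have to generate the probe from the original kernel automatically and handle barriers inside loops, which this sketch ignores.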