What I’ve got right now is a multithreaded kernel and another one that executes only one instruction X in a single thread. I’d like to merge them; however, something like
if ( condition that is only true in one of the threads )
is leading to strange results. The performance guidelines discourage control flow instructions, but in this case I would prefer them over using a second kernel. I wonder why it doesn’t work.
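To make the question concrete, here is a minimal sketch of the kind of merged kernel I mean. All names (`mergedKernel`, `scale`, the placeholder value standing in for instruction X) are hypothetical and not my real code. The `__syncthreads()` after the branch is only my assumption about what might be missing: as far as I understand, without a barrier the other threads could read the value before the single thread has written it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical merged kernel: one thread per block runs the formerly
// separate single-thread step ("instruction X"), then all threads continue.
__global__ void mergedKernel(const float *in, float *out, int n)
{
    __shared__ float scale;            // result of "instruction X" (assumed name)

    if (threadIdx.x == 0) {            // true in exactly one thread per block
        scale = 2.0f;                  // placeholder for instruction X
    }
    // The barrier sits outside the branch so that every thread reaches it.
    // Without it, other threads may read `scale` before it is written.
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i] * scale;
    }
}

int main()
{
    const int n = 256;
    float *in = nullptr, *out = nullptr;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    mergedKernel<<<(n + 127) / 128, 128>>>(in, out, n);
    cudaDeviceSynchronize();           // wait for the kernel before reading results

    printf("out[10] = %f\n", out[10]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Note that the shared variable only covers one block; if the single-thread result has to be visible to threads in other blocks, this sketch would not be enough.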
Another (related) question: I’m not sure what exactly happens if I declare
__shared__ int c;
in a kernel with n threads and every thread increments c. How much will c be increased?
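Again as a sketch rather than my actual code, this is the pattern I am asking about, with hypothetical names (`countKernel`, `d`, `racy`, `atomic`). It assumes a single block of n threads and a device that supports `atomicAdd` on shared memory. The plain `c++` is an unsynchronized read-modify-write, so I would not expect it to be guaranteed to add n; the `atomicAdd` variant is included only for comparison.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void countKernel(int *racy, int *atomic)
{
    __shared__ int c;   // one copy per block, shared by all threads of that block
    __shared__ int d;   // second counter for the atomic variant (assumed name)

    // Shared memory is not zero-initialized, so one thread sets it up.
    if (threadIdx.x == 0) { c = 0; d = 0; }
    __syncthreads();

    c++;                // every thread does load, add, store: updates can be lost
    atomicAdd(&d, 1);   // atomic increment: every thread's update is counted

    __syncthreads();    // make sure all increments are done before reading
    if (threadIdx.x == 0) { *racy = c; *atomic = d; }
}

int main()
{
    const int n = 128;  // number of threads in the single block
    int *racy = nullptr, *atomic = nullptr;
    cudaMallocManaged(&racy, sizeof(int));
    cudaMallocManaged(&atomic, sizeof(int));

    countKernel<<<1, n>>>(racy, atomic);
    cudaDeviceSynchronize();

    // The atomic counter should equal n; the plain counter may be smaller.
    printf("plain c++: %d, atomicAdd: %d (n = %d)\n", *racy, *atomic, n);
    cudaFree(racy);
    cudaFree(atomic);
    return 0;
}
```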
I would be grateful for any explanations or reading suggestions; I didn’t find an exact specification of this in the programming guide.