Interruptibility of threads

I was using this bitonic sort: https://gist.github.com/mre/1392067

But I was getting an illegal memory address error on the swap.

    /* exchange(i,ixj); */
    float temp = dev_values[i];
    dev_values[i] = dev_values[ixj]; //Illegal memory address was happening here
    dev_values[ixj] = temp;

So instead I moved the fetches of both elements to the top (line 52) and then used the fetched values for the exchange. Now cuda-memcheck reports no errors.
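For reference, here is a minimal sketch of what that change might look like. It is a reconstruction under assumptions, not the exact modified code: the kernel name and signature follow the gist, while the locals `a`, `b`, and `ascending`, the placement of the fetches inside the `ixj > i` guard, and the small host driver are illustrative. It also assumes the array length is a power of two and a multiple of the block size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch of one bitonic sort step with both operands fetched up front.
// The signature follows the gist; `a`, `b`, and `ascending` are
// illustrative names, not the original code.
__global__ void bitonic_sort_step(float *dev_values, int j, int k)
{
    unsigned int i   = threadIdx.x + blockDim.x * blockIdx.x;
    unsigned int ixj = i ^ j;  // sorting partner of i

    // Only the thread with the lower index handles each pair.
    if (ixj > i) {
        // Fetch both values once, then compare and write from registers.
        float a = dev_values[i];
        float b = dev_values[ixj];

        bool ascending = ((i & k) == 0);
        if (ascending ? (a > b) : (a < b)) {
            dev_values[i]   = b;
            dev_values[ixj] = a;
        }
    }
}

int main()
{
    // Assumption: n is a power of two and a multiple of the block size,
    // so every index i and i ^ j stays inside the array.
    const int n = 1024;
    const int threads = 256;
    static float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = (float)((i * 7919) % n);  // scrambled test input

    float *dev = NULL;
    cudaMalloc((void **)&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // Standard bitonic sort schedule: one kernel launch per (k, j) step.
    for (int k = 2; k <= n; k <<= 1)
        for (int j = k >> 1; j > 0; j >>= 1)
            bitonic_sort_step<<<n / threads, threads>>>(dev, j, k);

    // The blocking copy also surfaces any error raised by the kernels.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    cudaFree(dev);

    bool sorted = true;
    for (int i = 1; i < n; ++i)
        if (host[i - 1] > host[i]) sorted = false;
    printf("sorted: %s\n", sorted ? "yes" : "no");
    return 0;
}
```

In this sketch, within a single launch each pair `(i, ixj)` is read and written only by the thread with the lower index.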

My question: Can a single thread be interrupted partway through? If so, that could explain why the swap might fail (i.e. the swap is not atomic).

Is there a better way to perform an atomic global memory swap without killing performance? (I have read about compare-and-swap, but in this case I want the entire exchange of two elements to be atomic.)

Where can I read more about how the GPU schedules and interrupts threads?

Thank you